Matt Briggs | JSON-LD Crushes HTML in AI Retrieval Accuracy: A 360% Performance Boost

July 22, 2025

Most people optimizing AI chatbots and RAG systems focus on the model. But what if the real problem is your content format?

In a head-to-head test, structured content using JSON-LD outperformed standard HTML by more than 3.6x in information retrieval accuracy. The difference is stark: it’s like searching for a book in a heap of books versus using a card catalog that lists the location of every book.

This study reveals something simple but powerful: The way you store your content can make or break AI performance.

Why This Matters: Retrieval is the Backbone of AI Accuracy

AI systems like ChatGPT, Copilot, and Claude don’t magically “know” things. When you ask a question about your product documentation, your legal policies, or your internal processes, these models retrieve relevant content first, then generate a response based on what they pulled.

This is called Retrieval-Augmented Generation (RAG), and it works only as well as the content you feed it.

There’s just one problem: most enterprise content is published in HTML, which is made for browsers, not machines. What if we gave these systems content that was structured for them?

HTML vs JSON-LD: The Showdown

We compared 582 help articles published on Microsoft Learn in both HTML and JSON-LD formats. These were FAQs generated from the same source YAML files.

We ran the exact same retrieval pipeline using a modern vector database (Weaviate) and OpenAI’s embeddings.
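As a rough illustration of what a vector retrieval step does, the sketch below ranks a few documents against a query by cosine similarity. It is not the study's pipeline: the word-count vectors stand in for OpenAI embeddings, the in-memory list stands in for Weaviate, and the sample documents and function names are made up for illustration.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy 'embedding': lowercase word counts (a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]


# Hypothetical sample documents, not taken from the study's data set.
docs = [
    "How do I reset my password? Open Settings and choose Reset password.",
    "What regions are supported? The service runs in Europe and Asia.",
    "How is billing calculated? Billing is based on monthly usage.",
]

top = retrieve("reset password", docs, k=1)
print(top[0])
```

Whatever the embedding model, the retriever can only rank the chunks it is given, which is why the format of those chunks matters so much.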

Here’s what changed:

Only the format. Same source, same pipeline, same queries.
Two pipelines. One ingested HTML; the other ingested JSON-LD using the Schema.org FAQPage format.

Then we scored each pipeline on F1 score, a balanced metric that combines precision and recall.
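For readers unfamiliar with the metric, here is a minimal sketch of F1 for a single query. It assumes set-based scoring over retrieved document IDs; the study's exact scoring method is not described here, and the IDs below are hypothetical.

```python
def f1_score(retrieved, relevant):
    """F1 = harmonic mean of precision and recall over two sets of IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)  # share of retrieved items that are relevant
    recall = hits / len(relevant)      # share of relevant items that were retrieved
    return 2 * precision * recall / (precision + recall)


# Hypothetical query: 3 items retrieved, 2 items actually relevant, 1 overlap.
score = f1_score(["faq-1", "faq-7", "faq-9"], ["faq-1", "faq-2"])
print(round(score, 2))
```

Because F1 is a harmonic mean, a pipeline that returns a lot of noise (low precision) or misses the right answer (low recall) scores poorly either way.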

The Results: JSON-LD Delivers SQL-Like Accuracy

Format	Mean F1 Score
HTML	0.28
JSON-LD	0.99

Figure: JSON-LD Delivers SQL-Like Accuracy. Histogram of the distribution of F1 scores for the two data formats, HTML and JSON-LD. The HTML F1 scores are widely dispersed and generally low, with most values clustering below 0.5 and a mean of only 0.28. The JSON-LD F1 scores are tightly concentrated near 1.0, indicating almost perfect performance across most samples, with a mean of 0.99.

That puts JSON-LD at roughly 3.6 times the retrieval accuracy of HTML, which is where the 360% figure comes from. JSON-LD scored nearly 1.0, the theoretical maximum, comparable to querying a structured database.
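The arithmetic can be checked from the rounded means in the table. "Times as high" and "percent increase" are different quantities, so the sketch computes both. The rounded means give a ratio slightly below 3.6; presumably the headline figure was computed from unrounded scores, which the table does not show.

```python
html_f1 = 0.28    # mean F1 for HTML, as reported in the table
jsonld_f1 = 0.99  # mean F1 for JSON-LD, as reported in the table

# How many times higher the JSON-LD score is than the HTML score.
ratio = jsonld_f1 / html_f1

# Relative increase over the HTML baseline, in percent.
relative_increase_pct = (jsonld_f1 - html_f1) / html_f1 * 100

print(f"JSON-LD is {ratio:.2f}x the HTML score")
print(f"Relative increase: {relative_increase_pct:.0f}%")
```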

HTML, on the other hand, produced noisy, fragmented, or irrelevant results. Why?

Why HTML Falls Apart in Retrieval

HTML is for humans. It tells browsers how to display content, not what it means.
Semantics get lost. Headers, paragraphs, and styling tags don’t encode relationships.
Chunking breaks context. Parsing HTML requires heuristics to break text into retrievable sections, which often introduces noise.
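The chunking problem is easy to reproduce. The sketch below strips tags from a small, made-up FAQ page and splits the text into fixed-size word windows. This is one common heuristic, not necessarily the one used in the study, and the markup, function names, and chunk size are all assumptions chosen for illustration.

```python
import re

# Hypothetical FAQ page with navigation chrome and two question/answer pairs.
html = (
    "<nav>Home / Docs / FAQ</nav>"
    "<h2>How do I reset my password?</h2>"
    "<p>Open Settings, choose Security, then select Reset password.</p>"
    "<h2>Which regions are supported?</h2>"
    "<p>The service is available in Europe and Asia.</p>"
)


def strip_tags(markup):
    """Naive tag removal: replace tags with spaces, then collapse whitespace."""
    text = re.sub(r"<[^>]+>", " ", markup)
    return re.sub(r"\s+", " ", text).strip()


def chunk_words(text, size):
    """Split text into consecutive windows of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


chunks = chunk_words(strip_tags(html), size=12)
for chunk in chunks:
    print(repr(chunk))
```

With this input, the first chunk mixes navigation text with the first question but cuts off most of its answer, and the second chunk glues the rest of that answer to the next question. Each chunk is then embedded as if it were one coherent unit.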

In short: HTML is a poor substitute for structured knowledge. It’s like scanning a PDF with OCR and hoping the AI figures it out.

Why JSON-LD Wins

JSON-LD is machine-readable by design. It uses:

Triples (subject-predicate-object) that make meaning explicit.
Schema.org vocabularies that AI can interpret consistently.
Built-in question-answer mappings that mirror what users ask.

When AI pulls from JSON-LD, it doesn’t guess; it knows.

It’s not just clean content. It’s optimized for vector search, knowledge graphs, and AI-native experiences.
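To make the question-answer mapping concrete, here is a small Schema.org FAQPage document and a sketch of how it could be turned into retrieval chunks. The property names (mainEntity, Question, name, acceptedAnswer, Answer, text) come from Schema.org; the sample content and the faq_to_chunks helper are hypothetical and are not taken from the study's code.

```python
import json

# Hypothetical FAQPage document in JSON-LD.
faq_jsonld = """
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "How do I reset my password?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Open Settings, choose Security, then select Reset password."
      }
    },
    {
      "@type": "Question",
      "name": "Which regions are supported?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "The service is available in Europe and Asia."
      }
    }
  ]
}
"""


def faq_to_chunks(doc):
    """Return one chunk per Question, keeping each answer attached to its question."""
    chunks = []
    for item in doc.get("mainEntity", []):
        if item.get("@type") != "Question":
            continue
        chunks.append({
            "question": item["name"],
            "answer": item["acceptedAnswer"]["text"],
        })
    return chunks


chunks = faq_to_chunks(json.loads(faq_jsonld))
for chunk in chunks:
    print(chunk["question"], "->", chunk["answer"])
```

No chunking heuristic is needed: the chunk boundaries are already declared in the data, so each embedded unit is exactly one question with its answer.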

Real-World Takeaways: How to Improve AI Retrieval Today

If you’re building chatbots, assistants, or any AI system using content as a backend, this study has direct implications:

Use structured formats like JSON-LD with Schema.org vocabularies when possible.
Stop relying solely on post-processing AI tricks; clean your data at the source.
Rethink your content pipeline: Markdown → HTML is not AI-friendly.
If you manage a documentation site or FAQ, add JSON-LD today.
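For the last point, a minimal sketch of generating FAQPage JSON-LD from existing question-answer pairs and wrapping it in the script tag commonly used to embed JSON-LD in a page. The function names and sample pairs are hypothetical; a real pipeline would read the pairs from its own source files.

```python
import json


def build_faq_jsonld(pairs):
    """Build a Schema.org FAQPage document from (question, answer) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


def to_script_tag(doc):
    """Serialize the document for embedding in an HTML page."""
    # Escape "</" so answer text cannot close the script element early;
    # "\/" is a valid JSON escape for "/".
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Hypothetical question-answer pairs.
pairs = [
    ("How do I reset my password?", "Open Settings and select Reset password."),
    ("Which regions are supported?", "Europe and Asia."),
]

faq_doc = build_faq_jsonld(pairs)
tag = to_script_tag(faq_doc)
print(tag)
```

The same dictionary can be written to a standalone .json file for ingestion into a retrieval pipeline, so one source can serve both the web page and the AI system.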

Try It Yourself

This study is open-source and reproducible.

GitHub Repository
Includes HTML and JSON-LD data, golden questions, and evaluation scripts.
Full write-up and analysis of the study.

All you need is a laptop, VS Code, and an OpenAI key. No more guessing; see for yourself how structured data transforms AI.

Final Word

AI accuracy isn’t just about the model. It’s about the content you feed it.

If your AI keeps hallucinating or missing the mark, the problem might not be your fine-tuning; it might be your HTML.

Structure your content. The machines will thank you.