Schema markup for GEO: which structured data types actually get you cited by AI
A practical guide to the schema.org markup that drives AI citation — including which types matter, which don't, and how to implement them correctly for GEO.
Schema markup is where most GEO programmes either build their foundation — or quietly undermine it. Get it right and your content becomes structurally legible to every major AI retrieval pipeline. Get it wrong and you create entity confusion that can suppress citation even when your content quality is high.
This is the practical guide: which types matter, how to implement them correctly, and what to avoid.
Why schema is different for GEO than for SEO
In SEO, schema drives rich results: star ratings, FAQ accordions, sitelinks, breadcrumbs. The payoff is visual — you get a bigger, more attractive result in the SERP.
In GEO, schema drives comprehension. AI retrieval systems use schema to:
- Understand what type of content they're reading (Article, FAQPage, HowTo)
- Identify who produced it (author entity, organisation entity)
- Understand the relationships between entities (sameAs, memberOf, worksFor)
- Extract structured information for synthesis (FAQ pairs, steps, data points)
The visual output doesn't matter. The machine legibility does. An AI system that can parse your schema has a much higher probability of correctly attributing and citing your content.
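To make "machine legibility" concrete, here is a minimal Python sketch of how a crawler could pull JSON-LD out of a page. It is an illustration only, not a description of how any particular AI engine works; the class name JsonLdExtractor and the sample page are invented for this example.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every JSON-LD <script> block in a page."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content can arrive in several chunks, so buffer it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.blocks.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                # An invalid block contributes nothing: the markup is simply lost.
                pass


def extract_jsonld(html):
    """Return a list of parsed JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks


page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Article", "headline": "Schema markup for GEO"}
</script>
</head><body><p>Body text.</p></body></html>
"""

blocks = extract_jsonld(page)
print([block["@type"] for block in blocks])
```

In this sketch a block that fails to parse is silently dropped, which is the failure mode the validation steps later in this guide are meant to catch.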
The four schema types that drive the most GEO lift
1. FAQPage — the highest direct citation impact
FAQPage is the most powerful GEO schema type because it gives AI systems pre-formed question-answer pairs they can cite with precision.
Full implementation:
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is Generative Engine Optimization?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Generative Engine Optimization (GEO) is the practice of optimising a brand's content, structure and entity presence so that AI answer engines like ChatGPT, Perplexity, Gemini and Claude cite it as a trusted source when answering user questions."
      }
    },
    {
      "@type": "Question",
      "name": "How long does GEO take to show results?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Initial citation lift in Perplexity typically appears within 1–2 weeks. ChatGPT and Gemini citation builds over 4–12 weeks after structured data and content updates are deployed. Compounding visibility typically takes 3–6 months."
      }
    }
  ]
}
GEO best practices for FAQPage:
- Each answer should be a complete, self-contained paragraph — not a fragment
- Aim for 4–8 questions per page (more dilutes signal; fewer underuses the format)
- Use the exact questions your customers type into AI engines (not keyword-optimised variants)
- Answers should be 40–120 words: long enough to be citable, short enough to be lifted whole
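The thresholds above are easy to check automatically. What follows is a rough sketch, assuming the FAQPage block has already been loaded into a Python dict. The helper check_faq_page is hypothetical, and counting whitespace-separated words is only an approximation of answer length.

```python
def check_faq_page(schema, min_q=4, max_q=8, min_words=40, max_words=120):
    """Return a list of warnings for an FAQPage dict, using this guide's thresholds."""
    warnings = []
    questions = schema.get("mainEntity", [])

    # Guideline: 4-8 questions per page.
    if not min_q <= len(questions) <= max_q:
        warnings.append(
            f"{len(questions)} questions (guide suggests {min_q}-{max_q})"
        )

    # Guideline: each answer is 40-120 words.
    for question in questions:
        name = question.get("name", "<unnamed>")
        text = question.get("acceptedAnswer", {}).get("text", "")
        words = len(text.split())
        if not min_words <= words <= max_words:
            warnings.append(
                f"'{name}': answer is {words} words "
                f"(guide suggests {min_words}-{max_words})"
            )
    return warnings


faq = {
    "@type": "FAQPage",
    "mainEntity": [
        {
            "@type": "Question",
            "name": "What is GEO?",
            "acceptedAnswer": {"@type": "Answer", "text": "Too short."},
        }
    ],
}

for warning in check_faq_page(faq):
    print(warning)
```

The defaults mirror the numbers in the list above; pass different values if your own testing suggests other limits.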
2. Article — the authorship and freshness signal
Article schema establishes the content type, authorship and freshness — three signals AI systems use to weight citation probability.
Full implementation:
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Schema markup for GEO",
  "description": "A practical guide to structured data for AI citation.",
  "datePublished": "2026-04-25",
  "dateModified": "2026-04-25",
  "author": {
    "@type": "Person",
    "name": "Marco Silva",
    "url": "https://reach-geo.com/team/marco",
    "sameAs": [
      "https://www.linkedin.com/in/XXXXXXX"
    ]
  },
  "publisher": {
    "@type": "Organization",
    "name": "Reach GEO",
    "url": "https://reach-geo.com",
    "logo": {
      "@type": "ImageObject",
      "url": "https://reach-geo.com/logo.png"
    }
  },
  "mainEntityOfPage": {
    "@type": "WebPage",
    "@id": "https://reach-geo.com/en/blog/schema-markup-for-geo"
  }
}
Critical fields:
- author.sameAs: links the author entity to their own external presence (LinkedIn, Twitter, Wikidata). This is what makes the author a recognised entity to AI systems, not just a name string.
- datePublished + dateModified: AI systems, especially Perplexity, weight freshness. Always include both.
- publisher: must match your Organization schema exactly (same name, same URL).
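A small pre-publish check can catch problems with these fields before they ship. The sketch below assumes both schemas are available as Python dicts and that the article has a single author object. The helper check_article is hypothetical, and the comparison is deliberately strict (case-sensitive string equality).

```python
def check_article(article, organization):
    """Return a list of problems with an Article dict, checked against the Organization dict."""
    problems = []

    # Freshness: both dates should be present.
    for field in ("datePublished", "dateModified"):
        if not article.get(field):
            problems.append(f"missing {field}")

    # Author entity: needs at least one sameAs link.
    if not article.get("author", {}).get("sameAs"):
        problems.append("author has no sameAs links")

    # Publisher must match the Organization schema exactly.
    publisher = article.get("publisher", {})
    for field in ("name", "url"):
        if publisher.get(field) != organization.get(field):
            problems.append(
                f"publisher.{field} is {publisher.get(field)!r}, "
                f"Organization has {organization.get(field)!r}"
            )
    return problems


organization = {
    "@type": "Organization",
    "name": "Reach GEO",
    "url": "https://reach-geo.com",
}
article = {
    "@type": "Article",
    "datePublished": "2026-04-25",
    "author": {"@type": "Person", "name": "Marco Silva"},
    "publisher": {
        "@type": "Organization",
        "name": "Reach Geo",
        "url": "https://reach-geo.com",
    },
}

for problem in check_article(article, organization):
    print(problem)
```

The sample article is intentionally flawed: it lacks dateModified, its author has no sameAs links, and the publisher name differs from the Organization name by capitalisation alone.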
3. Organization — the entity anchor
Organization is the most important schema for long-term GEO. Without it, AI systems struggle to build a stable entity model of your brand, which reduces citation attribution accuracy.
This schema goes on your homepage and, ideally, every page of your site.
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Reach GEO",
  "alternateName": "Reach GEO",
  "url": "https://reach-geo.com",
  "logo": "https://reach-geo.com/logo.png",
  "description": "Portugal's first Generative Engine Optimization agency, helping European brands become cited by ChatGPT, Perplexity, Gemini and Claude.",
  "foundingDate": "2025",
  "address": {
    "@type": "PostalAddress",
    "addressLocality": "Lisbon",
    "addressCountry": "PT"
  },
  "contactPoint": {
    "@type": "ContactPoint",
    "contactType": "Customer Service",
    "email": "info@reach-geo.com"
  },
  "sameAs": [
    "https://www.linkedin.com/company/reach-geo",
    "https://twitter.com/reachgeo",
    "https://www.wikidata.org/wiki/QXXXXXXX"
  ]
}
The sameAs field is the most underused property in GEO. It tells AI systems that reach-geo.com, linkedin.com/company/reach-geo, and wikidata.org/wiki/QXXXXXXX all refer to the same real-world entity. Without this, each source is treated as a separate, unrelated signal. With it, authority aggregates.
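Putting the Organization block on every page usually means rendering it from a single source of truth in your templates. JSON-LD is embedded in a script element with type application/ld+json. The sketch below shows one way to do that in Python; render_jsonld is a hypothetical helper, not part of any framework.

```python
import json


def render_jsonld(schema):
    """Serialise a schema dict into a JSON-LD <script> tag for an HTML template."""
    payload = json.dumps(schema, ensure_ascii=False, indent=2)
    # Escape "<" so a string value containing "</script>" cannot close the tag early.
    # "\u003c" is a standard JSON escape, so parsers still read it as "<".
    payload = payload.replace("<", "\\u003c")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Reach GEO",
    "url": "https://reach-geo.com",
    "sameAs": [
        "https://www.linkedin.com/company/reach-geo",
        "https://twitter.com/reachgeo",
    ],
}

print(render_jsonld(organization))
```

Generating the tag from one dict keeps the name, URL and sameAs list identical across pages, which is the consistency the entity anchor depends on.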
4. HowTo — for process and guide content
If you publish step-by-step guides — which you should, for GEO — HowTo schema turns them into structured extraction targets.
{
  "@context": "https://schema.org",
  "@type": "HowTo",
  "name": "How to get cited by ChatGPT in 30 days",
  "step": [
    {
      "@type": "HowToStep",
      "name": "Allow GPTBot in robots.txt",
      "text": "Add 'User-agent: GPTBot / Allow: /' to your robots.txt file to ensure OpenAI's crawler can access your content."
    },
    {
      "@type": "HowToStep",
      "name": "Clean your entity data",
      "text": "Create or claim your Wikidata entry and add Organization schema with sameAs links to LinkedIn, Wikidata and other profiles."
    }
  ]
}
HowTo content is disproportionately cited by AI engines because it's structurally easy to extract as a numbered list — which is exactly the format users want when they ask "how do I…" questions.
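If your guides already exist as ordered steps in a CMS, the HowTo block can be generated rather than hand-written. Here is a minimal sketch, assuming steps arrive as (name, text) pairs; build_howto is a hypothetical helper. It also adds the schema.org position property to each step, which the example above does not include.

```python
import json


def build_howto(name, steps):
    """Build a HowTo schema dict from an ordered list of (step_name, step_text) pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "HowTo",
        "name": name,
        "step": [
            {
                "@type": "HowToStep",
                "position": index,  # 1-based order of the step (an addition to the example above)
                "name": step_name,
                "text": step_text,
            }
            for index, (step_name, step_text) in enumerate(steps, start=1)
        ],
    }


howto = build_howto(
    "How to get cited by ChatGPT in 30 days",
    [
        ("Allow GPTBot in robots.txt", "Allow OpenAI's crawler to access your content."),
        ("Clean your entity data", "Add Organization schema with sameAs links."),
    ],
)

print(json.dumps(howto, indent=2))
```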
Secondary schema types worth implementing
| Schema type | When to use | GEO impact |
|---|---|---|
| BreadcrumbList | All pages | Medium (helps topical structure) |
| WebSite with SearchAction | Homepage | Low-medium |
| Person | Author pages | High (builds author entity) |
| Product | E-commerce, SaaS | High for product queries |
| Review / AggregateRating | Product pages | High for comparison queries |
| Event | Events, webinars | Medium (for event-based queries) |
| VideoObject | Video content | Medium (for video queries) |
The five most common schema mistakes that hurt GEO
1. Conflicting entity names — Your Organization schema says "Reach GEO" but your author pages say "Reach Geo" (capitalised differently). AI systems treat these as potentially different entities. Consistency is non-negotiable.
2. Missing sameAs — The most common omission. Without sameAs, your entity is an island. Every platform where you exist (LinkedIn, Wikidata, Crunchbase, etc.) should be linked via sameAs in your Organization and Person schemas.
3. Nested schema errors — JSON-LD must be valid JSON. A single misplaced comma or bracket breaks the entire block silently. Use the Rich Results Test to validate after every change.
4. FAQPage answers that are too short — A one-word or one-sentence answer provides almost no citation value. Each answer should be a complete, citable paragraph.
5. Schema only on the homepage — The homepage is not where your citable content lives. Article and FAQPage schema must be on every blog post, service page and resource that you want cited.
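Mistake 3 is the easiest to catch locally, before the page is ever deployed. The sketch below uses Python's standard json module to report where a block stops being valid JSON. validate_jsonld is a hypothetical helper, and it checks syntax only; it does not replace the Rich Results Test, which also checks the schema.org vocabulary.

```python
import json


def validate_jsonld(raw):
    """Return (ok, message) for a raw JSON-LD string. Checks JSON syntax only."""
    try:
        json.loads(raw)
    except json.JSONDecodeError as err:
        # lineno and colno point at the place where parsing failed.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, "valid JSON"


# A trailing comma after the last property is not valid JSON.
broken = """{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "Reach GEO",
}"""

print(validate_jsonld(broken))
```

Running a check like this in a build step or pre-commit hook turns a silent failure into a visible one.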
Implementation priority order
If you're starting from scratch, implement in this order:
- Organization schema on every page — your entity anchor
- FAQPage schema on your top 5 most-trafficked informational pages
- Article schema on every blog post
- Person schema on every author page
- HowTo schema on any step-by-step guide content
- BreadcrumbList on all pages for topical structure
Budget: 3–5 days for a developer to implement across a standard 50-page site. For larger sites, prioritise by traffic and then by topic cluster.
Validating your implementation
After implementing schema:
- Google Rich Results Test — validates syntax and checks for rich result eligibility
- Schema.org validator — broader validation against the full vocabulary
- Perplexity test — search for your brand + key topics in Perplexity; see if schema is enabling cleaner citations
- Manual AI probe — ask ChatGPT and Claude the questions your schema answers; check if they cite you
Give each engine 2–4 weeks after implementation to re-crawl and re-index before drawing conclusions.
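Re-crawling only happens if the crawlers are allowed in, as in the GPTBot step of the HowTo example earlier. Here is a quick sketch of that check using Python's standard urllib.robotparser; crawler_allowed is a hypothetical helper and the robots.txt content is a made-up sample, not your live file.

```python
from urllib.robotparser import RobotFileParser


def crawler_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Sample robots.txt content, for illustration only.
robots = """\
User-agent: GPTBot
Allow: /

User-agent: *
Disallow: /private/
"""

print(crawler_allowed(robots, "GPTBot", "https://reach-geo.com/en/blog/schema-markup-for-geo"))
```

To test a live site, fetch its robots.txt and pass the text in; other AI crawlers use their own user-agent names, so check each one you care about.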
If you want an expert schema audit of your site — including entity graph mapping and AI crawler access review — book a free GEO audit. We'll show you exactly what AI systems currently see when they read your pages.
People also ask
Does schema markup directly cause AI to cite you?
Schema doesn't directly cause citation; it makes citation more likely by making your content structurally legible to AI retrieval systems. Think of schema as removing friction: without it, AI systems may understand your content; with it, they are far more likely to.
Which schema type is most important for GEO?
FAQPage delivers the most direct citation lift because it gives AI systems pre-formatted question-answer pairs they can lift almost verbatim. Organization schema is the foundation without which entity attribution fails.
Should I use JSON-LD or microdata for schema?
JSON-LD in every case. It's the format Google recommends, it is widely parsed by crawlers, and it is easy to maintain separately from HTML. Microdata is legacy; RDFa is niche. Stick to JSON-LD.
How do I test if my schema is working?
Use Google's Rich Results Test (search.google.com/test/rich-results) and Schema.org's validator (validator.schema.org). Also check the rich result reports in Search Console (under Enhancements) for structured data errors.