A
- AI Answer Engine core
- An AI system that responds to queries with generated prose rather than a list of ranked links. Examples include Perplexity AI, ChatGPT, Claude, and Google AI Overviews. AI answer engines synthesize information from multiple sources and cite them inline. They are the primary target of GEO optimization.
- AI Crawler technical
- A bot that automatically fetches and processes web pages on behalf of an AI system. AI crawlers
read raw HTML without executing JavaScript. They discover pages through sitemaps, robots.txt
directives, and link following. Major AI crawlers include GPTBot (OpenAI), PerplexityBot
(Perplexity), ClaudeBot (Anthropic), Google-Extended (Google), and CCBot (Common Crawl).
See also: GPTBot, PerplexityBot, ClaudeBot
- AI Readability Score metric
- A quantitative score from 0 to 100 measuring how clearly and completely an AI crawler can understand a website. Brandioz scores across five dimensions: structural clarity, content depth, value proposition clarity, semantic quality, and crawlability. Sites scoring below 50 typically have partial render issues or clarity gaps.
- Answer-First Formatting strategy
- A content structure where the direct answer to the implied question appears in the opening paragraph, before any context, background, or caveats. AI extraction systems and retrieval engines are significantly more likely to cite content that leads with the answer. This is contrary to traditional editorial structure but essential for GEO.
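The robots.txt handling described under AI Crawler can be checked with Python's standard library. The following is a minimal sketch rather than a full audit: the sample robots.txt, the example.com URL, and the helper name `crawler_access` are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser

# Crawler tokens named in the AI Crawler entry.
AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot", "Google-Extended", "CCBot"]

# Hypothetical robots.txt: allows GPTBot and ClaudeBot, blocks every other agent.
SAMPLE_ROBOTS = """\
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: *
Disallow: /
"""


def crawler_access(robots_txt: str, url: str) -> dict[str, bool]:
    """Map each AI crawler token to whether robots.txt lets it fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {bot: parser.can_fetch(bot, url) for bot in AI_CRAWLERS}


if __name__ == "__main__":
    report = crawler_access(SAMPLE_ROBOTS, "https://example.com/pricing")
    for bot, allowed in report.items():
        print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

With the sample file, only GPTBot and ClaudeBot are reported as allowed, because the remaining tokens fall through to the catch-all `User-agent: *` group.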
C
- Citation Signal core
- Any attribute of a webpage that increases the probability of an AI system referencing it in a generated answer. Primary citation signals: factual specificity, structural clarity, entity consistency, crawlability, original data, and direct answer formatting. Different AI platforms weight these signals differently.
- ClaudeBot crawler
- Anthropic's web crawler, used to collect training data for Claude models. User agent string: ClaudeBot. Reads raw HTML without JavaScript execution. Respects robots.txt. Allow via: User-agent: ClaudeBot / Allow: /
- Cloaking violation
- The practice of serving different content to search engine or AI crawlers than to human visitors. Cloaking violates Google's spam policies (part of Google Search Essentials, formerly the Webmaster Guidelines) and can result in manual penalties and deindexing. It is distinct from deploying purpose-built static pages (like crawler profiles) that are accessible to both bots and humans but not linked from main navigation.
- Common Crawl infrastructure
- A nonprofit organization that maintains a publicly available archive of billions of web pages, updated monthly. Common Crawl data is one of the primary sources for AI model training datasets, including those used by OpenAI, Anthropic, and others. Being crawled by CCBot (Common Crawl's crawler) can increase training data presence and, over time, parametric citation rates.
- Crawler Profile Page tactic
- A purpose-built static HTML page (e.g. /crawler-profile.html) designed specifically for AI crawlers. It is publicly accessible but carries a noindex meta tag so it does not appear in Google search results. It contains full JSON-LD schema, semantic HTML structure, and complete factual content about a brand — providing AI crawlers with a reliable, JavaScript-free surface to read.
- Crawlability technical
- The degree to which AI crawlers can successfully discover and read a website's content. Crawlability is affected by robots.txt configuration, sitemap presence, server-side vs client-side rendering, and semantic HTML structure. A site can have excellent content and still have near-zero AI visibility if crawlability is poor.
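As an illustration of the Crawler Profile Page entry, the sketch below assembles a static, JavaScript-free page with a noindex meta tag, an Organization JSON-LD block, and semantic HTML. The function name `build_crawler_profile`, its parameters, and the choice of fields are hypothetical; a real profile page would carry whatever facts the brand needs to state.

```python
import json
from html import escape


def build_crawler_profile(name: str, url: str, description: str, facts: list[str]) -> str:
    """Return a static HTML page describing a brand (illustrative layout only)."""
    schema = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        "description": description,
    }
    # Escape "</" so the JSON cannot close the <script> element early.
    json_ld = json.dumps(schema, indent=2).replace("</", "<\\/")
    fact_items = "\n".join(f"      <li>{escape(fact)}</li>" for fact in facts)
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>{escape(name)}</title>
  <meta name="robots" content="noindex">
  <script type="application/ld+json">
{json_ld}
  </script>
</head>
<body>
  <main>
    <h1>{escape(name)}</h1>
    <p>{escape(description)}</p>
    <ul>
{fact_items}
    </ul>
  </main>
</body>
</html>
"""


if __name__ == "__main__":
    print(build_crawler_profile(
        "Example Co",
        "https://example.com",
        "Example Co makes project management software.",
        ["Founded in 2020", "Headquartered in Berlin"],
    ))
```

The output can be saved as a plain file such as /crawler-profile.html. Because all content is in the initial HTML, it reads the same to a bot and to a human visitor, which keeps it on the right side of the cloaking definition above.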
E
- Entity Authority
- The degree to which AI systems recognize a brand as a legitimate, trustworthy source in a given
category. Built through consistent brand mentions on authoritative third-party sites, press
coverage, Wikipedia presence, and social media consistency. High entity authority is the primary
driver of citation rates on parametric-first platforms like ChatGPT.
See also: Parametric-first
- Entity Consistency strategy
- The practice of using identical brand name, description, category, and key claims across all web surfaces — website, social profiles, press mentions, and third-party directories. Inconsistency creates uncertainty in AI systems, which reduces citation confidence. Entity consistency is the foundational GEO requirement before any other optimization matters.
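A rough way to audit entity consistency is to compare each surface's profile against one canonical record. The sketch below assumes the profiles have already been collected by hand into dictionaries; the field names, the sample data, and the helper `find_inconsistencies` are illustrative assumptions, not part of any tool.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation/whitespace so trivial differences are ignored."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def find_inconsistencies(
    canonical: dict[str, str],
    surfaces: dict[str, dict[str, str]],
) -> list[tuple[str, str]]:
    """Return (surface, field) pairs whose value differs from the canonical profile."""
    mismatches = []
    for surface, profile in surfaces.items():
        for field, expected in canonical.items():
            actual = profile.get(field, "")  # a missing field counts as a mismatch
            if normalize(actual) != normalize(expected):
                mismatches.append((surface, field))
    return mismatches


if __name__ == "__main__":
    canonical = {"name": "Example Co", "category": "Project management software"}
    surfaces = {
        "website": {"name": "Example Co", "category": "Project management software"},
        "directory": {"name": "ExampleCo Inc.", "category": "project-management software"},
    }
    print(find_inconsistencies(canonical, surfaces))
```

Here the directory listing is flagged for its name but not for its category, since the hyphen and capitalization differences are normalized away.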
F
- Freshness Window metric
- The recency period within which content receives maximum citation weighting from retrieval-first
AI platforms. Research on Perplexity citation patterns indicates content updated within 30 days
receives substantially more citations than older content. Maintaining a publishing cadence and
updating lastmod timestamps in sitemaps directly impacts freshness scores.
See also: Retrieval-first, lastmod
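To see which pages have fallen outside the 30-day window mentioned above, the lastmod values in a sitemap can be compared with the current date. This is a minimal sketch: the sample sitemap and the function name `stale_urls` are assumptions, and it expects each lastmod value to begin with a full YYYY-MM-DD date.

```python
from datetime import date, timedelta
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}
FRESHNESS_WINDOW = timedelta(days=30)  # window taken from the Freshness Window entry

# Hypothetical sitemap used only for the example.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-05-20</lastmod></url>
  <url><loc>https://example.com/blog/old-post</loc><lastmod>2024-01-15T09:00:00+00:00</lastmod></url>
  <url><loc>https://example.com/about</loc></url>
</urlset>
"""


def stale_urls(sitemap_xml: str, today: date) -> list[str]:
    """Return URLs whose lastmod is missing or older than the freshness window."""
    root = ET.fromstring(sitemap_xml)
    stale = []
    for url in root.findall("sm:url", SITEMAP_NS):
        loc = url.findtext("sm:loc", namespaces=SITEMAP_NS)
        lastmod = url.findtext("sm:lastmod", namespaces=SITEMAP_NS)
        if lastmod is None:
            stale.append(loc)
            continue
        # Keep only the date part of the timestamp (first 10 characters).
        modified = date.fromisoformat(lastmod.strip()[:10])
        if today - modified > FRESHNESS_WINDOW:
            stale.append(loc)
    return stale


if __name__ == "__main__":
    print(stale_urls(SAMPLE_SITEMAP, date(2024, 6, 1)))
```

Passing `today` explicitly keeps the check reproducible. In a scheduled job it would typically be `date.today()`.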
G
- GEO — Generative Engine Optimization core
- The practice of making a website, brand, and content legible and citable to AI answer engines. GEO optimizes for citation signals rather than keyword rankings. Key GEO levers: crawlability, entity clarity, original data, structured content, JSON-LD schema, and entity authority. GEO is distinct from but complementary to traditional SEO.
- Google-Extended crawler
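- Google's robots.txt control token for AI use, separate from Googlebot. It governs whether content fetched by Google's crawlers may be used to train and ground Gemini models. Google-Extended is not a distinct crawler with its own HTTP user agent string; it is matched only in robots.txt. Allow via: User-agent: Google-Extended / Allow: /. Blocking Google-Extended does not affect Google Search inclusion or rankings. Google documents AI Overviews as a Search feature governed by Googlebot rules, so Google-Extended does not control AI Overview eligibility.
- GPTBot crawler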
- OpenAI's web crawler, used to collect training data for GPT models. OpenAI documents separate agents for ChatGPT search and user-initiated browsing (OAI-SearchBot and ChatGPT-User). User agent string: GPTBot. Reads raw HTML without JavaScript execution. Discovers pages via sitemaps, Common Crawl, and link following. Allow via: User-agent: GPTBot / Allow: /
J
- JSON-LD Schema technical
- Structured machine-readable data embedded in HTML using <script type="application/ld+json"> tags. JSON-LD communicates entity information to AI crawlers in a standardized format regardless of how the page renders. Key schema types for GEO: Organization, SoftwareApplication, FAQPage, WebPage (with speakable), and BreadcrumbList. JSON-LD is parsed by all major AI crawlers.
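The point that JSON-LD is readable regardless of rendering can be illustrated by pulling it out of raw HTML without running any JavaScript. The sketch below uses only the standard library; the class name `JsonLdExtractor`, the helper `extract_json_ld`, and the sample page are assumptions for the example, and real crawlers may parse pages differently.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> blocks."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed blocks instead of failing the whole page


def extract_json_ld(html: str) -> list:
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


# Hypothetical client-rendered page: empty app shell, but the schema is in the HTML.
SAMPLE_PAGE = """\
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Organization", "name": "Example Co"}
</script>
</head><body><div id="root"></div><script src="/bundle.js"></script></body></html>
"""

if __name__ == "__main__":
    print(extract_json_ld(SAMPLE_PAGE))
```

Even though the sample page body is an empty shell, the Organization record is still recovered from the initial HTML.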
L
- lastmod technical
- The <lastmod> element in sitemap.xml entries indicating when a page was last modified. Retrieval-first platforms like Perplexity use lastmod as a freshness signal: pages with recent lastmod dates are re-crawled faster and prioritized in retrieval. Keeping lastmod accurate and updating it when content changes is a low-effort, high-impact GEO tactic.
- llms.txt tactic
- A plain text file placed at the root of a website (yourdomain.com/llms.txt) providing a structured overview of a site's most important pages for AI systems. Functions as a navigation guide — telling AI crawlers where to start rather than forcing them to explore blindly. Does not directly affect search rankings or guarantee AI citations. Analogous to a sitemap for AI systems rather than search engines.
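An llms.txt file can be generated from a short list of key pages. The sketch below follows the layout of the public llms.txt proposal as commonly described (a title heading, a one-line summary in a blockquote, then sections of annotated links); the function name `build_llms_txt`, the section names, and the URLs are assumptions for illustration.

```python
def build_llms_txt(
    site_name: str,
    summary: str,
    sections: dict[str, list[tuple[str, str, str]]],
) -> str:
    """Render llms.txt text from {section heading: [(title, url, note), ...]}."""
    lines = [f"# {site_name}", "", f"> {summary}", ""]
    for heading, links in sections.items():
        lines.append(f"## {heading}")
        lines.append("")
        for title, url, note in links:
            lines.append(f"- [{title}]({url}): {note}")
        lines.append("")
    return "\n".join(lines).rstrip() + "\n"


if __name__ == "__main__":
    text = build_llms_txt(
        "Example Co",
        "Example Co makes project management software.",
        {
            "Product": [
                ("Pricing", "https://example.com/pricing", "Plans and prices"),
                ("Features", "https://example.com/features", "What the product does"),
            ],
            "Company": [
                ("About", "https://example.com/about", "Who we are"),
            ],
        },
    )
    print(text)
```

The result would be saved at the site root as llms.txt. Keeping the list short and limited to the pages that matter most matches the entry's description of the file as a starting guide.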
P
- Parametric-First AI Platform core
- An AI platform that answers queries primarily from knowledge encoded in model weights during
training, rather than live web retrieval. ChatGPT without web browsing is the primary example.
Parametric-first platforms cite 7–8 sources per answer on average and favor entity authority
over content freshness. Appearing in parametric-first answers requires long-term brand building
and training data presence.
See also: Retrieval-first, Entity authority
- Partial Render Problem technical
- A condition where AI crawlers receive fewer than 600 words from a page's initial HTML because the majority of content is rendered by client-side JavaScript. AI crawlers do not execute JavaScript, so they only read the raw HTML shell. Common on sites built with React, Vue, or Angular without server-side rendering. Symptoms: AI readability score below 50, low structure and content depth sub-scores. Fixes: SSR, SSG, or a static crawler profile page.
- PerplexityBot crawler
- Perplexity AI's web crawler, used to build and maintain the real-time index that powers Perplexity's live search retrieval. User agent string: PerplexityBot. Unlike GPTBot, which crawls primarily for training data, PerplexityBot crawls to maintain a fresh index queried on every user request. Prioritizes recently updated content. Allow via: User-agent: PerplexityBot / Allow: /
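The Partial Render Problem can be approximated by counting the words present in a page's raw HTML, ignoring script and style contents. This is a rough sketch: the 600-word threshold comes from the entry above, while the class `VisibleTextCounter`, the helper names, and the sample pages are assumptions, and the count only approximates what a given crawler extracts.

```python
from html.parser import HTMLParser

WORD_THRESHOLD = 600  # threshold from the Partial Render Problem entry


class VisibleTextCounter(HTMLParser):
    """Count words in raw HTML, skipping content a text reader would not see."""

    SKIPPED = {"script", "style", "template"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.word_count = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.word_count += len(data.split())


def raw_html_word_count(html: str) -> int:
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.word_count


def has_partial_render(html: str) -> bool:
    """True when the initial HTML carries fewer words than the threshold."""
    return raw_html_word_count(html) < WORD_THRESHOLD


# Hypothetical single-page-app shell: content arrives only after JavaScript runs.
SPA_SHELL = (
    "<html><head><title>App</title><script src='/bundle.js'></script></head>"
    "<body><div id='root'></div><script>window.__DATA__ = {\"a\": 1};</script></body></html>"
)

if __name__ == "__main__":
    print(raw_html_word_count(SPA_SHELL), has_partial_render(SPA_SHELL))
```

Running the same check against the server-rendered or static version of a page gives a quick before-and-after comparison for an SSR, SSG, or crawler profile fix.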
R
- Retrieval-First AI Platform core
- An AI platform that triggers a live web search for every query before generating an answer.
Perplexity AI is the primary example. Retrieval-first platforms cite 21+ sources per answer on
average and heavily reward content freshness, direct answer formatting, and crawlability. A blog
post published today can appear in a Perplexity answer the same day.
See also: Parametric-first, Freshness window
S
- Speakable Schema technical
- A Schema.org property on WebPage entities that identifies which sections of a page are most suitable for AI extraction and text-to-speech synthesis. Uses CSS selectors to point AI systems toward the most citable content. Including speakable schema increases extraction probability from designated sections. Typically applied to H1 headings, introductory summaries, and key fact sections.
- Server-Side Rendering (SSR) technical
- A web rendering approach where the server generates fully populated HTML before sending it to the client. SSR solves the partial render problem because AI crawlers receive complete page content in the initial HTML response. Next.js, Nuxt, and similar frameworks support SSR natively. The GEO alternative to SSR is deploying a static HTML crawler profile page that provides equivalent content without requiring a full architectural change.
- Structured Data technical
- Machine-readable data embedded in web pages to communicate entity information in a standardized format. In GEO, structured data refers primarily to JSON-LD schema and HTML microdata attributes. AI crawlers parse structured data to build confident entity models regardless of page rendering method. The most GEO-relevant schema types are Organization, FAQPage, SoftwareApplication, and WebPage with speakable.
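The Speakable Schema entry can be made concrete with a small generator for a WebPage record that carries a speakable property. The sketch below uses the Schema.org SpeakableSpecification type with CSS selectors; the function name `webpage_with_speakable`, the page details, and the selectors are assumptions for illustration.

```python
import json


def webpage_with_speakable(name: str, url: str, css_selectors: list[str]) -> str:
    """Return JSON-LD for a WebPage whose speakable sections are given as CSS selectors."""
    schema = {
        "@context": "https://schema.org",
        "@type": "WebPage",
        "name": name,
        "url": url,
        "speakable": {
            "@type": "SpeakableSpecification",
            "cssSelector": css_selectors,
        },
    }
    return json.dumps(schema, indent=2)


if __name__ == "__main__":
    print(webpage_with_speakable(
        "GEO Glossary",
        "https://example.com/glossary",
        ["h1", ".summary", ".key-facts"],
    ))
```

The resulting JSON would be embedded in the page inside a script element of type application/ld+json, with selectors pointing at the heading, summary, and key fact sections the entry recommends.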