Downloadable AI brand analysis reports are now available.
Every analysis runs the same deterministic 15-step pipeline — from URL validation to AI narrative. No black boxes.
[input] Parses the URL, strips deep paths and UTM params down to the homepage root, then runs a live DNS lookup to confirm the domain actually exists before any work begins.
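As a rough illustration of this step, the sketch below normalizes a URL to its homepage root and checks that the host resolves. The function names `normalize_to_root` and `domain_resolves` are hypothetical, and defaulting to `https://` when no scheme is given is an assumption, not something the pipeline is confirmed to do.

```python
import socket
from urllib.parse import urlsplit


def normalize_to_root(raw_url: str) -> str:
    """Reduce any URL to its scheme + host homepage root (hypothetical helper)."""
    raw_url = raw_url.strip()
    # Assumption: input typed without a scheme is treated as https.
    if "://" not in raw_url:
        raw_url = "https://" + raw_url
    parts = urlsplit(raw_url)
    if not parts.hostname:
        raise ValueError(f"no hostname in {raw_url!r}")
    # Dropping path, query, and fragment removes deep paths and UTM params.
    # urlsplit already lowercases the hostname; any port is dropped as well.
    return f"{parts.scheme}://{parts.hostname}/"


def domain_resolves(hostname: str) -> bool:
    """Live DNS check: True if the name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except socket.gaierror:
        return False
```

For example, `normalize_to_root("example.com/blog?utm_source=x")` yields `https://example.com/`, and the hostname can then be passed to `domain_resolves` before any fetching starts.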
[input] Fetches the raw HTML and distinguishes JS-rendered shells from pages with real body content, a critical split that prevents SPA homepages from scoring as if they were fully readable.
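One plausible way to make this split is to count the text that remains once scripts and styles are ignored. The sketch below does that with the standard library; the class and function names and the 200-character threshold are assumptions chosen for illustration, not the pipeline's actual rule.

```python
from html.parser import HTMLParser


class VisibleTextCounter(HTMLParser):
    """Counts characters of text outside <script>, <style>, and <noscript>."""

    SKIPPED = {"script", "style", "noscript"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.visible_chars = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only text outside skipped elements counts as readable content.
        if self._skip_depth == 0:
            self.visible_chars += len(data.strip())


def looks_like_js_shell(html: str, min_chars: int = 200) -> bool:
    """True when the raw HTML carries too little text to be read as-is.

    min_chars is a hypothetical threshold, not a value from the source.
    """
    counter = VisibleTextCounter()
    counter.feed(html)
    counter.close()
    return counter.visible_chars < min_chars
```

A typical SPA shell (an empty `<div id="root">` plus a script bundle) falls well under the threshold, while a server-rendered page with a few paragraphs does not.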
[extraction] Extracts exactly what an AI crawler sees: title, meta description, hero section, headings, paragraphs, OG tags, structured data, and internal links. This is the raw material for everything downstream.
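A minimal sketch of this kind of extraction, covering only a subset of the listed signals (title, meta description, OG tags, and h1 to h3 headings). The `SignalExtractor` class, the `extract_signals` helper, and the shape of the returned dictionary are illustrative assumptions.

```python
from html.parser import HTMLParser


class SignalExtractor(HTMLParser):
    """Collects a few crawler-visible signals from raw HTML."""

    HEADINGS = {"h1", "h2", "h3"}

    def __init__(self) -> None:
        super().__init__()
        self.signals = {"title": "", "meta_description": "", "og": {}, "headings": []}
        self._open = None  # tag whose text is currently being captured
        self._buf = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "meta":
            name = (attr.get("name") or "").lower()
            prop = (attr.get("property") or "").lower()
            content = attr.get("content") or ""
            if name == "description":
                self.signals["meta_description"] = content
            elif prop.startswith("og:"):
                self.signals["og"][prop] = content
        elif tag == "title" or tag in self.HEADINGS:
            self._open, self._buf = tag, []

    def handle_data(self, data):
        if self._open:
            self._buf.append(data)

    def handle_endtag(self, tag):
        if tag != self._open:
            return
        # Collapse runs of whitespace so inline markup does not leave gaps.
        text = " ".join("".join(self._buf).split())
        if tag == "title":
            self.signals["title"] = text
        else:
            self.signals["headings"].append((tag, text))
        self._open = None


def extract_signals(html: str) -> dict:
    parser = SignalExtractor()
    parser.feed(html)
    parser.close()
    return parser.signals
```

Hero detection, paragraphs, structured data, and internal links are left out here to keep the sketch short.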
[extraction] Builds a hierarchical map of the page and traverses it breadth-first to find where value proposition signals appear relative to the fold, which is key for the understanding curve.
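The breadth-first traversal can be sketched as follows. The `Node` structure, the keyword matching, and the use of tree depth as a stand-in for position relative to the fold are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Node:
    """One entry in a hypothetical hierarchical page map."""
    label: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def first_signal_depth(root: Node, keywords: list[str]):
    """Breadth-first search for the shallowest node mentioning a keyword.

    Returns (depth, label), or None when no node matches.
    """
    queue = deque([(root, 0)])
    while queue:
        node, depth = queue.popleft()
        lowered = node.text.lower()
        if any(keyword in lowered for keyword in keywords):
            return depth, node.label
        # Children are queued behind every node of the current depth,
        # so shallower matches are always found first.
        queue.extend((child, depth + 1) for child in node.children)
    return None
```

Because the queue is processed level by level, a value proposition in the hero is reported before one buried in a nested feature card, even if the feature card comes earlier in a depth-first walk.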
[analysis] A Groq LLM call classifies the site's primary intent (e-commerce, SaaS, media, blog, etc.) with a confidence score. The intent label drives category-specific scoring weights downstream.
[analysis] Asynchronously builds what AI 'believes' about the site (category, capabilities, target audience, and a confidence score) by cross-referencing title, meta, hero, and intent signals.
[analysis] Measures how fast AI clarity builds as it reads deeper: first impression, after scrolling, full page. Produces a named shape (fast_clear, partial, thin, or flat) that feeds directly into scoring.
[analysis] Computes sentence length, semantic density, high- and low-information ratios, and concept count. Also scores signal confidence for title, meta description, headings, and value proposition.
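The page does not define these metrics precisely, so the sketch below uses simple stand-ins: average words per sentence, the share of non-stopword tokens as a rough density proxy, and the number of distinct content words as a concept count. The stopword list, the proxies, and the `text_metrics` name are assumptions.

```python
import re

# A deliberately tiny, hypothetical stopword list for illustration.
STOPWORDS = frozenset(
    "a an and are as at be by for from in is it of on or that the to we with you your".split()
)


def text_metrics(text: str) -> dict:
    """Compute rough readability and density proxies for a block of text."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    words = re.findall(r"[A-Za-z']+", text.lower())
    content_words = [w for w in words if w not in STOPWORDS]
    return {
        "sentence_count": len(sentences),
        "avg_sentence_length": len(words) / len(sentences) if sentences else 0.0,
        "content_word_ratio": len(content_words) / len(words) if words else 0.0,
        "concept_count": len(set(content_words)),
    }
```

Calling `text_metrics("We build invoicing software. It helps teams get paid faster!")` gives two sentences, an average length of 5.0 words, a content-word ratio of 0.8, and 8 distinct concepts.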
[analysis] Determines presence tier (high / medium / low / unknown) from web signals and training data indicators. Must run before coherence, because it feeds the known-brand floor rule that prevents globally recognised brands from scoring too low.
[scoring] Calculates the raw score across five components: structure (33 pts), content depth (22 pts), semantic quality (22 pts), value proposition (18 pts), and hierarchy (5 pts), weighted by page type and extraction quality.
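The five point caps sum to 100, which the sketch below makes explicit. It assumes each component has already been reduced to a fraction between 0 and 1; how those fractions are derived, and the page-type and extraction-quality weighting, are not specified on the page and are not modeled here. `raw_score` is a hypothetical name.

```python
# Point caps per component, as listed for the raw score.
MAX_POINTS = {
    "structure": 33,
    "content_depth": 22,
    "semantic_quality": 22,
    "value_proposition": 18,
    "hierarchy": 5,
}
assert sum(MAX_POINTS.values()) == 100


def raw_score(fractions: dict[str, float]) -> float:
    """Combine per-component fractions (0.0 to 1.0) into a 0 to 100 score."""
    total = 0.0
    for name, cap in MAX_POINTS.items():
        # Missing components count as zero; out-of-range values are clamped.
        fraction = min(1.0, max(0.0, fractions.get(name, 0.0)))
        total += fraction * cap
    return round(total, 1)
```

For instance, full marks for structure and hierarchy, half marks for content depth and semantic quality, and nothing for value proposition gives 33 + 11 + 11 + 0 + 5 = 60.0.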
[scoring] Projects the full feature vector into four PCA dimensions: understanding depth, signal balance, content density, and value prop speed. Surfaces the dominant weakness: the dimension with the highest improvement leverage.
[scoring] Applies correction rules on top of the raw score: known-brand floors for high-presence sites, extraction quality caps, curve penalties, and cross-signal consistency checks. This produces the final published score.
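A sketch of how floors, caps, and penalties might be layered on a raw score. Every constant below (the floor of 50, the cap of 60, the 10-point penalty, the 0.5 quality cutoff) and the order in which the rules are applied are invented for illustration; the cross-signal consistency checks are omitted.

```python
# Hypothetical values; the real thresholds are not published on the page.
KNOWN_BRAND_FLOOR = 50.0
LOW_EXTRACTION_CAP = 60.0
FLAT_CURVE_PENALTY = 10.0
LOW_EXTRACTION_CUTOFF = 0.5


def apply_corrections(raw: float, presence_tier: str,
                      extraction_quality: float, curve_shape: str) -> float:
    """Turn a raw score into a final score using floor, cap, and penalty rules."""
    score = float(raw)
    # Curve penalty: a flat understanding curve costs points.
    if curve_shape == "flat":
        score -= FLAT_CURVE_PENALTY
    # Extraction cap: a poorly extracted page cannot score above the cap.
    if extraction_quality < LOW_EXTRACTION_CUTOFF:
        score = min(score, LOW_EXTRACTION_CAP)
    # Known-brand floor: applied last so no earlier rule can undercut it.
    if presence_tier == "high":
        score = max(score, KNOWN_BRAND_FLOOR)
    return max(0.0, min(100.0, score))
```

Applying the floor last is a design choice in this sketch: it guarantees that a high-presence site never ends below the floor, whatever penalties fired before it.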
[output] Compares the site against a static corpus of 91 analyzed sites. If the category is too niche for a meaningful static comparison, falls back to a Groq-generated dynamic peer set.
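The static-then-dynamic fallback could look like the sketch below. The `select_peers` name, the record shape, and the minimum of five peers for a "meaningful" comparison are assumptions; `generate_dynamic_peers` is a placeholder callable standing in for the LLM-backed step, not a real Groq API.

```python
from typing import Callable


def select_peers(
    category: str,
    static_corpus: list[dict],
    generate_dynamic_peers: Callable[[str], list[dict]],
    min_peers: int = 5,  # hypothetical cutoff for a meaningful comparison
) -> tuple[list[dict], str]:
    """Pick benchmark peers, preferring the static corpus when it is large enough."""
    peers = [site for site in static_corpus if site["category"] == category]
    if len(peers) >= min_peers:
        return peers, "static"
    # Too niche for the static corpus: delegate to the dynamic generator.
    return generate_dynamic_peers(category), "dynamic"
```

Returning the source label alongside the peers lets downstream output state whether the benchmark came from the static corpus or was generated on demand.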
[output] Generates targeted fixes from heuristic rules and PCA weakness overrides, then filters out irrelevant ones based on page type and belief context, so every recommendation is specific, not generic.
[output] The final step. Groq receives the complete result and writes a plain-English summary (what's working, what isn't, and why) personalised to the site's actual score, category, and signals.
Pipeline totals
Total stages: 15
LLM calls: 3
Async ops: 4