Methodology · BETA

How the analysis works.

Every Brandioz score is the direct sum of measurable signals extracted from your site's HTML. No hand-waving. Every point is traceable.

15 · Score components
94 · Training sites
6.16 · LOO MAE
r = 0.843 · GT correlation
Plain English: We check 15 things about your site — like whether it has a clear title, enough words, and whether an AI reading it progressively understands it better. Those signals feed into a score out of 100. We also run a machine-learning model (trained on 94 real sites) to cross-check the result.
01
Score Architecture

Component weights

Max points per dimension — total 100

Plain English: Structure (how clearly the site identifies itself) carries the most weight at 33 pts. Content Depth and Semantic Quality are next at 22 pts each.
Structure · 33 pts
Content Depth · 22 pts
Semantic Quality · 22 pts
Value Proposition · 18 pts
Hierarchy · 5 pts

Signal sub-weights

Contribution of each signal to total score

Title (Structure) · 10 pts
Description (Structure) · 10 pts
Hero Section (Structure) · 8 pts
Headings (Structure) · 5 pts
Word Count (Content Depth) · 22 pts
Informativeness (Semantic Quality) · 12 pts
Readability (Semantic Quality) · 4 pts
Topic Diversity (Semantic Quality) · 6 pts
Value Prop (Value Proposition) · 18 pts
Hierarchy (Hierarchy) · 5 pts
02
ML Model

A HistGradientBoosting model trained on 94 full-quality renders, with ground-truth labels computed in pure Python — TF-IDF semantic consistency, information density, structural completeness, and schema coverage. Completely independent of the heuristic scorer.

Plain English: Think of it as two people independently grading your site. The heuristic is a rule-checker; the ML model is a trained judge. If they agree, we trust the blend. If they diverge by more than 15 points, the coherence engine steps in to resolve the conflict.

Production blend

score = 0.6 × ml + 0.4 × heuristic
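The blend and the 15-point divergence rule can be sketched in a few lines of Python. Only the 0.6/0.4 weights and the 15-point threshold come from this page; the function name and return shape are illustrative:

```python
def blend_scores(ml: float, heuristic: float,
                 divergence_limit: float = 15.0) -> tuple[float, bool]:
    """Production blend: 60% ML, 40% heuristic.

    Also returns a flag for whether the two graders diverged enough
    that the coherence engine should resolve the conflict.
    """
    blended = 0.6 * ml + 0.4 * heuristic
    needs_review = abs(ml - heuristic) > divergence_limit
    return blended, needs_review

# An Anyscale-sized gap (heuristic 89, ML 64) trips the review flag:
score, flagged = blend_scores(ml=64.0, heuristic=89.0)  # 74.0, True
```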

6.16 pts · LOO MAE (leave-one-out CV)
r = 0.692 · LOO Pearson (on held-out sites)
r = 0.843 · GT correlation (heuristic vs labels)

Feature importance

What the model learned to prioritise

Plain English: The model learned that how well an AI progressively understands the site matters most — more than metadata. deep_score, imm_score, and scroll_score together account for over half of all decisions.
deep_score · 19.1%
imm_score · 16.9%
scroll_score · 16.1%
value_prop_confidence · 14.6%
title_confidence · 8.6%
heading_confidence · 6.6%
intent_confidence · 4.6%
semantic_density_norm · 4.3%
word_count_norm · 3.6%
sentence_length_score · 3.5%
description_confidence · 2.2%
Legend: Understanding curve · Intent & value · Structural signals

Curve scores (deep/imm/scroll) = 52.1% of decisions. AI comprehension requires progressive clarity, not just strong metadata.

Content depth curve

Points earned vs word count — climbs fast at first, then slows down

Plain English: Short pages are penalised heavily. A 300-word page earns 11/22 pts. You need 1,000 words to reach the max 22 pts. After that, adding more words doesn't help — the curve flattens.
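One saturating curve consistent with the two data points quoted above (300 words → 11 pts, 1,000 words → 22 pts) is a clipped logarithm. This is a reconstruction, not the production code — the floor constant 90 below is fitted to those two points:

```python
import math

def word_count_points(words: int, max_pts: float = 22.0) -> float:
    """Clipped log curve: 0 pts at 90 words or fewer, max at 1,000+.

    Fitted so that 300 words earn 11 pts and 1,000 words earn the
    full 22 pts, per the figures on this page.
    """
    if words <= 90:
        return 0.0
    frac = math.log(words / 90) / math.log(1000 / 90)
    return max_pts * min(1.0, frac)
```

Past 1,000 words the `min` cap keeps the score flat, matching the plateau described above.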
03
PCA Latent Dimensions
Plain English: PCA takes the 15 raw signals and finds 4 combined dimensions that best explain why sites differ from each other. Think of it like 4 fundamental questions that separate great sites from weak ones. Your score on each tells you which one to fix first.
PC0 · 46.2%

content_richness

Does the site get clearer the more you read?

How deeply an AI can understand the site across progressive reading depths.

deep_score, scroll_score, imm_score

PC1 · 17.7%

explicitness

Does the site use enough plain written words to explain itself?

Volume and structure of explicit written content.

word_count_norm, heading_confidence

PC2 · 8.3%

structural_clarity

Are sentences clear and is the site's purpose obvious?

Clarity of intent and sentence-level readability.

intent_confidence, description_confidence, value_prop_confidence

PC3 · 7.4%

understanding_speed

How quickly can an AI 'get' what the site is about?

How fast an AI reaches a clear understanding.

sentence_length_score, scroll_score, imm_score

4 components → 79.6% variance explained

remaining 20.4% = site noise
content_richness 46.2%
explicitness 17.7%
structural_clarity 8.3%
understanding_speed 7.4%

Scree plot — how much each component adds

We chose 4 components because the 5th barely adds anything (5.1%). The elbow is at 4.

Plain English: Each bar shows how much more variation the next component explains. The jump from bar 4 to bar 5 is tiny — that's the elbow where we stop.
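The stopping rule can be sketched in plain Python using the variance ratios reported on this page. The 6% cutoff is an illustrative threshold that reproduces the 4-component decision, not a production value:

```python
def choose_components(ratios: list[float], min_gain: float = 0.06):
    """Keep components (sorted by explained variance, descending)
    while each one still adds at least `min_gain` of the variance."""
    kept = []
    for r in ratios:
        if r < min_gain:
            break  # the elbow: everything after this adds too little
        kept.append(r)
    return len(kept), sum(kept)

# PC0..PC3 plus the 5.1% fifth component, as reported on this page:
n, explained = choose_components([0.462, 0.177, 0.083, 0.074, 0.051])
# n == 4, explained ~ 0.796
```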

Feature loadings matrix

How much each signal pushes each dimension up (+) or down (−)

Plain English: A loading close to +1 means the signal strongly drives that dimension up. Close to −1 = drives it down. Near 0 = little effect. Example: deep_score has +0.496 on PC0 — it's the biggest driver of content_richness.
Signal | PC0 content_richness | PC1 explicitness | PC2 structural_clarity | PC3 understanding_speed
deep_score | +0.496 | −0.014 | −0.168 | +0.014
scroll_score | +0.489 | −0.037 | −0.244 | +0.080
imm_score | +0.477 | −0.003 | −0.276 | +0.053
title_confidence | +0.301 | −0.291 | +0.249 | −0.081
value_prop_confidence | +0.262 | +0.145 | +0.334 | −0.262
description_confidence | +0.189 | −0.338 | +0.360 | −0.195
heading_confidence | +0.205 | +0.418 | −0.094 | −0.387
intent_confidence | +0.106 | −0.029 | +0.636 | −0.092
sentence_length_score | +0.102 | +0.313 | +0.286 | +0.659
word_count_norm | −0.084 | +0.529 | +0.021 | −0.444
semantic_density_norm | −0.147 | −0.473 | −0.193 | −0.295

Understanding curve scores (imm/scroll/deep) dominate PC0 with loadings > 0.47 — the strongest PCA dimension is driven entirely by progressive AI comprehension, not metadata.

04
Site Landscape
Plain English: This maps training sites by their two biggest PCA dimensions. Sites to the right score high on content richness (an AI understands them deeply). Sites higher up score high on explicitness (lots of plain written words). Best sites are top-right.
High visibility · GT > 65
Mid visibility · GT 50–65
Low visibility · GT < 50
Site | Heuristic | ML | GT
Linear | 79 | 81 | 81
Trigger | 85 | 76 | 78
Stripe | 87 | 70 | 70
Upstash | 75 | 69 | 70
Together | 84 | 68 | 69
Anthropic | 81 | 65 | 66
Resend | 85 | 65 | 65
Anyscale | 89 | 64 | 64
Mistral | 82 | 56 | 55
Retool | 88 | 56 | 56
Webflow | 75 | 43 | 43
Cal.com | 64 | 43 | 42
OpenAI | 50 | 44 | 44
Canva | 47 | 42 | 42
Spotify | 29 | 33 | 33

PC0 vs PC1 — training site positions

content_richness (x-axis) vs explicitness (y-axis)

Low PC0 = AI struggles even after full read. Low PC1 = site lacks plain explicit language. Both axes required for strong AI visibility.

Heuristic vs ML vs Ground Truth

15 sites across the score range — ML pulls inflated heuristic scores toward ground truth

Plain English: Grey = rule-based score. Blue = ML score. Green = the correct answer. Wherever grey is much taller than green, the rule-checker was too generous.
Legend: Heuristic · ML · Ground Truth

Heuristic over-scores brand-forward sites (Anyscale +25, Stripe +17, Retool +32). ML corrects by weighting the understanding curve more heavily (52% of decisions).

05
Signal Comparison
Plain English: Comparing Stripe (GT 69.8) vs Vercel (GT 63.9), two strong developer sites. The bigger the shape, the better. Stripe's hero and description are near-perfect — that's why it scores higher despite Vercel having strong structural signals.

Stripe vs Vercel — signal radar

How two strong developer sites compare signal by signal

Stripe · GT 69.8
Vercel · GT 63.9

Stripe's hero and description are near-perfect. Vercel has solid structure but weaker progressive understanding — explaining the gap.

Dominant weakness distribution

Most common PCA weakness across 15 representative sites

Plain English: For each site, we find their lowest PCA dimension — the one they need to fix most. explicitness is the most common problem across the training set, with understanding_speed close behind.
explicitness · 6 sites
understanding_speed · 5 sites
structural_clarity · 2 sites
content_richness · 2 sites

explicitness is the most common weakness (6 of 15 sites), with understanding_speed close behind (5) — most sites have decent structure but too little plain explicit language for an AI to reach clarity quickly.

06
Signal Definitions
Plain English: Every signal is extracted directly from your site's HTML — no guessing, no AI interpretation. Here's exactly what each one measures and why it matters.
title_confidence (Structure) · 0–1

Is the page title descriptive enough for an AI to know what you do?

Word count, product category words (platform, tool, ai), and benefit descriptors. H1 compensates if title is brand-only.

description_confidence (Structure) · 0–1

Does the meta description explain your product clearly in 50–160 chars?

Meta description scored for capability language, audience terms, and optimal length (50–160 chars).

heading_confidence (Structure) · 0–1

Do the headings form a logical hierarchy and cover diverse topics?

H1 presence, no level skips, plus semantic diversity via TF-IDF cosine across all headings.
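The semantic-diversity half of this signal can be sketched with plain term-count vectors; real TF-IDF weighting and the exact aggregation used in production are not specified on this page, so treat this as an illustration:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def heading_diversity(headings: list[str]) -> float:
    """Mean pairwise dissimilarity (1 - cosine) across headings.

    Near-duplicate headings score ~0; topically distinct ones score ~1.
    """
    vecs = [Counter(h.lower().split()) for h in headings]
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    if not pairs:
        return 0.0
    return sum(1 - cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
```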

hero_confidence (Structure) · 0–1

Does the hero section explain what you do, or is it just a tagline?

Hero section word count, category and benefit language. Penalises repetitive or nav-polluted text.

value_prop_confidence (Value Proposition) · 0–1

Can an AI identify your product type, who it's for, and why it's useful?

Pattern-matches product type, target audience, and benefits in hero and title (3× weight) and full page text.

high_info_ratio (Semantic Quality) · 0–1

What fraction of sentences actually say something useful vs filler?

Fraction of sentences above informativeness threshold. Scored on action verbs, numbers, AI phrases, unique word ratio.

breadth_score (Content Depth) · 0–1

Does the page cover multiple distinct topics, or just repeat the same idea?

Clustering on paragraph vectors to count distinct topic clusters, normalised to expected count for site type.

avg_sentence_length (Semantic Quality) · words

Are sentences the right length? Too short = choppy. Too long = hard to parse.

Mean words per sentence. Ideal 15–20 words. Gaussian penalty in both directions.
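A Gaussian penalty of this shape can be sketched as follows; the 15–20 word band comes from this page, while the width `sigma` is an illustrative choice:

```python
import math

def sentence_length_score(avg_words: float, low: float = 15.0,
                          high: float = 20.0, sigma: float = 5.0) -> float:
    """1.0 inside the ideal 15-20 word band, Gaussian falloff outside.

    Both very choppy (short) and very dense (long) sentences are
    penalised symmetrically, as described above.
    """
    if low <= avg_words <= high:
        return 1.0
    edge = low if avg_words < low else high
    return math.exp(-((avg_words - edge) ** 2) / (2 * sigma ** 2))
```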

imm_score (Understanding) · 0–100

First impression — how much does an AI understand from just the hero section?

BFS traversal score at hero/top-level depth.

scroll_score (Understanding) · 0–100

Mid-read — does understanding improve as the AI reads further down the page?

BFS traversal score at mid-depth sections. Clarity after reading past the hero.

deep_score (Understanding) · 0–100

Full-read — at peak comprehension, how well does an AI understand the site?

BFS traversal score at full depth. Single most important feature (19.1%).

No black boxes.

Every number in Brandioz is computable from the raw HTML of your site. The weights on this page are the exact weights in production.

If a score is worth trusting, you should be able to understand exactly how it was computed — signal by signal, weight by weight.

Production formula

score = (
  title_conf     × 10   # structure
  + desc_conf    × 10
  + hero_conf    ×  8
  + heading_conf ×  5
  + log(words, 22)      # content depth
  + info_score   × 12   # semantic quality
  + sent_score   ×  4
  + diversity    ×  6
  + value_prop   × 18   # value prop
  + hierarchy    ×  5
  - curve_penalty       # understanding curve
)

final = 0.6 × ml + 0.4 × heuristic
# coherence engine may override
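The same formula, made runnable as a sanity check. Signal names are shortened as in the pseudocode above; `word_pts` stands in for the saturating word-count curve (0–22), and the coherence-engine override is not modelled:

```python
def heuristic_score(sig: dict) -> float:
    """Weighted sum from the production formula. All inputs are 0-1
    confidences except word_pts (0-22) and curve_penalty (raw points)."""
    return (
        sig["title_conf"] * 10          # structure
        + sig["desc_conf"] * 10
        + sig["hero_conf"] * 8
        + sig["heading_conf"] * 5
        + sig["word_pts"]               # content depth, log curve
        + sig["info_score"] * 12        # semantic quality
        + sig["sent_score"] * 4
        + sig["diversity"] * 6
        + sig["value_prop"] * 18        # value proposition
        + sig["hierarchy"] * 5
        - sig["curve_penalty"]          # understanding curve
    )

perfect = dict(title_conf=1, desc_conf=1, hero_conf=1, heading_conf=1,
               word_pts=22, info_score=1, sent_score=1, diversity=1,
               value_prop=1, hierarchy=1, curve_penalty=0)
# A perfect site earns exactly 100 — the weights sum to 100 as claimed.
```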