Methodology · BETA

How the analysis works.

Every Brandioz score is the direct sum of measurable signals extracted from your site's HTML. No hand-waving. Every point is traceable.

15 · Score components
94 · Training sites
6.16 · LOO MAE
r = 0.843 · GT correlation
Plain English: We check 15 things about your site — like whether it has a clear title, enough words, and whether an AI reading it progressively understands it better. Those signals feed into a score out of 100. We also run a machine-learning model (trained on 94 real sites) to cross-check the result.
01
Score Architecture

Component weights

Max points per dimension — total 100

Plain English: Structure (how clearly the site identifies itself) carries the most weight at 33 pts. Content Depth and Semantic Quality are next at 22 pts each.
Structure · 33 pts
Content Depth · 22 pts
Semantic Quality · 22 pts
Value Proposition · 18 pts
Hierarchy · 5 pts

Signal sub-weights

Contribution of each signal to total score

Title (Structure) · 10 pts
Description (Structure) · 10 pts
Hero Section (Structure) · 8 pts
Headings (Structure) · 5 pts
Word Count (Content Depth) · 22 pts
Informativeness (Semantic Quality) · 12 pts
Readability (Semantic Quality) · 4 pts
Topic Diversity (Semantic Quality) · 6 pts
Value Prop (Value Proposition) · 18 pts
Hierarchy (Hierarchy) · 5 pts
02
ML Model

A HistGradientBoosting model trained on 94 full-quality renders, with ground-truth labels computed in pure Python — TF-IDF semantic consistency, information density, structural completeness, and schema coverage. Completely independent of the heuristic scorer.

Plain English: Think of it as two people independently grading your site. The heuristic is a rule-checker; the ML model is a trained judge. If they agree, we trust the blend. If they diverge by more than 15 points, the coherence engine steps in to resolve the conflict.

Production blend

score = 0.6 × ml + 0.4 × heuristic
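The blend and the 15-point divergence rule can be sketched in a few lines of Python. Only the 0.6/0.4 weights and the 15-point threshold come from this page; the function name and return shape are illustrative:

```python
def blend_scores(ml: float, heuristic: float,
                 divergence_limit: float = 15.0) -> tuple[float, bool]:
    """Production blend: 60% ML, 40% heuristic.

    Also returns a flag for whether the two graders diverged enough
    that the coherence engine should resolve the conflict.
    """
    blended = 0.6 * ml + 0.4 * heuristic
    needs_review = abs(ml - heuristic) > divergence_limit
    return blended, needs_review

# An Anyscale-sized gap (heuristic 89, ML 64) trips the review flag:
score, flagged = blend_scores(ml=64.0, heuristic=89.0)  # 74.0, True
```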

6.16 pts · LOO MAE (leave-one-out CV)
r = 0.692 · LOO Pearson (on held-out sites)
r = 0.843 · GT correlation (heuristic vs labels)

Feature importance

What the model learned to prioritise

Plain English: The model learned that how well an AI progressively understands the site matters most — more than metadata. deep_score, imm_score, and scroll_score together account for over half of all decisions.
deep_score · 19.1%
imm_score · 16.9%
scroll_score · 16.1%
value_prop_confidence · 14.6%
title_confidence · 8.6%
heading_confidence · 6.6%
intent_confidence · 4.6%
semantic_density_norm · 4.3%
word_count_norm · 3.6%
sentence_length_score · 3.5%
description_confidence · 2.2%
Legend: Understanding curve · Intent & value · Structural signals

Curve scores (deep/imm/scroll) = 52.1% of decisions. AI comprehension requires progressive clarity, not just strong metadata.

Content depth curve

Points earned vs word count — climbs fast at first, then slows down

Plain English: Short pages are penalised heavily. A 300-word page earns 11/22 pts. You need 1,000 words to reach the max 22 pts. After that, adding more words doesn't help — the curve flattens.
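One saturating curve consistent with the two data points quoted above (300 words → 11 pts, 1,000 words → 22 pts) is a clipped logarithm. This is a reconstruction, not the production code — the floor constant 90 below is fitted to those two points:

```python
import math

def word_count_points(words: int, max_pts: float = 22.0) -> float:
    """Clipped log curve: 0 pts at 90 words or fewer, max at 1,000+.

    Fitted so that 300 words earn 11 pts and 1,000 words earn the
    full 22 pts, per the figures on this page.
    """
    if words <= 90:
        return 0.0
    frac = math.log(words / 90) / math.log(1000 / 90)
    return max_pts * min(1.0, frac)
```

Past 1,000 words the `min` cap keeps the score flat, matching the plateau described above.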
03
PCA Latent Dimensions
Plain English: PCA takes the 15 raw signals and finds 4 combined dimensions that best explain why sites differ from each other. Think of it like 4 fundamental questions that separate great sites from weak ones. Your score on each tells you which one to fix first.
PC0 · 46.2%

content_richness

Does the site get clearer the more you read?

How deeply an AI can understand the site across progressive reading depths.

deep_score, scroll_score, imm_score

PC1 · 17.7%

explicitness

Does the site use enough plain written words to explain itself?

Volume and structure of explicit written content.

word_count_norm, heading_confidence

PC2 · 8.3%

structural_clarity

Are sentences clear and is the site's purpose obvious?

Clarity of intent and sentence-level readability.

intent_confidence, description_confidence, value_prop_confidence

PC3 · 7.4%

understanding_speed

How quickly can an AI 'get' what the site is about?

How fast an AI reaches a clear understanding.

sentence_length_score, scroll_score, imm_score

4 components → 79.6% variance explained

remaining 20.4% = site noise
content_richness 46.2%
explicitness 17.7%
structural_clarity 8.3%
understanding_speed 7.4%

Scree plot — how much each component adds

We chose 4 components because the 5th barely adds anything (5.1%). The elbow is at 4.

Plain English: Each bar shows how much more variation the next component explains. The jump from bar 4 to bar 5 is tiny — that's the elbow where we stop.
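The stopping rule can be sketched in plain Python using the variance ratios reported on this page. The 6% cutoff is an illustrative threshold that reproduces the 4-component decision, not a production value:

```python
def choose_components(ratios: list[float], min_gain: float = 0.06):
    """Keep components (sorted by explained variance, descending)
    while each one still adds at least `min_gain` of the variance."""
    kept = []
    for r in ratios:
        if r < min_gain:
            break  # the elbow: everything after this adds too little
        kept.append(r)
    return len(kept), sum(kept)

# PC0..PC3 plus the 5.1% fifth component, as reported on this page:
n, explained = choose_components([0.462, 0.177, 0.083, 0.074, 0.051])
# n == 4, explained ~ 0.796
```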

Feature loadings matrix

How much each signal pushes each dimension up (+) or down (−)

Plain English: A loading close to +1 means the signal strongly drives that dimension up. Close to −1 = drives it down. Near 0 = little effect. Example: deep_score has +0.496 on PC0 — it's the biggest driver of content_richness.
Signal | PC0 content_richness | PC1 explicitness | PC2 structural_clarity | PC3 understanding_speed
deep_score | +0.496 | −0.014 | −0.168 | +0.014
scroll_score | +0.489 | −0.037 | −0.244 | +0.080
imm_score | +0.477 | −0.003 | −0.276 | +0.053
title_confidence | +0.301 | −0.291 | +0.249 | −0.081
value_prop_confidence | +0.262 | +0.145 | +0.334 | −0.262
description_confidence | +0.189 | −0.338 | +0.360 | −0.195
heading_confidence | +0.205 | +0.418 | −0.094 | −0.387
intent_confidence | +0.106 | −0.029 | +0.636 | −0.092
sentence_length_score | +0.102 | +0.313 | +0.286 | +0.659
word_count_norm | −0.084 | +0.529 | +0.021 | −0.444
semantic_density_norm | −0.147 | −0.473 | −0.193 | −0.295

Understanding curve scores (imm/scroll/deep) dominate PC0 with loadings > 0.47 — the strongest PCA dimension is driven entirely by progressive AI comprehension, not metadata.

04
Site Landscape
Plain English: This maps training sites by their two biggest PCA dimensions. Sites to the right score high on content richness (an AI understands them deeply). Sites higher up score high on explicitness (lots of plain written words). Best sites are top-right.
High visibility · GT > 65
Mid visibility · GT 50–65
Low visibility · GT < 50
Site | Heuristic | ML | GT
Linear | 79 | 81 | 81
Trigger | 85 | 76 | 78
Stripe | 87 | 70 | 70
Upstash | 75 | 69 | 70
Together | 84 | 68 | 69
Anthropic | 81 | 65 | 66
Resend | 85 | 65 | 65
Anyscale | 89 | 64 | 64
Mistral | 82 | 56 | 55
Retool | 88 | 56 | 56
Webflow | 75 | 43 | 43
Cal.com | 64 | 43 | 42
OpenAI | 50 | 44 | 44
Canva | 47 | 42 | 42
Spotify | 29 | 33 | 33

PC0 vs PC1 — training site positions

content_richness (x-axis) vs explicitness (y-axis)

Low PC0 = AI struggles even after full read. Low PC1 = site lacks plain explicit language. Both axes required for strong AI visibility.

Heuristic vs ML vs Ground Truth

15 sites across the score range — ML pulls inflated heuristic scores toward ground truth

Plain English: Grey = rule-based score. Blue = ML score. Green = the correct answer. Wherever grey is much taller than green, the rule-checker was too generous.
Legend: Heuristic · ML · Ground Truth

Heuristic over-scores brand-forward sites (Anyscale +25, Stripe +17, Retool +32). ML corrects by weighting the understanding curve more heavily (52% of decisions).

05
Signal Comparison
Plain English: Comparing Stripe (GT 69.8) vs Vercel (GT 63.9), two strong developer sites. The bigger the shape, the better. Stripe's hero and description are near-perfect — that's why it scores higher despite Vercel having strong structural signals.

Stripe vs Vercel — signal radar

How two strong developer sites compare signal by signal

Stripe · GT 69.8
Vercel · GT 63.9

Stripe's hero and description are near-perfect. Vercel has solid structure but weaker progressive understanding — explaining the gap.

Dominant weakness distribution

Most common PCA weakness across 15 representative sites

Plain English: For each site, we find their lowest PCA dimension — the one they need to fix most. explicitness is the most common problem across the training set, with understanding_speed close behind.
explicitness · 6 sites
understanding_speed · 5 sites
structural_clarity · 2 sites
content_richness · 2 sites

explicitness is the most common weakness (6 of 15 sites), with understanding_speed close behind (5) — most sites have decent structure but too little plain explicit language for an AI to reach clarity quickly.

06
Signal Definitions
Plain English: Every signal is extracted directly from your site's HTML — no guessing, no AI interpretation. Here's exactly what each one measures and why it matters.
title_confidence (Structure) · 0–1

Is the page title descriptive enough for an AI to know what you do?

Word count, product category words (platform, tool, ai), and benefit descriptors. H1 compensates if title is brand-only.

description_confidence (Structure) · 0–1

Does the meta description explain your product clearly in 50–160 chars?

Meta description scored for capability language, audience terms, and optimal length (50–160 chars).

heading_confidence (Structure) · 0–1

Do the headings form a logical hierarchy and cover diverse topics?

H1 presence, no level skips, plus semantic diversity via TF-IDF cosine across all headings.
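The semantic-diversity half of this signal can be sketched with plain term-count vectors; real TF-IDF weighting and the exact aggregation used in production are not specified on this page, so treat this as an illustration:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def heading_diversity(headings: list[str]) -> float:
    """Mean pairwise dissimilarity (1 - cosine) across headings.

    Near-duplicate headings score ~0; topically distinct ones score ~1.
    """
    vecs = [Counter(h.lower().split()) for h in headings]
    pairs = [(i, j) for i in range(len(vecs)) for j in range(i + 1, len(vecs))]
    if not pairs:
        return 0.0
    return sum(1 - cosine(vecs[i], vecs[j]) for i, j in pairs) / len(pairs)
```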

hero_confidence (Structure) · 0–1

Does the hero section explain what you do, or is it just a tagline?

Hero section word count, category and benefit language. Penalises repetitive or nav-polluted text.

value_prop_confidence (Value Proposition) · 0–1

Can an AI identify your product type, who it's for, and why it's useful?

Pattern-matches product type, target audience, and benefits in hero and title (3× weight) and full page text.

high_info_ratio (Semantic Quality) · 0–1

What fraction of sentences actually say something useful vs filler?

Fraction of sentences above informativeness threshold. Scored on action verbs, numbers, AI phrases, unique word ratio.

breadth_score (Content Depth) · 0–1

Does the page cover multiple distinct topics, or just repeat the same idea?

Clustering on paragraph vectors to count distinct topic clusters, normalised to expected count for site type.

avg_sentence_length (Semantic Quality) · words

Are sentences the right length? Too short = choppy. Too long = hard to parse.

Mean words per sentence. Ideal 15–20 words. Gaussian penalty in both directions.
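A Gaussian penalty of this shape can be sketched as follows; the 15–20 word band comes from this page, while the width `sigma` is an illustrative choice:

```python
import math

def sentence_length_score(avg_words: float, low: float = 15.0,
                          high: float = 20.0, sigma: float = 5.0) -> float:
    """1.0 inside the ideal 15-20 word band, Gaussian falloff outside.

    Both very choppy (short) and very dense (long) sentences are
    penalised symmetrically, as described above.
    """
    if low <= avg_words <= high:
        return 1.0
    edge = low if avg_words < low else high
    return math.exp(-((avg_words - edge) ** 2) / (2 * sigma ** 2))
```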

imm_score (Understanding) · 0–100

First impression — how much does an AI understand from just the hero section?

BFS traversal score at hero/top-level depth.

scroll_score (Understanding) · 0–100

Mid-read — does understanding improve as the AI reads further down the page?

BFS traversal score at mid-depth sections. Clarity after reading past the hero.

deep_score (Understanding) · 0–100

Full-read — at peak comprehension, how well does an AI understand the site?

BFS traversal score at full depth. Single most important feature (19.1%).

No black boxes.

Every number in Brandioz is computable from the raw HTML of your site. The weights on this page are the exact weights in production.

If a score is worth trusting, you should be able to understand exactly how it was computed — signal by signal, weight by weight.

Production formula

score = (
  title_conf     × 10   # structure
  + desc_conf    × 10
  + hero_conf    ×  8
  + heading_conf ×  5
  + log(words, 22)      # content depth
  + info_score   × 12   # semantic quality
  + sent_score   ×  4
  + diversity    ×  6
  + value_prop   × 18   # value prop
  + hierarchy    ×  5
  - curve_penalty       # understanding curve
)

final = 0.6 × ml + 0.4 × heuristic
# coherence engine may override
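The same formula, made runnable as a sanity check. Signal names are shortened as in the pseudocode above; `word_pts` stands in for the saturating word-count curve (0–22), and the coherence-engine override is not modelled:

```python
def heuristic_score(sig: dict) -> float:
    """Weighted sum from the production formula. All inputs are 0-1
    confidences except word_pts (0-22) and curve_penalty (raw points)."""
    return (
        sig["title_conf"] * 10          # structure
        + sig["desc_conf"] * 10
        + sig["hero_conf"] * 8
        + sig["heading_conf"] * 5
        + sig["word_pts"]               # content depth, log curve
        + sig["info_score"] * 12        # semantic quality
        + sig["sent_score"] * 4
        + sig["diversity"] * 6
        + sig["value_prop"] * 18        # value proposition
        + sig["hierarchy"] * 5
        - sig["curve_penalty"]          # understanding curve
    )

perfect = dict(title_conf=1, desc_conf=1, hero_conf=1, heading_conf=1,
               word_pts=22, info_score=1, sent_score=1, diversity=1,
               value_prop=1, hierarchy=1, curve_penalty=0)
# A perfect site earns exactly 100 — the weights sum to 100 as claimed.
```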