Clicks don’t count (and they never did)
For most of its history, SEO measured the wrong layer.
Not deliberately. Simply because it was the only layer we could see.
Rankings, impressions, clicks, visits, authority scores. These became the discipline’s core metrics because the search interface exposed them. They moved when we changed things. They could be graphed, compared, and reported. They gave us something to optimise.
But they were never measures of competitiveness. They were measurements of an interface.
Search results pages are a presentation layer. They show documents retrieved by a system that is attempting to approximate preference, reputation, and relevance. When we measure rankings or traffic, we are observing the behaviour of that surface rather than the forces beneath it.
That distinction is subtle, but it matters.
A ranking change tells you that the retrieval system reordered documents. It does not tell you whether the brand became more desirable, the product more competitive, or the market more interested. Traffic behaves similarly. A page receiving more visits might reflect improved relevance, stronger reputation signals, or simply the way Google chose to render the results page that day.
Even the industry’s favourite composite metrics were always a kind of confidence theatre. Authority scores compress complex signals into tidy numbers that look reassuringly scientific, even though they represent no real-world quantity.
None of this means those metrics were useless. They were pragmatic approximations, built on the limited visibility the search interface allowed. They helped an entire discipline emerge and mature.
But they quietly shaped how SEO thought about success. If rankings rose, the strategy must be working. If traffic grew, the brand must be winning.
In reality, we were optimising the part of the system we could see.
And for a long time, that was close enough.
AI didn’t break SEO. It exposed it.
That model held together for a long time because the interface behaved in ways that made those signals feel meaningful. If a page ranked higher, it generally did receive more clicks. If more people searched for a brand, branded traffic rose. The surface reflected reality just closely enough that we could pretend the two were the same thing.
AI systems make that pretence much harder to sustain.
Search engines are no longer evaluating documents in isolation and presenting a list for users to interpret. Increasingly, they are aggregating signals about entities, products, organisations, and reputations across the entire web, and synthesising those signals into answers, summaries, and recommendations.
And, as evaluation systems broaden, optimisation surfaces shrink.
Early search engines evaluated documents. That allowed optimisation to focus on pages, keywords, and links. The surface area for tactical wins was large because the evaluation system was relatively narrow.
AI systems evaluate entities across a much wider landscape. They aggregate signals about products, brands, customer experiences, reputation, and market behaviour across the web.
Instead of asking which document best matches a query, these systems ask which brands, products, or sources are credible enough to represent the answer. The retrieval layer still exists, but the decision is increasingly made before the interface ever renders a ranked list.
This has an uncomfortable implication for the way SEO has traditionally worked.
For years, it was possible to separate discoverability from desirability. A page could rank well simply because it was well structured, well optimised, or had more links. Users might not particularly prefer the brand behind it, but the retrieval system would surface the document anyway.
That separation was never a fundamental property of search. It was a quirk of how early retrieval systems worked.
AI systems collapse that gap.
When models aggregate signals from across the web, they are not evaluating pages as isolated artefacts. They are evaluating brands, products, experiences, and reputations. The question becomes less “which page should appear first?” and more “which entity deserves to represent this answer?”
In that world, discoverability without desirability stops working.
The uncomfortable truth is that SEO is not being displaced by AI. It is being forced to reconnect with the things search engines were always trying to measure in the first place.
The prompt tracking trap
Faced with this shift, the industry has done what it always does when the interface changes.
It built a new set of interface metrics.
Prompt tracking. LLM visibility monitoring. Share-of-answer dashboards. Entire tool categories have appeared to tell brands how often they are mentioned in AI responses, how prominently they appear, and how those appearances change over time.
The appeal is obvious. Prompt tracking feels like control.
These systems feel like the new rankings. Ask a question, record the answer, count the mentions. If your brand appears more often, you must be winning.
But this recreates the same mistake we just spent two decades making.
Prompt outputs are not stable demand units. They are the visible behaviour of an interface that sits on top of a far more complex decision process. Change the phrasing slightly and the answer changes. Ask the same question twice and you may receive a different response. Introduce a different model, a different retrieval layer, or a different training update and the entire output landscape shifts again.
The variance is not noise. It is a feature of the system.
That means prompt tracking tends to measure the volatility of the interface rather than the strength of the underlying signals. A brand appearing in an answer does not necessarily mean it is preferred, trusted, or even particularly relevant. It means the model happened to include it in that specific response under those specific conditions.
Inclusion is not preference.
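You can see how fragile those counts are with a little sampling. Here is a minimal sketch of the idea, not a real tool: ask several paraphrases of the same intent, repeat each one, and measure how often a brand is mentioned at all. The `ask_model` function is a hypothetical placeholder (here a random fake, purely so the sketch runs); in practice it would call whichever LLM API you use.

```python
import random
import re
from statistics import mean, stdev

# Hypothetical stand-in for an LLM call. This fake randomly includes
# the brand so the sketch is runnable; a real version would send the
# prompt to a model and return its answer text.
def ask_model(prompt: str) -> str:
    if random.random() < 0.6:
        return "ExampleBrand and a few others are worth a look."
    return "Several brands are worth a look."

# Several phrasings of the same underlying intent.
paraphrases = [
    "What's the best dashcam for a small car?",
    "Recommend a reliable dashcam.",
    "Which dashcam should I buy?",
]

def inclusion_rate(brand: str, prompt: str, runs: int = 10) -> float:
    """Fraction of repeated runs in which the brand is mentioned at all."""
    pattern = re.compile(re.escape(brand), re.IGNORECASE)
    hits = sum(bool(pattern.search(ask_model(prompt))) for _ in range(runs))
    return hits / runs

rates = [inclusion_rate("ExampleBrand", p) for p in paraphrases]
print(f"mean inclusion: {mean(rates):.0%}, spread: {stdev(rates):.0%}")
```

The spread across paraphrases is the volatility the rest of this section describes. Averaging it away does not turn it into preference.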
This is the same trap rankings created. Observing the output of the interface feels like observing the system itself. The numbers move, charts update, dashboards fill with reassuring graphs. It looks like measurement.
But what it mostly captures is the behaviour of the presentation layer.
If the goal is to understand competitiveness in an environment where AI systems synthesise signals from across the web, then counting mentions in generated answers is just another way of measuring the surface.
And the surface is not where the decision is made.
Marketing science already solved this
Once you step away from the interface, the measurement problem starts to look much less mysterious.
Marketing has been wrestling with it for decades.
The central insight from the Ehrenberg-Bass Institute, and from Byron Sharp’s work in particular, is that growth does not primarily come from persuasion or optimisation. It comes from availability. Brands grow when they are easy to notice, easy to recall, and easy to buy.
Sharp describes this in two dimensions: mental availability and physical availability.
Mental availability is the likelihood that a brand comes to mind in buying situations. Physical availability is the likelihood that the brand is actually present where those buying situations occur.
Most marketing measurement frameworks orbit those ideas in one way or another. Brand salience, distribution coverage, share of search, distinctive asset recognition, market penetration. Different industries use different language, but they are usually describing the same underlying mechanics.
None of this is new. What is new is that AI systems operationalise these signals in ways the search interface never did.
Large language models do not retrieve pages purely because the pages are technically well structured. They synthesise signals about which brands, products, and sources appear credible, familiar, and consistently associated with a topic across the web. That process naturally favours entities with strong availability signals.
In other words, the signals marketing science has been studying for decades are increasingly the same signals machines use to decide who gets recommended.
Which means the measurement question is not really about prompts, rankings, or visibility.
It is about competitiveness.
And, importantly, many of these signals cannot be inferred from behavioural data alone. Mental availability, brand salience, recall, and perception are typically measured through surveys, panel studies, and market research. They require asking people what they remember, recognise, and associate with a category.

In other words, the most meaningful metrics in this system are not extracted from the interface. They are observed directly from the market.
A framework for measuring competitiveness
If the question shifts from “how visible are we?” to “how competitive are we?”, then the measurement model needs to change with it.
Visibility is an outcome. It reflects something about the strength of the underlying signals, but it does not explain them. If we want to understand why a brand is being surfaced, cited, recommended, or chosen, we have to measure the factors that actually shape those decisions.
Those factors tend to cluster into a handful of structural capabilities.
Not SEO tactics. Not ranking signals. Capabilities that determine whether a brand is easy to notice, easy to trust, and easy to choose.
I’ve defined a model in which those capabilities fall into six broad dimensions:
- Experience integrity
- Physical availability
- Mental availability
- Distinctiveness
- Reputation
- Commercial proof
These are not SEO metrics. They are structural properties of competitiveness.
And when those signals strengthen, something interesting tends to happen.
Visibility improves.
Competitiveness is structural
The interesting thing about those six dimensions is that they are not tactics. They are structural properties of a brand in a market.
You cannot meaningfully “optimise” them in the narrow, mechanical sense SEO grew up with. You strengthen them by building better products, being present in more places, earning trust, and making the brand easier to recognise and remember.
That process usually starts with the most unglamorous of the six: experience integrity.
If the underlying experience is unreliable, confusing, slow, or disappointing, everything else eventually erodes. Customers abandon purchases, reviews skew negative, recommendations dry up, and reputation becomes unstable. The web accumulates evidence of those failures remarkably quickly, and modern search and AI systems are extremely good at noticing those patterns.
Above that foundation sits availability.
Brands only get chosen in places where they are present. Physical availability describes how widely a product or service is distributed across the environments where buying decisions occur: marketplaces, retailers, comparison sites, review platforms, directories, documentation hubs, and structured data ecosystems.
Mental availability sits alongside this. It describes whether the brand actually comes to mind when a buying situation occurs. Byron Sharp refers to these situations as category entry points, the moments and motivations that trigger a purchase.
When someone thinks “I need a new dashcam” or “I should upgrade my running shoes”, certain brands appear in their mental shortlist. Branded search demand, direct traffic, and share of search are imperfect but useful clues about how often that happens.
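Share of search is the simplest of those clues: a brand’s branded search volume as a fraction of the category’s total. A toy calculation, with invented volumes:

```python
# Invented monthly branded search volumes for one category.
category_search_volume = {
    "BrandA": 40_000,
    "BrandB": 25_000,
    "BrandC": 10_000,
}

total = sum(category_search_volume.values())
for brand, volume in category_search_volume.items():
    print(f"{brand}: {volume / total:.0%} share of search")
# BrandA: 53%, BrandB: 33%, BrandC: 13%
```

It is a proxy, not a measure of salience itself, which is why the survey-based methods above still matter.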
The key point is that mental availability and physical availability amplify one another.
A brand that is easy to recall but hard to buy wastes demand. A brand that is widely distributed but rarely remembered struggles to generate it in the first place.
That interplay leads naturally to distinctiveness and reputation.
Distinctiveness is the mechanism that makes mental availability stick. Brands that can be recognised quickly impose less cognitive load on the people encountering them. Visual assets, naming conventions, tone of voice, and consistent messaging all contribute.
Reputation then stabilises the picture. It reflects the narrative that accumulates around a brand across the web: reviews, editorial coverage, expert commentary, and community discussions. Importantly, reputation is not just about positivity. Stability and coherence matter just as much.
Finally, there is commercial proof.
This is where competitiveness becomes difficult to fake. When customers repeatedly choose a brand even when alternatives are available, the market is sending a clear signal. Conversion efficiency, price resilience, retention, repeat purchase behaviour, and referrals all reveal whether the brand is genuinely preferred.
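Most of those signals reduce to simple ratios over data a business already holds. A minimal sketch with invented order records, just to show the shape of the measurement:

```python
from collections import Counter

# Invented order history: one customer id per purchase.
orders = ["c1", "c2", "c1", "c3", "c2", "c1", "c4"]

purchase_counts = Counter(orders)
customers = len(purchase_counts)
repeaters = sum(1 for n in purchase_counts.values() if n >= 2)

print(f"repeat purchase rate: {repeaters / customers:.0%}")  # 50%
```

No single ratio is impressive on its own. The signal is in whether these numbers hold up when customers have obvious alternatives.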
Taken together, these capabilities describe something much closer to real competitiveness than any ranking dashboard ever could.
Visibility is the reflection layer
When those structural capabilities strengthen, something predictable tends to happen.
The brand becomes easier to find.
Search visibility rises. Mentions increase. AI systems reference it more often. Traffic grows. The dashboards that SEOs have traditionally relied on begin to move in the right direction.
But those movements are reflections of the underlying competitiveness, not the mechanism that created it.
Because visibility is observable, it is tempting to treat it as the thing being optimised. Rankings become targets. Mentions become goals. Prompt inclusion becomes a KPI.
But the relationship runs the other way.
Visibility sits on top of the system. It reflects the aggregated signals beneath it. When a brand improves its experience, expands its availability, becomes easier to recall, earns stronger reputation signals, and converts customers more effectively, those changes propagate through the web.
Search engines and AI systems then aggregate those signals and surface the result.
This is also why visibility in AI environments often feels volatile.
Large language models synthesise signals probabilistically from a constantly shifting information landscape. As new evidence appears, reviews accumulate, products improve, competitors launch, and narratives evolve, the aggregate signal changes.
Prompt tracking dashboards tend to interpret this volatility as instability in the models themselves. In many cases it is simply the surface expression of a system that is continuously re-evaluating the competitive landscape.
If visibility is the reflection layer, then the job is not to manipulate the reflection.
It is to strengthen the signals that the reflection is built from.
What you measure shapes what you optimise
Metrics are never neutral.
The numbers a discipline chooses to measure quietly shape the behaviour of everyone working inside it. They determine what gets prioritised, what gets funded, and what gets reported as success.
For most of SEO’s history, the dominant metrics described the behaviour of the search interface. Rankings improved, traffic increased, visibility scores climbed. Those numbers encouraged a style of optimisation focused on influencing the presentation layer of search results.
Sometimes that produced genuinely useful improvements.
But it also encouraged an enormous amount of effort aimed at manipulating surface signals. Chasing rankings. Manufacturing links. Expanding keyword footprints. Building reporting systems around numbers that looked meaningful but were several steps removed from the thing businesses actually cared about.
The shift happening now forces a different set of questions.
If search engines and AI systems increasingly evaluate brands through aggregated signals about experience, reputation, availability, and preference, then the meaningful metrics are the ones that describe those conditions.
Experience integrity.
Mental and physical availability.
Distinctiveness.
Reputation.
Commercial proof.
These are harder things to measure than rankings or traffic. They move more slowly and rarely fit neatly into a single dashboard. But they are much closer to the forces that determine whether a brand gets recommended, remembered, and chosen.
Which means they encourage a different kind of optimisation.
Instead of trying to influence the interface, the work shifts toward strengthening the things that make a brand genuinely competitive.
Improve the product.
Expand distribution.
Build distinctive assets.
Earn trust.
Create experiences customers actually prefer.
When those things improve, the interface tends to follow.
Visibility increases. Mentions become more frequent. Traffic grows. AI systems surface the brand more often, not because someone optimised a prompt, but because the accumulated evidence about the brand changed.
Which brings us to the uncomfortable conclusion SEO is slowly rediscovering.
Clicks didn’t count.
Competitiveness does.