
Optimising for the surfaceless web

When I wrote about the machine’s emerging immune system, I argued that AI ecosystems would eventually learn to protect themselves. They’d detect manipulation, filter noise, and preserve coherence. They’d start to decide what kinds of information were safe to keep, and which to reject.

That wasn’t a prediction of some distant future. It’s happening now.

Every day, the surface of the web is scraped, compressed, and folded into the models that power the systems we increasingly rely on. In that process, most of what we publish doesn’t survive contact. Duplicate content dissolves. Contradictions cancel out. Persuasive noise is treated as waste heat and vented into the void.

What remains isn’t the web as we know it – it’s something flatter, quieter, and more internal. A composite understanding of the world, shaped by probability and consensus. A compressed version of reality, organised for machines to reason about rather than for humans to explore.

And that’s the new landscape we’re marketing into: one with no pages, no clicks, and no real surface at all.

We built our disciplines on visibility. On the idea that success meant being seen. But as the machine closes itself off, as the interfaces between human experience and machine understanding grow thinner, visibility starts to lose meaning.

The web we built had surfaces. We could design them, decorate them, and fight for space upon them.
That web is gone. There are no more surfaces – only systems. So the question isn’t how to stand out anymore. It’s how to stay inside.

Visibility is legacy

For most of the web’s history, marketing has been an act of performance. You built something on the surface – a website, a post, a page – and hoped that enough people, and enough algorithms, would look at it. The goal was to be discoverable. Searchable. Indexed.

That surface once mattered. It was where meaning lived; a thin film of presentation stretched across a sea of content. We learned how to decorate it, how to manipulate its texture, how to catch the light.

But surfaces don’t mean much to systems that don’t look.

AI doesn’t browse, or scroll, or click. It doesn’t see the things we publish as artefacts, or experiences, or journeys. It sees them as data: a dense accumulation of words and links and relationships. What was once a web of pages becomes, in its eyes, a map of connections and probabilities.

As those systems grow, the visible layer stops being the web’s living tissue and becomes something closer to a residue. A user interface. A translation layer between the model’s internal state and our human need for touchable things.

What we still see – the search results, the snippets, the website visits – are just reflections of a deeper process. The model has already decided what matters before anything becomes visible to us.

Visibility, then, is no longer a measure of relevance. It’s an echo. A by-product of how well you exist in the model’s internal map of the world.

Marketing, then, can’t rely on that shadow anymore. You can’t simply perform on the surface and expect to be understood below it. You have to exist in the substrate – in the model’s inner representation of reality – where things are no longer seen, only known.

The game isn’t about visibility anymore. It’s about becoming part of the organism – being recognised as useful, true, and integral to its functioning.

The machine’s new selection pressure

Every cycle of training, pruning, and retraining redraws the boundaries of what the machine believes to be true.

Information doesn’t simply disappear; it’s weighed, compared, and either reinforced or allowed to fade. Fragments that align with the model’s broader understanding are retained and strengthened. Those that contradict it, or contribute nothing new, dissolve quietly into statistical noise. This isn’t malice – just maintenance; the system trimming what no longer fits.

From a distance, it looks a lot like natural selection.

Ideas that are clear, consistent, and corroborated tend to persist. They become anchor points in the system’s internal map of the world. Messy, manipulative, or self-serving signals lose fidelity with each generation until they’re effectively gone.

In a web of endless redundancy and synthetic repetition, that selection pressure is profound. Only the most stable patterns survive ingestion, compression, and re-ingestion. Everything else – the campaigns, the slogans, the noise – burns off in the friction between training cycles.

That’s what survival looks like now. Not attention. Not engagement. But persistence.

Survival traits: Useful. True. Integral.

If visibility has given way to persistence, the question becomes: what kinds of information survive? What traits make something worth keeping, when the machine is constantly distilling and rewriting the world?

Three patterns seem to emerge. They’re not rules so much as instincts – signals that the system tends to reinforce rather than erase.

Useful

The model keeps what helps it make better predictions. Content that clarifies, contextualises, or resolves ambiguity adds weight. It’s nutritious to the system. It strengthens its ability to connect ideas and answer questions.

Marketing, by contrast, has often specialised in the opposite – in the art of adding noise to capture attention. In a surface-driven world, that worked. In a closed, self-editing one, it reads as contamination.

To persist, information has to improve the model’s understanding of reality. In practice, that means producing things that make the broader web smarter – content that’s cited, contextualised, and consistent wherever it’s encountered.

True

Truth, in this context, isn’t about moral correctness. It’s about statistical coherence – about holding the same shape across many perspectives.

Signals that remain consistent across time, context, and corroboration develop gravity. They become stable landmarks in the machine’s internal map of the world. Contradictions, rebrands, and rhetorical pivots weaken that gravity until the entity itself loses definition.

If the machine can’t tell what you are, it stops believing you exist.

Integral

The most resilient information isn’t the loudest or the most original. It’s the most connected.

Ideas that are cited, linked, quoted, or built upon become structural. They help to shape the scaffolding that other knowledge depends on. Delete them, and the system’s understanding weakens.

That’s what it means to be integral – to occupy a position where absence would create tension. To be so entangled in the model’s sense of how the world fits together that forgetting you would break something.

These are the new markers of durability: usefulness, truthfulness, and integration – the traits that make you hard to erase.

Marketing as symbiosis

If the machine is learning what to keep, then marketing becomes an act of coexistence. The challenge isn’t to outsmart the filters or to force attention; it’s to make yourself part of the pattern the system depends on.

In a sense, the best marketing has always done this. The most durable brands, ideas, and voices are those that made themselves useful to the environment they occupied – not through volume, but through contribution. What’s changed is that the environment is now synthetic, recursive, and self-curating. The audience is a model.

You don’t persuade a model. You support it – with clarity, consistency, and connection, the same qualities it’s built to recognise and reward.

That means thinking less about performance and more about participation. Your goal isn’t to win visibility, but to earn residency – to become something the machine recognises as part of its metabolism.

The idea of “brand” shifts in that light. A brand isn’t just a promise to people anymore; it’s also a pattern that helps machines resolve ambiguity – a consistent cluster of language, data, and context that supports their confidence. When your language, data, and presence align in ways that improve the system’s accuracy, you stop being external. You become infrastructure.

And once that happens, you’re no longer competing for attention at all. You’re competing for existence.

Temporal persistence

Nothing inside the machine stays still for long.

Each new wave of training reshapes the landscape – compressing, merging, and overwriting its understanding of the world. Entities drift. Meanings decay. Facts are reinterpreted. The machine forgets as quickly as it learns.

In that churn, persistence becomes an active process. You don’t exist once; you exist continuously, by reasserting your pattern every time the world is rewritten.

That doesn’t mean producing endless noise. It means maintaining coherence – allowing the traces of your presence to echo across time, context, and medium. The goal isn’t repetition, but reinforcement.

Old marketing rewarded novelty. The machine rewards consistency. It trusts what endures. It preserves what it can still recognise after compression, translation, and paraphrase.

To survive, treat your reputation, language, and data as living systems – resilient enough to deform without breaking.

Most campaigns are built to capture a moment. Few are built to survive a model update.

Existence as marketing

The machine immune system doesn’t just defend against manipulation. It shapes what becomes visible, repeatable, and credible across the systems we now rely on. Every model update redraws those boundaries a little – deciding what information stays accessible, and what quietly fades.

Our role inside that process is changing. Marketing used to be the art of visibility: we tried to be seen, to be remembered, to be chosen. Now, in a system that no longer looks in any human sense, visibility has no real meaning. What matters is whether we’re included in the model’s internal understanding of how the world works – whether we’re kept.

That’s the strange new responsibility of marketing: to make our presence valuable enough that deletion would leave a gap.

The future belongs to those who remain useful, true, and integral – not as moral virtues, but as traits the system depends on. The rest will fade quietly, not from malice but from optimisation.

The machine will keep compressing the world, folding it inward until there are no real surfaces left – only patterns it trusts to describe reality. Our task is to make sure we’re still one of them.

Because the game isn’t about visibility anymore. It’s about viability – about whether the machine still remembers you when it dreams.


  •  

The Hotmail effect

A plumber drove past me last week with a Hotmail email address painted proudly on the back of his van.

No logo. No tagline. No QR code. Just “Plumbing Services”, and an email address that probably hasn’t changed since Windows XP was still popular.

For most of my career, that might have been all I needed to dismiss them. Probably unprofessional. Probably unsophisticated. Probably cash-in-hand. The kind of person who might install your sink backwards, and vanish with your money.

We used to treat things like that as red flags. If you looked unpolished, you were unprofessional. That was the rule.

And those assumptions didn’t come from nowhere – we engineered them. An entire generation of marketers trained businesses to look the part. We told them that trust was a design problem. That confidence could be manufactured. Custom domains. Grid-aligned logos. Friendly sans-serifs and a reassuring tone of voice. We built an industry around polish and convinced ourselves that polish was proof of competence.

But maybe that plumber’s Hotmail address is saying something else.

A human signal

Because in today’s web – a place that somehow manages to be both over-designed and under-built – a Hotmail address might be the last surviving signal of authenticity.

Most of what we see online now tries to look perfect. It tries to read smoothly, to feel effortless, to remove any friction that might interrupt the illusion. But most of it’s bad. Heavy. Slow. Repetitive. Polished in all the wrong ways.

We’ve built an internet that performs competence without actually being competent.

That van doesn’t need a strategist or a brand guide. It doesn’t need a content calendar or a generative workflow. It’s probably been trading under that same email for twenty years – longer than most marketing agencies survive.

And maybe that’s the point. In a world of synthetic competence – of things that mimic expertise without ever having earned it – the rough edges start to look like proof of life.

Because when everything looks professional, nothing feels human. Every brand, every website, every social feed has been tuned into the same glossy template. Perfect kerning, soft gradients, human-but-not-too-human copy. It’s all clean, confident, and hollow.

We’ve flattened meaning into usability. The same design systems, tone guides, and “authentic” stock photography make everything look trustworthy – and therefore suspicious.

It’s the uncanny valley of professionalism: the closer brands get to looking right, the less we might believe them.

The economy of imperfection

So authenticity becomes the next luxury good.

We start to crave friction. We look for cracks – the typos, the unfiltered moments, the signs of human hands. The emails from Hotmail accounts.

And as soon as that desire exists, someone finds a way to monetise it. The aesthetic of imperfection becomes an asset class.

You can see it everywhere now: “shot on iPhone” campaigns, lo-fi ads pretending to be user-generated content, influencers performing spontaneity with agency scripts and lighting rigs.

It’s a full-blown economy of imperfection, and it’s growing fast. The market has discovered that the easiest way to look real is to fake it badly.

The collapse of signal hierarchies

This is what happens when authenticity itself becomes synthetic.

Every signal that once conveyed truth – professionalism, polish, imperfection – can now be generated, packaged, and optimised.

We can fake competence. We can fake vulnerability. We can fake sincerity.

And when everything can be faked, the signals stop working. Meaning collapses into noise.

That’s the broader story of the synthetic web – a world where provenance has evaporated, and where all signals eventually blur into the same static.

The algorithmic loop of trust

Social media has made this worse.

Platforms are teaching us what “authentic” looks like. They amplify the content that triggers trust – the handheld shot, the stuttered delivery, the rough edge. Creators imitate what performs well. The algorithm learns from that imitation.

Authenticity becomes a closed loop, refined and replicated until it’s indistinguishable from the thing it imitates.

We’ve turned seeming human into a pattern that machines can optimise.

The uncanny mirror

That’s the bit that gets under my skin.

Maybe the plumber with the Hotmail address isn’t a relic at all. Maybe he’s the last authentic node in a network that’s otherwise been wrapped in artificial sincerity.

He’s not optimised. He’s not A/B‑tested. He’s just there. Still trading. Still human.

And maybe that’s why he stands out.

Once, a Hotmail address meant amateur hour. Now, it might mean I’m still real. Or maybe it’s just another costume – the next iteration of authenticity theatre. After all, businesses (and their vans, and their email addresses) can certainly just be bought.

Either way, the lesson’s the same. Real things age. They rust, wear down, and carry their history with them. That’s what makes them trustworthy.

And the most convincing thing you can be might just be imperfect.


  •  

The death of a website

They called me just after midnight. Said they’d found another one. A website that used to be fast, lean, full of life, now lying cold on the server room floor.

I threw on my coat and grabbed my toolkit. Chrome DevTools. A strong coffee. A sense of déjà vu.

When I arrived, the place was already crawling with engineers. The kind with ergonomic chairs and stress twitches. Nobody met my eye. They all knew how this story ended.

The homepage was still up on a monitor, looping a loading animation like a bad joke. Looked fine from across the room – gradients, glassy buttons, a hero image big enough to land a plane on. But the moment I hit View Source, I knew I was staring at another corpse.

Same story as the last dozen. No pulse, no structure, no soul. Just a tangle of wrappers, scripts, and styles, every line stepping on the toes of the one before it. A thousand dependencies all arguing about whose fault it was.

I scrolled deeper. The markup was dense and panicked, trying to hold itself together.

Once upon a time, you could tell what a site was for just by glancing at its code. Headlines, paragraphs, lists – neat little rows of intention. You could read it. Like a book.

Now it’s all ceremony and scaffolding.

Somewhere in the mess, I caught a glimpse of the old world – a clean <h1>, a fragment of actual content. The outline of meaning, trapped under layers of build artefacts. It was enough to remind me what this used to be. What it should be.

Cause of death? Same as always. Blunt-force abstraction.

Weapon of choice: framework, heavy calibre. Probably React, maybe Vue.

You can tell by the wounds – duplicated imports, mismatched modules, hydration scripts still twitching long after the heartbeat’s gone.

Nobody meant to kill it. They never do.
They just wanted to make it better. Faster. Scalable.

Every dev swears they’re saving time; none of them notice the cost.

The logs told the rest of the story. Requests timing out. Scripts looping. Memory bleeding out by the kilobyte. Third-party tags nesting like rats in the foundation. The database crying for help in stack traces nobody reads.

In the corner, a junior developer was pacing. Fresh out of bootcamp, eyes wide, wearing a hoodie that said Move Fast, Fix Things. He told me they’d followed all the best practices – modular architecture, headless CMS, microservices, graph queries, everything in the cloud.

I nodded. That’s how they all start. Then they optimise for the wrong things until there’s nothing left to save.

He said the Core Web Vitals were nearly in the green. That they’d made great progress on improving their LCP scores this quarter.

I told him I’d seen those reports before. Clean charts, broken sites. Numbers that make investors smile while users drown in latency.

You can fake a pulse for a while. You can’t fake a heartbeat.

There was a silence then, the kind that fills a room after truth walks in.

I closed the laptop, took a sip of my coffee, and stared at the glow on the wall. It wasn’t the first site I’d seen go down this way. It wouldn’t be the last.

They think it’s random, a one-off, a bad deploy. But I’ve been on this beat long enough to know a pattern when I see one.

The whole city’s crawling with corpses – corporate homepages bloated on tracking scripts, e‑commerce storefronts drowning in dependencies, portfolios buckling under their own JavaScript. Every week there’s another obituary, another site that “just stopped working” after the last sprint – traffic and visibility flatlined.

And still, they keep building. New tools, new frameworks, new ways to wrap the same mistakes in shinier wrappers. I see the same scene play out over and over. PMs asking for “more interactivity”. Designers chasing whatever Apple’s doing this week. Engineers stacking abstraction on abstraction, until the original content’s just a rumour.

Nobody writes websites anymore. They assemble them – out of parts they don’t understand, shipped by systems they can’t see, to deliver experiences nobody asked for.

By morning, they’ll push a patch. They’ll call it a fix. And the corpse will twitch again – enough to convince the client that it’s still alive.

I pack up my gear. Another case closed. Another line in the report.

Cause of death: systemic neglect. Contributing factors: build complexity, managerial optimism, chronic misuse of JavaScript.

Outside, the rain’s coming down hard, pixelated by the streetlights. I watch the reflection of a thousand broken sites shimmer in the puddles.

The web used to be a city that never slept. Now it’s a morgue with good Wi-Fi.

If you want to see it for yourself, you don’t need me.

Pick a site. Any site.
Pop the hood.
Hit View Source.
And tell me you don’t smell the crime.

I used to think these were murders. Something deliberate. A hand on the keyboard, a choice that killed a page.

But after a while, you see the pattern. It isn’t malice – it’s neglect. A slow rot that seeps in through every sprint, every deadline, every “good enough for now.”

Nobody’s swinging a hammer. They’re just walking away from the cracks.

Whole neighbourhoods of the web are crumbling like that – bright from a distance, hollow underneath.

Teams push updates, patch bugs, add new dependencies, all to keep the lights on a little longer. Then they move on, leave the old code flickering in the dark.

We’ve normalised it. The sluggish loads, the broken forms, the half-working interactions – background noise in a city that forgot what silence sounds like.

Ask around, and nobody even calls it decay anymore. They call it iteration.

That’s the trick of it. The crime isn’t that the web is dying. It’s that we’ve stopped treating it like something alive.

So when the next call comes in – another homepage gone cold, another app bleeding users – I’ll pour another coffee and head back out.

Because that’s what you do in this line of work.

You don’t solve the case.

You just keep showing up to document the decline.


  •  

Marketing against the machine immune system

Marketing has always been about the art of misdirection. We take something ordinary, incomplete, or even broken, and we wrap it in story. We build the impression of value, of inevitability, of trustworthiness. The surface gleams, even if the foundations are cracked.

And for decades, that worked – not because the products or experiences were always good, but because the audience was human.

Humans are persuadable. We’re distractible, emotional, and inconsistent. We’ll forgive a slow checkout if the branding feels credible. We’ll look past broken links if the discount seems tempting. We’ll excuse an awkward interface if the ad campaign made us laugh. Marketing thrived in those gaps – in the space between what something is, and how it can be made to feel.

But the audience is shifting.

Increasingly, it isn’t people at the front line of discovery or decision-making. It’s machines. Search engines, recommenders, shopping agents, IoT devices, and large language models. These systems decide which products we see, which services we compare, and which sources we trust. In many cases, they carry the process to completion – making the recommendation, completing the transaction, providing the answer – before a human ever gets involved.

And unlike people, machines don’t shrug and move on when something’s off. Every flaw – a slow page, a misleading data point, a broken flow, a clumsy design choice – gets logged. They remember. Relentlessly. At scale. And at scale, those memories aren’t inert. They accumulate. They shape behaviour. And they may be the difference between being surfaced and never being recommended at all.

How machines remember

Machines log everything. Or, more precisely, they log everything that matters to them.

Every interaction, every transaction, every request leaves a trace somewhere. We know this because it already happens.

  • Web crawlers track status codes, file sizes, and response times.
  • Browsers feed back anonymised performance metrics.
  • Payment processors log retries, declines, and timeouts.
  • IoT devices record whether an API responded in time, or not at all.

And as more of our experiences flow through agents and automation, it’s reasonable to expect the same habits to spread. Checkout assistants, shopping bots, recommendation engines, voice systems – all of them are under pressure to learn from what happens when they interact with the world. Logging is the cheapest, most reliable way to do that.

At small scale, a log is just a line in a file. One record among billions. But as those records accumulate, patterns emerge.

  • A single timeout might be a blip.
  • A thousand timeouts look like unreliability.
  • One contradictory data point is noise.
  • A hundred is evidence of inconsistency.

Logs turn a one-off interaction into something that can be measured, compared, and acted on later.

The challenge is scale. Billions of requests create billions of logs. Too many to store forever. Too noisy to read line by line. Too expensive to process directly.

So machines compress. They smooth detail into summary, so they can carry forward what matters without drowning in history.

  • Latency collapses into an average response time.
  • Error codes collapse into a failure rate.
  • Conflicting inputs collapse into an “inconsistent source” label.
  • Human behaviour collapses into a “low engagement” score.

This isn’t emotional judgment. It’s economics. Without compression, systems choke. With it, they remember – not every detail, but the distilled conclusion of how you performed.
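Mechanically, that distillation is mundane. Here’s a rough sketch, in Python, of how raw traces roll up into a conclusion – the event fields and values are made up for illustration, not drawn from any real pipeline:

  # Hypothetical raw traces from one site, collapsed into a durable summary.
  from statistics import mean

  events = [
      {"latency_ms": 180, "status": 200},
      {"latency_ms": 2400, "status": 504},
      {"latency_ms": 210, "status": 200},
      {"latency_ms": 2600, "status": 504},
  ]

  summary = {
      "avg_latency_ms": round(mean(e["latency_ms"] for e in events)),          # 1348
      "failure_rate": sum(e["status"] >= 500 for e in events) / len(events),   # 0.5
  }
  # The raw events can be discarded; only the conclusion needs to travel forward.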

And once those conclusions exist, they don’t just sit there. They begin to shape how the system treats you the next time it encounters you.

Why machines share their memories

A single system’s memory is useful, but limited. A crawler may learn that your pages are slow, a shopping assistant may discover your checkout fails too often, a voice agent may notice that your hours don’t match reality. Each of these insights has value, but only to the system that recorded it. Everyone else has to pay the cost of re-learning the same lesson.

That duplication is wasteful. And these systems are already under pressure. Crawlers operate under strict compute budgets. Assistants live or die by their credibility with users. Models are insatiable, always hungry for feedback to refine their outputs. For all of them, repeating the same experiments, burning through cycles, and rediscovering the same flaws is an inefficiency they can’t really afford.

So the logic points toward sharing. Not because machines are altruistic, but because cooperation is cheaper than duplication.

We’ve already seen this play out elsewhere. Browsers exchange phishing and malware lists. Banks and payment processors pool fraud fingerprints. Email providers collaborate on spam signatures. Competitors in name, but collaborators in practice – because the cost of facing the same threat in isolation is higher than the cost of sharing intelligence.

There’s every reason to believe that agents, crawlers, and models will follow the same path. Maybe not by Google handing logs to Bing, but through other channels. Third-party clearinghouses. Shared standards and APIs. Training data that encodes the lessons learned elsewhere. However it happens, the effect is the same: what one system concludes about you rarely stays contained for long.

And that’s the part most businesses miss. A slow site, a broken endpoint, a misleading schema – these aren’t just local problems. They have a way of leaking, of spreading, of becoming the version of you that other systems inherit. Your flaws don’t just live where they happened; they circulate. And once they do, they start to shape how the network as a whole decides whether to trust you.

The machine immune system in action

Individually, logs are just traces. Summaries are just shorthand. Sharing is just efficiency. But together, they start to behave like something else.

When patterns are pooled and reinforced across systems, they stop being isolated judgments and begin to act like collective reflexes. What one crawler has concluded about your site’s reliability can quietly shape how other crawlers treat you. What one assistant has flagged as inconsistent data becomes a caution others inherit. Over time, these aren’t just scattered memories; they’re shared responses.

That’s the moment the metaphor shifts.

Because what we’re describing looks less like bookkeeping and more like biology. An immune system doesn’t need perfect recall of every infection or injury. It doesn’t replay the blow-by-blow of each encounter. Instead, it compresses experience into signatures – antibodies – and carries them forward. The next time it encounters a threat, it doesn’t hesitate; it recognises, and it responds.

Machines are beginning to behave the same way.

  • A pattern of timeouts doesn’t just sit in a log; it becomes a reason to crawl you less often.
  • A series of checkout failures doesn’t simply vanish once fixed; it lingers as a reason not to recommend you.
  • A mismatch between your published hours and reality doesn’t just frustrate one user; it creates a systemic doubt about your trustworthiness.

What begins as logging and compression hardens into scar tissue. That scar tissue travels, circulates, and shapes future behaviour. And once it does, the whole ecosystem starts to defend itself – not out of malice, but as a form of optimisation.

This is what emerges when memories are pooled and reinforced: a machine immune system, quietly learning to protect itself from waste, failure, and mistrust.

The flywheel of forgetting

Immune systems don’t just respond; they remember. And that memory is sticky.

For humans, reputations fade. A bad review is eventually buried. A clumsy campaign is forgotten once the next news cycle rolls on. Machines don’t work that way. Once a judgment has been compressed into a summary and circulated, it becomes a durable part of the record.

That durability creates a trap.

  • Fix a slow site, and the raw logs might show you’ve improved – but if those logs have already been smoothed into a shorthand like “site is unreliable”, the shorthand survives.
  • Correct your schema or your store hours, but if assistants have already shared the judgment “data is inconsistent”, that label will continue to shape how they treat you.

And because fewer systems will bother to revisit you, your opportunities to generate fresh, positive signals shrink. You’re starved of the very evidence you’d need to overturn the old diagnosis.

This is the flywheel:

  • Weaknesses are recorded.
  • Records are compressed into durable summaries.
  • Summaries spread across systems.
  • Spreading reduces your chances of rewriting the story.
  • Reduced chances keep the old judgment alive.

It’s not malice. It’s mechanics. In a network optimised for efficiency and trust, bad memories are easier to keep than to re-evaluate.

The result is a form of structural stickiness: once you’ve been marked down, recovery isn’t just difficult – it’s asymmetrical. The effort required to dig yourself out is many times greater than the effort it took to fall in.

Marketing without misdirection

The tricks that once defined marketing are failing. For decades, you could plaster over weak products, fragile services, or clunky experiences with the right story. Humans could be persuaded. Machines cannot.

That doesn’t just make misdirection harder; it makes it irrelevant. In a machine-mediated ecosystem, every flaw leaves a trace, every failure persists, and every judgment spreads. The immune system doesn’t care what you meant to say. It only cares how you actually performed.

So what takes marketing’s place?

Let’s call it Agent Relations.

If the old discipline was about shaping human perception, the new one is about shaping machine memory. It means understanding how crawlers, recommenders, shopping bots, and language models record, compress, and share their experiences of you. It means designing products, pages, and processes that generate the right kinds of traces. It means maintaining the kind of technical integrity that resists being scarred in the first place.

That doesn’t sound like the marketing we’re used to. It sounds closer to operations, QA, or infrastructure. But in a landscape where machines are the gatekeepers of discovery and recommendation, this is marketing.

The story you tell still matters – but only if it survives contact with the evidence.

Living with machine immune systems

What we are building is bigger than search engines, shopping bots, or voice assistants. It’s an ecosystem that behaves like a body. Crawlers, recommenders, APIs, and models are its cells. Logs are its memories. Shared summaries are its antibodies. Scar tissue is its reputation.

And like any immune system, its priority isn’t your survival. It’s its own.

If the network decides you are a source of friction – too slow, too inconsistent, too misleading, too unreliable – it will defend itself the only way it knows how. It will avoid you. It will stop visiting your site, stop recommending your product, stop trusting your data. Not out of malice, but as a reflex.

For businesses, that means invisibility. For marketers, it means irrelevance.

The old reflex – to polish the story, distract the audience, misdirect their attention – has no traction here. Machines aren’t persuaded by narrative. They’re persuaded by experience.

That’s why the future of marketing isn’t storytelling at all. It’s engineering trust into the systems that machines depend on. It’s building processes, data, and experiences that resist scarring. It’s practising Agent Relations – ensuring that when machines remember you, what they remember is worth carrying forward.

Because in the age of machine immune systems, your brand isn’t what you say about yourself. It’s what survives in their memory.


  •  

If you want your blog to sell, stop selling

Most brand blogs aren’t bad by accident. They’re bad by design.

You can see the assembly line from a mile away: build a keyword list, sort by volume and “difficulty”, pick the ‘best’ intersections of topics, write the blandest takes, wedge in a CTA, hit publish. Repeat until everyone involved can point at a dashboard and say, “Look, productivity”.

That’s industrialised mediocrity: a process optimised to churn out content that looks like content, without ever risking being interesting.

It isn’t just blogs. Knowledge bases, resource hubs, “insights” pages – all the same sausage machine. They’re the cautious cousins of the blog, stripped of even the pretence of perspective. They offer even less opinion, less differentiation, and less reason for anyone (human or machine) to care.

And it doesn’t work. It doesn’t win attention, it doesn’t earn trust, it doesn’t get remembered. It just adds to the sludge. And worse, all of it comes at a cost. Strategy sessions, planning decks, content calendars, review cycles, sign-offs. Designers polishing graphics nobody will notice. Developers pushing pages nobody will read. Whole teams locked into a treadmill that produces nothing memorable, nothing differentiating, nothing anyone wants to share.

Sure – if you churn out enough of it, you might edge out the competition. A post that’s one percent better than the other drivel might scrape a ranking, pick up a few clicks. Large sites and big brands might even ‘drive’ thousands of visits. And on paper, that looks like success.

But here’s the trap: none of those visitors care. They don’t trust you, they don’t remember you, they don’t come back. They certainly don’t convert. Which is why the business ends up confused and angry: “Why are conversion rates so low on this traffic when our landing pages convert at a hundred times the rate?” Because it was never real demand. It was never built on trust or preference. It was a trap, and it was obviously a trap.

And because even shallow wins look like progress, teams double down. They start measuring content the way they measure ads: by clicks, conversions, and cost-per-acquisition. But that’s how you end up mistaking systemisation for strategy.

Because ads and content are not the same thing. Ads are designed to compel an immediate action. Content can lead to action, but it does so indirectly – by building trust, by earning salience, by being the thing people return to in the messy, wibbly-wobbly bit where they don’t know what they don’t know.

And the more you try to make content behave like an ad, the worse it performs – as content and as advertising. You strip out the qualities that make it engaging, and you fail to generate the conversions you were chasing in the first place.

So if you want your blog to sell, you must stop making it behave like a sales page with paragraphs. Stop optimising for the micro-conversion you can attribute tomorrow, and start optimising for the salience, trust, and experiences that actually move the market over time.

Nobody is proud of this work

The writers know they’re producing beige, generic copy. It isn’t fun to research, it isn’t satisfying to write, and it isn’t something you’d ever share with a friend. It’s just filling slots in a calendar.

Managers and stakeholders know it too. They see the hours lost to keyword analysis, briefs, design assets, endless review cycles – and the output still lands with a thud.

The executives look at the system and conclude that “content doesn’t work.” Which only reinforces the problem. Content doesn’t get taken seriously, budgets get cut, and the teams producing it feel even less motivated.

Worse, they see it as expensive. Lots of salaries, lots of meetings, lots of activity – and little return. So the logic goes: why not mechanise it? Why not let ChatGPT churn out “articles” for a fraction of the cost, and fire the writers whose work doesn’t convert anyway?

And so the spiral deepens. Expensive mediocrity gives way to cheap mediocrity. Filler content floods in at scale. The bar drops further. And the chance of producing anything meaningful, opinionated, or differentiated recedes even further into the background.

And the readership? Humans don’t engage with it. They bounce. Or worse, they skim a paragraph, recognise the shallow, vapid tone, and walk away with a little less trust in the brand. Machines don’t engage either. Search engines, recommendation systems, and AI agents are built to surface authority and usefulness. Beige filler doesn’t register; at best, it’s ignored, at worst, it drags the rest of your site down with it.

It’s a vicious circle. Content becomes a chore, not a craft. Nobody enjoys it, nobody champions it, nobody believes in it. And the people (and systems) it was meant to serve see it for what it is: mass-produced, risk-averse filler.

Why it persists anyway

If everyone hates the work and the results, why does the machine keep running?

Because it’s measurable.

Traffic numbers, click-through rates, assisted conversions – all of it shows up neatly on a dashboard. It creates the illusion of progress. And in organisations where budgets are defended with charts, that’s often enough.

So content gets judged against the same metrics as ads. If a landing page converts at 5%, then a blog post should surely convert at some fraction of that. If a campaign tracks cost-per-click, then surely content should too. This is how ad logic seeps into content strategy – until every blog post is treated like a sales unit with paragraphs wrapped around it.

The irony is that content’s real value is in the things you can’t attribute neatly: trust, salience, preference. But because those don’t plot cleanly on a graph, they’re sidelined. Dashboards win arguments, even if the numbers are meaningless.

And the blind spots are bigger than most teams admit. A 2% conversion rate gets celebrated as success, but nobody asks about the other 98%. Most of those experiences are probably neutral and inconsequential. But some are negative – and meaningfully so. The impact of those negative experiences compounds; it shows up in missing citations, hostile mentions, being excluded from reviews, or simply never being recommended.

That’s survivable when you can keep throwing infinite traffic at the funnel. But in an agentic world, where systems like ChatGPT are effectively “one user” deciding what gets surfaced, you don’t get a hundred chances. You get one. Fail to be the most useful, the most credible, the most compelling, and you’re filtered out.

Mediocrity isn’t just wasteful anymore. It’s actively dangerous.

You can’t have it both ways

This is where the sales logic creeps in. Someone in the room says, “Why not both? Be useful and generate sales. Add a CTA. Drop a promo paragraph. Make sure the content calendar lines up neatly with our product areas.”

That’s the point where the whole thing collapses. Because the moment the content is forced to sell, it stops being useful. It can’t be unbiased while also promoting the thing you happen to sell. It can’t be trusted while also upselling. It becomes cautious, compromised, grey.

And here’s the deeper problem: authentic, opinionated content doesn’t start from sales. It starts from a perspective – an idea, an experience, a frustration, a contrarian take. That’s what makes it readable, citeable, and memorable.

This is why Google keeps hammering on about E-E-A-T. The extra “E” – Experience – is their way of forcing the issue: they don’t want generic words; they want a lived perspective. Something that proves a human was here, who knows what they’re talking about, and who’s prepared to stand behind it.

Try to wrangle an opinion piece into a sales pitch, and you break it. Readers feel the gearshift. The tone becomes disingenuous, the bias becomes obvious, and the trust evaporates.

Flip it around and it’s just as bad. Try to start from a product pitch and expand it into an “opinion” piece, and you end up with something even worse: content that pretends to be independent thought, but is transparently an ad in prose form. Nobody buys it.

And ghostwriting doesn’t solve the problem. Slapping the CEO’s name and face on a cautious, committee-written post doesn’t magically make it human. Readers can tell when there’s no lived experience, no vulnerability, no genuine opinion. It’s still filler – just with a mask.

And if your articles map one-to-one with your service pages, they’re not blog posts at all. They’re brochures with paragraphs. Nobody shares them. Nobody cites them. Nobody trusts them.

The definitive answer to “How will this generate sales?” is: Not directly. Not today, not on the page. Its job is to build trust, salience, and preference – so that sales happen later, elsewhere, because you mattered.

Try to make content carry the sales quota, and you ruin both.

What success really looks like

If conversion rates and click-throughs aren’t the point, what is?

Success isn’t a form fill. It isn’t a demo request or a sale. It isn’t a 2% conversion rate on a thousand blog visits.

Success looks like discovery and salience. It looks like being the brand whose explainer gets bookmarked in a WhatsApp group. The one whose guide is quietly passed around an internal Slack channel. The article that gets cited on Wikipedia, or linked by a journalist writing tomorrow’s feature.

Success looks like becoming part of the messy middle. When people loop endlessly through doubt, reassurance, comparison, and procrastination, your content is the thing they keep stumbling across. Not because you trapped them with a CTA, but because you helped them.

It looks like being the name an analyst drops into a report, the voice invited onto a podcast, or the perspective that gets picked up in an interview. It looks like turning up where people actually make up their minds, not just where they click.

These are the real signals of salience – harder to track, but far more powerful than a trickle of gated downloads.

And here’s the thing: none of it happens if your “content” is just brand-approved filler. People don’t remember “the brand blog” – they remember perspectives, stories, and ideas worth repeating.

That doesn’t mean corporate or anonymous content can never work. It can – but there’s no quicker signal that a piece is going to be generic and forgettable than when the author is listed as “Admin” or simply the company name. If nobody is willing to stand behind it, why should anyone bother to read it?

A blog post is only a blog post if it carries the authentic, interesting opinion of a person (or, perhaps, system). Known or unknown, polished or raw, human or synthetic, what matters is that there’s a voice, a perspective, and a point of view. Otherwise, your blog is just an article repository. And in a world already drowning in corporate sludge, that’s no moat at all.

That means putting people in the loop. Authors with a voice. People with experience, perspective, humour, or even the willingness to be disagreeable. Industrialised mediocrity is safe, scalable, and forgettable. Authored content is risky, personable, and memorable. And only one of those has a future.

“But our competitors don’t do this”

They don’t. And that’s the point.

Most big companies favour systemisation over strategy. They’d rather be trackable than meaningful. They’d rather be safe than useful. They’d rather produce cautious filler that nobody hates, than take the risk of publishing something that someone might actually love.

And the way they get there is identical. They employ the same junior analysts, point them at the same keyword tools, and ask them to churn out the same “content calendars” and to-do lists. The result is inevitable: the same banal articles, repeated across every brand in the category.

That’s why their blogs are indistinguishable. It’s why their “insights” hubs blur into one another. It’s why nobody can remember a single thing they’ve ever said.

If you copy them, you inherit their mediocrity. If you differentiate, you have a chance to matter.

Stop selling to sell

Buying journeys aren’t linear. People loop endlessly through doubt, reassurance, procrastination, and comparison. They don’t need traps; they need help. If your blog is engineered like an ad, it can’t be there for them in those loops.

The irony is that the most commercially valuable content is often the least “optimised” for conversions. The ungated how-to guide that answers the question directly. The explainer that solves a problem outright instead of hiding the answer behind a form. The resource that doesn’t generate a lead on the page, but earns a hundred links, a thousand citations, and a permanent place in the conversation.

That’s what salience looks like. You see it in journalists’ citations, in podcast invitations, in analysts’ reports. Those are measurable signals, just not the ones dashboards were built for. They’re the breadcrumbs of authority and trust – the things that compound into sales over time.

And this isn’t just about blogs. The same applies to your “insights” hub, your knowledge base, your whitepapers. If it’s industrialised mediocrity, it won’t matter. If it’s authored, opinionated, and differentiated, it can.

So stop trying to make every page a conversion engine. Accept that ads and content are different things. Be useful, be generous, be memorable. The sales will follow – not because you forced them, but because you earned them.

Does this post stand up to scrutiny?

I see the irony. You’re reading this on a site with a sidebar and footer, trying to sell you my consultancy. Guilty as charged. But the advert is over there, being an advert. This post is over here, being a post. The problem isn’t advertising. The problem is when you blur the two, pretend your brochure is a blog, and end up with neither: not a real advert, not a real blog – just another forgettable blur in the sludge.

Maybe you’re wondering whether this post lives up to its own argument. Does it have a voice? Does it show experience, perspective, and opinion – or is it just another cleverly-worded filler piece designed to prop up a consultancy?

That’s exactly the tension. Authentic, opinionated writing is hard. It takes time, craft, vulnerability, and the risk of saying something someone might disagree with. It’s much easier to churn out safe words and tick the boxes.

And yes, here’s another irony: does it matter that I used ChatGPT to shortcut some of the labour-intensive parts of the writing and editing process? I don’t think so. Because what matters is that there’s still a human voice, perspective, and experience at the heart of it. The machine helped with polish; it didn’t supply the worldview.

That’s the line. Tools can support. Even systemisation can support. They can speed up editing, remove friction, and help distribute the work. But they can’t replace lived experience, a contrarian stance, or the willingness to risk saying something in your own voice. Strip those away, and you’re back in the land of industrialised mediocrity.


  •  

A complete guide to HTTP caching

Caching is the invisible backbone of the web. It’s what makes sites feel fast, reliable, and affordable to run. Done well, it slashes latency, reduces server load, and allows even fragile infrastructure to withstand sudden spikes in demand. Done poorly – or ignored entirely – it leaves websites slow, fragile, and expensive.


At its core, caching is about reducing unnecessary work. Every time a browser, CDN, or proxy has to ask your server for a resource that hasn’t changed, you’ve wasted time and bandwidth. Every time your server has to rebuild or re-serve identical content, you’ve added load and cost. Under heavy traffic – whether that’s Black Friday, a viral news story, or a DDoS attack – those mistakes compound until the whole stack buckles.

And yet, despite being so fundamental, caching is one of the most misunderstood aspects of web performance. Many developers:

  • Confuse no-cache with “don’t cache,” when it actually means “store, but revalidate”.
  • Reach for no-store as a “safe” default, unintentionally disabling caching entirely.
  • Misunderstand how Expires interacts with Cache-Control: max-age.
  • Fail to distinguish between public and private, leading to security or performance issues.
  • Ignore advanced directives like s-maxage or stale-while-revalidate.
  • Don’t realise that CDNs, browsers, proxies, and application caches all layer their own rules on top.

The result? Countless sites ship with fragile, inconsistent, or outright broken caching policies. They leave money on the table in infrastructure costs, frustrate users with sluggish performance, and collapse under load that better-configured systems would sail through.
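For quick reference, the three directives behind most of that confusion behave like this (the lifetime below is just an illustrative value):

  • Cache-Control: no-store → the response must not be written to any cache; every request goes back to the origin.
  • Cache-Control: no-cache → the response may be stored, but must be revalidated with the origin before each reuse.
  • Cache-Control: public, max-age=86400 → any cache, shared or private, may reuse the response for up to a day without revalidating.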

This guide exists to fix that. Over the next chapters, we’ll unpack the ecosystem of HTTP caching in detail:

  • How headers like Cache-Control, Expires, ETag, and Age actually work, alone and together.
  • How browsers, CDNs, and app-level caches interpret and enforce them.
  • The common pitfalls and misconceptions that can trip up even experienced developers.
  • Practical recipes for static assets, HTML documents, APIs, and more.
  • Modern browser behaviours, like BFCache, speculation rules, and signed exchanges.
  • CDN realities, with a deep dive into Cloudflare’s defaults, quirks, and advanced features.
  • How to debug and verify caching in the real world.

By the end, you’ll not only understand the nuanced interplay of HTTP caching headers – you’ll know how to design and deploy a caching strategy that makes your sites faster, cheaper, and more reliable.

The business case for caching

Caching matters because it directly impacts four fundamental outcomes of how a site performs and scales:

Speed

Caching eliminates unnecessary network trips. A memory-cache hit in the browser is effectively instant, compared to the 100–300ms you’d otherwise wait just to complete a handshake and see the first byte. Multiply that by dozens of assets and you get smoother page loads, better Core Web Vitals scores, and happier users.

Resilience

When demand surges, cache hits multiply capacity. If 80% of traffic is absorbed by a CDN edge, your servers only need to handle the other 20%. That’s the difference between sailing through Black Friday and collapsing under a viral traffic spike.

Cost

Every cache hit is one less expensive origin request. CDN bandwidth is cheap; uncached origin hits consume CPU, database queries, and outbound traffic that you pay for. A 5–10% improvement in cache hit ratio can translate directly into thousands of dollars saved at scale. And that’s before counting the requests that are answered from users’ browsers and never reach the CDN at all.
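To make that concrete, here’s a back-of-the-envelope sketch in Python – the traffic volume and per-request cost are invented assumptions, not benchmarks:

  # Hypothetical figures: 50M requests/month, $0.0005 blended origin cost per request
  # (compute + database queries + egress).
  requests_per_month = 50_000_000
  origin_cost_per_request = 0.0005

  def monthly_origin_cost(hit_ratio: float) -> float:
      # Only cache misses reach the origin and incur the full cost.
      return requests_per_month * (1 - hit_ratio) * origin_cost_per_request

  print(monthly_origin_cost(0.80))  # ~10M misses -> about $5,000/month
  print(monthly_origin_cost(0.90))  # ~5M misses  -> about $2,500/month

In this (made-up) scenario, nudging the hit ratio from 80% to 90% halves the origin bill.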

SEO

Caching improves both speed and efficiency for search engines. Bots are less aggressive when they see effective caching headers, conserving crawl budget for fresher and deeper content. Faster pages also feed directly into Google’s performance signals.

Real-world scenarios

  • A news site avoids a meltdown during a breaking story because 95% of requests are served from the CDN cache.
  • An API under sustained load continues to respond consistently thanks to stale-if-error and validator-based revalidation.
  • An e‑commerce platform handles Black Friday traffic smoothly because static assets and category pages are long-lived at the edge.

Side note on the philosophy of caching

It’s worth acknowledging that there’s a quiet anti-culture around caching. Some developers see it as a hack – a band-aid slapped over slow systems, masking deeper flaws in design or architecture. In an ideal world, every request would be cheap, every response instant, and caching wouldn’t even be needed. And there’s merit in that vision: designing systems to be inherently fast avoids the complexity and fragility that caching introduces.

In practice, most of us don’t live in that world. Real systems face unpredictable spikes, long geographic distances, and sudden swings in demand. Even the best-architected applications benefit from caching as an amplifier. The key is balance: caching should never excuse poor underlying performance, but it should always be part of how you scale and stay resilient when traffic surges.

Mental model: who caches what?

Before diving into the fine-grained details of headers and directives, it helps to understand the landscape of who is actually caching your content. Caching isn’t a single thing that happens in one place – it’s an ecosystem of layers, each with its own rules, scope, and quirks.

Browsers

Every browser maintains both a memory cache and a disk cache. The memory cache is extremely fast but short-lived – it only lasts while a page is open – and is designed to avoid redundant network fetches during a single session. It isn’t governed by HTTP caching headers: even resources marked no-store may be reused from memory if they’re requested again within the same page. The disk cache, by contrast, persists across tabs and sessions, can hold much larger resources, and does respect HTTP caching headers (though browsers may still apply their own heuristics when metadata is missing).

Proxies

Between the browser and the wider internet, requests often pass through proxies – especially in corporate environments or ISP-managed networks. These proxies can act as shared caches, storing responses to reduce bandwidth costs or to enforce organisational policies. Unlike CDNs, you usually don’t configure them yourself, and their behaviour may be opaque.

For example, a corporate proxy might cache software downloads to avoid repeated gigabyte transfers across the same office connection. An ISP might cache popular news images to improve load times for customers. The problem is that these proxies don’t always respect HTTP caching headers perfectly, and they may apply their own heuristics or overrides. That can lead to inconsistencies, like a user behind a proxy seeing a stale or stripped-down response long after it should have expired.

While less visible than browser or CDN caches, proxies are still an important part of the ecosystem. They remind us that caching isn’t always under the site owner’s direct control – and that intermediaries in the network can influence freshness, reuse, and even correctness.

Side note on transparent ISP proxies

In the early 2000s, many ISPs deployed “transparent” proxies that cached popular resources without users or site owners even knowing. They still crop up in some regions today. These proxies sit silently between the browser and the origin, caching opportunistically to save bandwidth. The downside is that they sometimes ignore cache headers entirely, serving outdated or inconsistent content. If you’ve ever seen a site behave differently at home vs on mobile data, a transparent proxy might have been the reason.

Shared caches

Between users and origin servers sit a host of shared caches – CDNs like Cloudflare or Akamai, ISP-level proxies, corporate gateways, or reverse proxies. These shared layers can dramatically reduce origin load, but they come with their own logic and sometimes override or reinterpret origin instructions.

Reverse proxies

Technologies like Varnish or NGINX can act as local accelerators in front of your application servers. They intercept and cache responses close to the origin, smoothing traffic spikes and offloading heavy lifting from your app or database.

Application and database caches

Inside your stack, systems like Redis or Memcached store fragments of rendered pages, precomputed query results, or sessions. They aren’t governed by HTTP headers – you design the keys and TTLs yourself – but they are crucial parts of the caching ecosystem.

Cache keys and variants

Every cache needs a way to decide whether two requests are “the same thing” or not. That decision is made using a cache key – essentially, the unique identifier for a stored response.

By default, a cache key is based on the scheme, host, path, and query string of the requested resource. But in practice, browsers add more dimensions. Most implement double-keyed caching, where the top-level browsing context (the site you’re on) is also part of the key. That’s why your browser can’t reuse a Google Font downloaded while visiting one site when another, unrelated site requests the same font file – each gets its own cache entry, even though the URL is identical.

Modern browsers are moving towards triple-keyed caching, which adds subframe context into the key as well. This means a resource requested inside an embedded iframe may have its own cache entry, separate from the same resource requested by the top-level page or by another iframe. This design improves privacy (by limiting cross-site tracking via shared cache entries), but it also reduces opportunities for cache reuse.

On top of that, HTTP adds another layer of complexity: the Vary header. This tells caches that certain request headers should also be part of the cache key.

Examples:

  • Vary: Accept-Encoding → store one copy compressed with gzip, another with brotli.
  • Vary: Accept-Language → store separate versions for en-US vs de-DE.
  • Vary: Cookie → every unique cookie value creates a separate cache entry (often catastrophic).
  • Vary: * → means “you can’t safely reuse this for anyone else,” which effectively kills cacheability.

This is powerful, and sometimes essential. If your server switches image formats based on Accept headers, or serves AVIF to browsers that support it, you must use Vary: Accept to avoid sending incompatible responses to clients that can’t handle them. At the same time, Vary is easy to misuse. Carelessly adding Vary: User-Agent, Vary: Cookie, or Vary: * can explode your cache into thousands of near-duplicate entries. The key is to vary only on headers that genuinely change the response – nothing more.

That’s where normalisation comes in. Smart CDNs and proxies can simplify cache keys, collapsing away differences that don’t matter. For example:

  • Ignoring analytics query parameters (e.g., ?utm_source=...).
  • Treating all iPhones as the same “mobile” variant, instead of keying on every device string.

The balance is to vary only on things that truly change the response. Anything else is wasted fragmentation and lower hit ratios.

Side note on No-Vary-Search

A new experimental header, No-Vary-Search, lets servers tell caches to ignore certain query parameters when deciding cache keys. For example, you could treat ?utm_source= or ?fbclid= as irrelevant and avoid fragmenting your cache into thousands of variants. At the moment, support is limited – Chrome only uses it with speculation rules – but if adopted more widely, it could offer a standards-based way to normalise cache keys without relying on CDN configuration.
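
For reference, the header is a structured field listing the parameters to ignore. The syntax is still experimental, so treat this as a sketch rather than a guarantee:

No-Vary-Search: params=("utm_source" "fbclid")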

Freshness vs validation

Knowing who is caching your content and how they decide whether two requests are the same only answers part of the question. The other part is when a stored response can be reused.

Every cache, whether it’s a browser or a CDN, has to decide:

  • Is this copy still fresh enough to serve as-is?
  • Or has it gone stale, and do I need to check with the origin?

That’s the core trade-off in caching: freshness (serve immediately, fast but risky if outdated) versus validation (double-check with the origin, slower but guaranteed correct).

All the headers we’ll explore next – Cache-Control, Expires, ETag, Last-Modified, and the rest – exist to guide this decision-making process.

Core HTTP caching headers

Now that we know who caches content and how they make basic decisions, it’s time to look at the raw materials: the headers that control caching. These are the levers you pull to influence every layer of the system – browsers, CDNs, proxies, and beyond.

At a high level, there are three categories:

  • Freshness controls: tell caches how long a response can be served without revalidation.
  • Validators: provide a way to check cheaply if something has changed.
  • Metadata: describe how the response should be stored, keyed, or observed.

Let’s break them down.

The Date header

Every response should carry a Date header. It’s the server’s timestamp for when the response was generated, and it’s the baseline for all freshness and age calculations. If Date is missing or skewed, caches will make their own assumptions.

The Cache-Control (response) header

This is the most important header – the control panel for how content should be cached. It carries multiple directives, split into two broad groups:

Freshness directives:

  • max-age: how long (in seconds) the response is fresh.
  • s-maxage: like max-age, but applies only to shared caches (e.g. CDNs). Overrides max-age there.
  • immutable: signals that the resource will never change (ideal for versioned static assets).
  • stale-while-revalidate: allows serving a stale response while fetching a fresh one in the background.
  • stale-if-error: allows serving stale content if the origin is down or errors.

Storage/use directives:

  • public: response may be stored by any cache, including shared ones.
  • private: response may be cached only by the browser, not shared caches.
  • no-cache: store, but revalidate before serving.
  • no-store: do not store at all.
  • must-revalidate: once stale, the response must be revalidated before use.
  • proxy-revalidate: same, but targeted at shared caches.
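
Put together, a response for publicly cacheable content might combine several of these (the values here are purely illustrative):

Cache-Control: public, max-age=3600, s-maxage=86400, stale-while-revalidate=60, stale-if-error=600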

The Cache-Control (request) header

Browsers and clients can also send caching directives. These don’t change the server’s headers, but they influence how caches along the way behave.

  • no-cache: forces revalidation (but allows use of stored entries).
  • no-store: bypasses caching entirely.
  • only-if-cached: only use a stored response; if none is available, fail (typically with a 504) rather than going to the network (useful offline).
  • max-age, min-fresh, max-stale: fine-tune tolerance for staleness.
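
For example, a normal reload in most browsers sends the document request with a directive like this, forcing caches along the path to revalidate rather than serve silently from storage:

GET /styles/app.css HTTP/1.1
Host: example.com
Cache-Control: max-age=0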

The Expires header

An older way of defining freshness, based on providing an absolute date/timestamp.

  • Example: Expires: Fri, 29 Aug 2025 12:00:00 GMT.
  • Ignored if Cache-Control: max-age is present.
  • Vulnerable to clock skew between servers and clients.
  • Still widely seen, often for backwards compatibility.

The Pragma header

The Pragma header dates back to HTTP/1.0 and was used on requests to prevent caching before Cache-Control existed, by asking intermediaries to revalidate content before reuse. Modern browsers and CDNs rely on Cache-Control, but some intermediaries and older systems still respect Pragma. In theory, it could carry arbitrary name/value pairs; in practice, only one ever mattered: Pragma: no-cache.

For maximum compatibility – especially when dealing with mixed or legacy infrastructure – it’s harmless to send Pragma: no-cache alongside the equivalent Cache-Control directives.

The Age header

Age tells you how old the response is (in seconds) when delivered. It’s supposed to be set by shared caches, but not every intermediary implements it consistently. Browsers never set it. Treat it as a helpful signal, not an absolute truth.

Side note on Age

You’ll only ever see Age headers from shared caches like CDNs or proxies. Why? Because browsers don’t expose their internal cache state to the network – they just serve responses directly to the user. Shared caches, on the other hand, need to communicate freshness downstream (to other proxies, or to browsers), so they add Age. That’s why you’ll often see Age: 0 on a fresh CDN hit, but never on a pure browser cache hit.

Validator headers: ETag and Last-Modified

When freshness runs out, caches use validators to avoid re-downloading the whole resource.

  • ETag: a unique identifier (opaque string) for a specific version of a resource.
    • Strong ETags ("abc123") mean byte-for-byte identical.
    • Weak ETags (W/"abc123") mean semantically the same, though bytes may differ (e.g. re-gzipped).
  • Last-Modified: timestamp of when the resource last changed.
    • Less precise, but still useful.
    • Supports heuristic freshness when max-age/Expires are missing.
  • Conditional requests:
    • If-None-Match (with ETag) → server replies 304 Not Modified if unchanged.
    • If-Modified-Since (with Last-Modified) → same, but based on date.
    • Both save bandwidth and reduce load, because only headers are exchanged.
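
In practice, the exchange looks something like this (values illustrative):

GET /articles/http-caching HTTP/1.1
Host: example.com
If-None-Match: "abc123"

HTTP/1.1 304 Not Modified
ETag: "abc123"
Cache-Control: max-age=300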

Side note on strong vs weak ETags

An ETag is an identifier for a specific version of a resource. A strong ETag ("abc123") means byte-for-byte identical – if even a single bit changes (like whitespace), the ETag must change. A weak ETag (W/"abc123") means “semantically the same” – the content may differ in trivial ways (e.g. compressed differently, reordered attributes) but is still valid to reuse.

Strong ETags give more precision, but can cause cache misses if your infrastructure (say, different servers behind a load balancer) generates slightly different outputs. Weak ETags are more forgiving, but less strict. Both work with conditional requests – the choice is about balancing precision vs practicality.

Side note on ETags vs Cache-Control headers

Cache-Control directives are processed before the ETag. If they determine that a resource is stale, the cache uses the ETag (or Last-Modified) to revalidate with the origin. Think of it this way:

While fresh: the cache serves the copy immediately, no validation.
When stale: the cache sends If-None-Match: "etag-value".

If the origin replies 304 Not Modified, the cache can keep using the stored copy without re-downloading the whole thing. Without Cache-Control, the ETag may be used for heuristic freshness or unconditional revalidation – but that usually means more frequent trips back to the origin. The two are designed to work together: Cache-Control sets the lifetime, ETags handle the check-ups.

The Vary header

The Vary header tells caches which request headers should be factored into the cache key. It’s what allows a single URL to have multiple valid cached variants. For example, if a server responds with Vary: Accept-Encoding, the cache will store one copy compressed with gzip and another compressed with brotli. Each encoding is treated as a distinct object, and the right one is chosen depending on the next request.

This flexibility is powerful, but also easy to misuse. Setting Vary: * is effectively the same as saying “this response can never be reused safely for anyone else”, which makes it uncacheable in shared caches. Similarly, Vary: Cookie is notorious for destroying hit rates, because every unique cookie value creates a separate cache entry.

The best approach is to keep Vary minimal and intentional. Only vary on headers that truly change the response in a meaningful way. Anything else just fragments your cache, lowers efficiency, and adds unnecessary complexity.

Observability helpers

Modern caches don’t just make decisions silently – they often add their own debugging headers to help you understand what happened. The most important of these is Cache-Status, a new standard that reports whether a response was a HIT or a MISS, how long it sat in cache, and sometimes even why it was revalidated. Many CDNs and proxies also use the older X-Cache header for the same purpose, typically showing a simple HIT or MISS flag. Cloudflare goes a step further with its cf-cache-status header, which distinguishes between HIT, MISS, EXPIRED, BYPASS and DYNAMIC (and other values).

These headers are invaluable when tuning or debugging, because they reveal the cache’s own decision-making rather than just echoing your origin’s intent. A response might look cacheable on paper, but if you see a steady stream of MISS or DYNAMIC, it probably means that the intermediary isn’t following your headers the way you expect.
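
In the wild, those signals look something like this, depending on the provider (illustrative values):

X-Cache: HIT
cf-cache-status: HIT
Cache-Status: ExampleCDN; hit; ttl=300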

Freshness & age calculations

Once you understand who caches content and which headers control their behaviour, the next step is to see how those pieces come together in practice. Every cache – whether it’s a browser, a CDN, or a reverse proxy – follows the same logic:

  1. Work out how long the response should be considered fresh.
  2. Work out how old the response currently is.
  3. Compare the two, and decide whether to serve, revalidate, or fetch anew.

This is the hidden math that drives every “cache hit” or “cache miss” you’ll ever see.

Freshness lifetime

The freshness lifetime tells a cache how long it can serve a response without re-checking with the origin. To work that out for a given request, caches look for the following HTTP response headers in a strict order of precedence:

  1. Cache-Control: max-age (or s-maxage) → overrides everything else.
  2. Expires → an absolute date, used only if max-age is absent.
  3. Heuristic freshness → if neither of those directives is present, caches guess.

Example 1: max-age

Date: Fri, 29 Aug 2025 12:00:00 GMT
Cache-Control: max-age=300

Here, the server explicitly tells caches, “This response is good for 300 seconds after the Date.” That means the response can be considered fresh until 12:05:00 GMT. After that, it becomes stale unless revalidated.

Example 2: Expires

Date: Fri, 29 Aug 2025 12:00:00 GMT
Expires: Fri, 29 Aug 2025 12:10:00 GMT

There’s no max-age, but Expires provides an absolute cutoff. Caches compare the Date (12:00:00) with the Expires time (12:10:00). That’s a 10-minute freshness window: the response is fresh until 12:10:00, then stale.

Example 3: Heuristic

Date: Fri, 29 Aug 2025 12:00:00 GMT
Last-Modified: Thu, 28 Aug 2025 12:00:00 GMT

With no max-age or Expires, caches fall back to heuristics. Browsers vary in their approach; Chrome uses 10% of the time since the last modification. Here, the resource was last modified 24 hours ago, so the response is treated as fresh for 2.4 hours (until about 14:24:00 GMT), after which revalidation kicks in.

Current age

The current age is the cache’s estimate of how old the response is right now. The spec gives a formula, but we can break it into steps:

  • Apparent age = now – Date (if positive).
  • Corrected age = max(Apparent age, Age header).
  • Resident time = how long it’s been sitting in the cache.
  • Current age = Corrected age + Resident time.

Example 4: Simple case

Date: Fri, 29 Aug 2025 12:00:00 GMT
Cache-Control: max-age=60

The response was generated at 12:00:00 and reached the cache at 12:00:05, so it already appeared to be 5 seconds old when it arrived. With no Age header present, the cache then held onto it for another 15 seconds, making the total current age 20 seconds. Since the response had a max-age of 60 seconds, it was still considered fresh.

Example 5: With Age header

Date: Fri, 29 Aug 2025 12:00:00 GMT
Age: 30
Cache-Control: max-age=60

The origin sends a response stamped with Date: 12:00:00 and also includes Age: 30, meaning some upstream cache already held it for 30 seconds. When a downstream cache receives it at 12:00:40, it looks 40 seconds old. The cache takes the higher of the two (40 vs 30) and then adds the 20 seconds it sits locally until 12:01:00. That makes the total current age 60 seconds – exactly matching the max-age=60 limit. At that point, the response is no longer fresh and must be revalidated.

Decision tree

Once a cache knows both numbers:

  • If current age < freshness lifetime → Serve immediately (fresh hit).
  • If current age ≥ freshness lifetime
    • If stale-while-revalidate → Serve stale now, revalidate it in the background.
    • If stale-if-error and origin is failing → Serve stale.
    • Else → Revalidate with origin (conditional GET/HEAD).

Example 6: stale-while-revalidate

Cache-Control: max-age=60, stale-while-revalidate=30

A response has Cache-Control: max-age=60, stale-while-revalidate=30. At 12:01:10, the cache’s copy is 70 seconds old – 10 seconds beyond its freshness window. Normally, that would require a revalidation before serving, but stale-while-revalidate allows the cache to serve the stale copy instantly as long as it revalidates in the background. Because the copy is only 10 seconds into its 30-second stale allowance, the cache can safely serve it while updating in parallel.

Example 7: stale-if-error

Cache-Control: max-age=60, stale-if-error=600

Another response has Cache-Control: max-age=60, stale-if-error=600. At 12:02:00, the copy is 120 seconds old – well past its 60-second freshness lifetime. The cache tries to fetch a fresh version, but the origin returns a 500 error. Thanks to stale-if-error, the cache is allowed to fall back to its stale copy for up to 600 seconds while the origin remains unavailable, ensuring the user still gets a response.

Why this matters

Understanding the math explains a lot of “weird” behaviour:

  • A resource expiring “too soon” may be down to a short max-age or a non-zero Age header.
  • A response that looks stale but is served anyway may be covered by stale-while-revalidate or stale-if-error.
  • A 304 Not Modified doesn’t mean caching failed – it means the cache correctly revalidated and saved bandwidth.

Caches aren’t mysterious black boxes. They’re just running these calculations thousands of times per second, across millions of resources. Once you know the math, the behaviour becomes predictable – and controllable. But in practice, developers often trip over subtle defaults and misleading directive names. Let’s tackle those misconceptions head-on.

Common misconceptions & gotchas

Even experienced developers misconfigure caching all the time. The directives are subtle, the defaults are quirky, and the interactions are easy to misunderstand. Here are some of the most common traps.

no-cache ≠ “don’t cache”

The name is misleading. no-cache actually means “store this, but revalidate before reusing it.” Browsers and CDNs will happily keep a copy, but they’ll always check back with the origin before serving it. If you truly don’t want anything stored, you need no-store.

no-store means nothing is kept

no-store is the nuclear option. It instructs every cache – browser, proxy, CDN – not to keep a copy at all. Every request goes straight to the origin. This is correct for highly sensitive data (e.g. banking), but overkill for most use cases. Many sites use it reflexively, throwing away huge performance gains.

max-age=0 vs must-revalidate

They seem similar, but aren’t the same. max-age=0 means “this response is immediately stale”. Without must-revalidate, caches are technically allowed to reuse it briefly under some conditions (e.g. if the origin is temporarily unavailable). Adding must-revalidate removes that leeway, forcing caches to always check with the origin once freshness has expired.

s-maxage vs max-age

max-age applies everywhere – browsers and shared caches alike. s-maxage only applies to shared caches like CDNs or proxies, and it overrides max-age there. This lets you set a short freshness window for browsers (say, max-age=60) but a longer one at the CDN (s-maxage=600). Many developers don’t realise s-maxage even exists.

immutable misuse

immutable tells browsers “this resource will never change, don’t bother revalidating it”. That’s fantastic for fingerprinted assets (like app.9f2d1.js) that are versioned by filename. But it’s dangerous for HTML or any resource that might change under the same URL. Use it on the wrong thing, and you’ll lock users into stale content for months.

Redirect and error caching

Caches can and do store redirects and even error responses. A 301 is cacheable by default (often permanently). Even a 404 or 500 may be cached briefly, depending on headers and CDN settings. Developers are often surprised when “temporary” outages linger because an error response was cached.

Clock skew and heuristic surprises

Caches compare Date, Expires, and Age headers to decide freshness. If clocks aren’t perfectly in sync, or if no explicit headers are present, caches fall back to heuristics. That can lead to surprising expiry behaviour. Explicit freshness directives are always safer.

Cache fragmentation: devices & geography

Caching is simple when one URL maps to one response. It gets tricky when responses vary by device or region.

  • Device splits: Sites often serve different HTML or JS for desktop vs mobile. If keyed on User-Agent, every browser/version combination becomes a separate cache entry; the result is that cache hit rates collapse. Safer options include normalising User-Agents at the CDN, or using Client Hints (Sec-CH-UA, DPR) with controlled Vary headers.
  • Geo splits: Serving different content by region (e.g. India vs Germany) often uses Accept-Language or GeoIP rules. But every language combination (en, en-US, en-GB) creates a new cache key. Unless you normalise by region/ruleset, your cache fragments into thousands of variants.

The trade-off is clear: more personalisation usually means less caching efficiency. Once the traps are clear, we can move from theory to practice. Here are the caching “recipes” you’ll use for different content types.

Patterns & recipes

Now that we’ve covered the mechanics and the common pitfalls, let’s look at how to put caching into practice. These are the patterns you’ll reach for again and again, adapted for different kinds of content.

Static assets (JS, CSS, fonts)

Goal: Serve instantly, never revalidate, safe to cache for a very long time.

Typical headers:

Cache-Control: public, max-age=31536000, immutable

Why:

  • Fingerprinted filenames (app.9f2d1.js) guarantee uniqueness, so old versions can stay cached forever.
  • Long max-age means they never expire in practice.
  • immutable stops browsers from wasting time revalidating.
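
At the server level this is a one-line policy. A minimal nginx sketch, assuming your build writes fingerprinted files into an /assets/ directory (the path and placement are illustrative):

location /assets/ {
    # Fingerprinted filenames never change, so a year-long, immutable lifetime is safe
    add_header Cache-Control "public, max-age=31536000, immutable";
}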

HTML documents

The right TTL depends on how often your HTML changes and how quickly changes must appear. Use one of these profiles, and pair long edge TTLs with event-driven purge on publish/update.

Profile A – High-change (news/homepages):

Cache-Control: public, max-age=60, s-maxage=300, stale-while-revalidate=60, stale-if-error=600
ETag: "abc123"

Rationale: keep browsers very fresh (1m), let the CDN cushion load for 5m, serve briefly stale while revalidating for snappy UX, and survive origin wobbles.

Profile B – Low-change (blogs/docs):

Cache-Control: public, max-age=300, s-maxage=86400, stale-while-revalidate=300, stale-if-error=3600
ETag: "abc123"

Rationale: browsers can reuse for a few minutes; CDN can hold for a day to slash origin traffic. On publish/edit, purge the page (and related pages) so changes appear immediately.

Logged-in / personalised pages:

Cache-Control: private, no-cache
ETag: "abc123"

Rationale: allow browser storage but force revalidation every time; never share at the CDN.

Side note on why long HTML TTLs are safe with event-driven purging

You can run very long CDN cache expiration times (hours, even days) for HTML as long as you actively bust the cache on important events: publish, update, unpublish. Use CDN features like Cache Tags / surrogate keys to purge collections (e.g., “post-123”, “author-jono”), and trigger purges from your CMS. This gives you the best of both worlds: instant updates when it matters, rock-solid performance the rest of the time.

If updates must appear within seconds with no manual purge → keep short CDN TTLs (≤5m) + stale-while-revalidate.

If updates are event-driven (publish/edit) → use long CDN TTLs (hours/days) + automatic purge by tag.

If content is personalised → don’t share (use private, no-cache + validators).

APIs

Goal: Balance freshness with performance and resilience.

Typical headers:

Cache-Control: public, s-maxage=30, stale-while-revalidate=30, stale-if-error=300
ETag: "def456"

Why:

  • Shared caches (CDNs) can serve results for 30s, reducing load.
  • stale-while-revalidate keeps latency low even as responses are refreshed.
  • stale-if-error ensures reliability if the backend fails.
  • Clients can revalidate cheaply with ETags.

Side note on why APIs use short s-maxage + stale-while-revalidate

APIs often serve data that changes frequently, but not every single second. A short s-maxage (e.g. 30s) lets shared caches like CDNs soak up most requests, while still ensuring data stays reasonably fresh.

Adding stale-while-revalidate smooths over the edges: even if the cache has to fetch a new copy, it can serve the slightly stale one instantly while revalidating in the background. That keeps latency low for users.

The combination gives you a sweet spot: low origin load, fast responses, and data that’s “fresh enough” for most real-world use cases.

Authenticated dashboards & user-specific pages

Goal: Prevent shared caching, but allow browser reuse.

Typical headers:

Cache-Control: private, no-cache
ETag: "ghi789"

Why:

  • private ensures only the end-user’s browser caches the response.
  • no-cache allows reuse, but forces validation first.
  • ETags prevent full downloads on every request.

Side note on the omission of max-age

For user-specific content, you can’t risk serving stale data. That’s why the recipe uses private, no-cache but leaves out max-age.

  • no-cache means the browser may keep a local copy, but must revalidate it with the origin before reusing it.
  • If you added max-age, you’d be telling the browser it’s safe to serve without checking – which could expose users to out-of-date account info or shopping carts.
  • Pairing no-cache with an ETag gives you the best of both worlds: safety (always validated) and efficiency (cheap 304 Not Modified responses instead of re-downloading everything).

Side note on security

When handling or presenting sensitive data, you may wish to use private, no-store instead, in order to prevent the browser from keeping a locally stored copy at all. This reduces the likelihood of leaks on devices shared by multiple users, for example.

Images & media

Goal: Cache efficiently across devices, while serving the right variant.

Typical headers:

Cache-Control: public, max-age=86400
Vary: Accept-Encoding, DPR, Width

Why:

  • A one-day freshness window balances speed with flexibility – images can change, but not as often as HTML.
  • Vary allows different versions to be cached for different devices or display densities.
  • CDNs can normalise query parameters (e.g. ignore utm_*) and collapse variants intelligently to avoid fragmentation.

Side note on client hints

Modern browsers send Client Hints like DPR (device pixel ratio) and Width (intended display width) when requesting images. If your server or CDN supports responsive images, it can generate and return different variants — e.g. a high-res version for a retina phone, a smaller one for a low-res laptop.

By including Vary: DPR, Width, you’re telling caches: “Store separate copies depending on these hints.” That ensures the right variant is reused for future requests with the same device characteristics.

The catch? Every new DPR or Width value creates a new cache key. If you don’t normalise (e.g. bucket widths into sensible breakpoints), your cache can fragment into hundreds of variants. CDNs often provide built-in rules to manage this.
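
As a sketch, the opt-in and the caching policy live on different responses – the document opts in to the hints, and the image responses key the cache on them (using the legacy DPR/Width hint names discussed above; exact support varies by browser and CDN):

On the HTML document:

Accept-CH: DPR, Width

On the image responses:

Cache-Control: public, max-age=86400
Vary: DPR, Width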

Beyond headers: browser behaviours

HTTP headers set the rules, but browsers have their own layers of optimisation that can look like “caching” – or interfere with it. These don’t follow the same rules as Cache-Control or ETag, and they often confuse developers when debugging.

Back/forward cache (BFCache)

  • What it is: A full-page snapshot (DOM, JS state, scroll position) kept in memory when a user navigates away.
  • Why it matters: Going “back” or “forward” feels instant because the browser restores the page without even hitting HTTP caches.
  • Gotchas: Many pages aren’t BFCache-eligible. The most common blockers are unload handlers, long-lived connections, or the use of certain browser APIs. Another subtle but important one is Cache-Control: no-store on the document itself – this tells the browser not to keep any copy around, which extends to BFCache. Chrome has recently carved out a small set of exceptions where no-store pages can still enter BFCache in safe cases, but for the most part, if you want BFCache eligibility, you should avoid no-store on documents.

Side note on BFCache vs HTTP Cache

BFCache is like pausing a tab and resuming it – the entire page state is frozen and restored. HTTP caching only stores network resources. A page might fail BFCache but still be quite fast thanks to HTTP cache hits (or vice versa).
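
You can detect BFCache restores from JavaScript via the pageshow event, which is a quick way to confirm whether your pages are actually being restored (a minimal sketch):

window.addEventListener('pageshow', (event) => {
  // event.persisted is true when the page was restored from the back/forward cache
  if (event.persisted) {
    console.log('Restored from BFCache');
  }
});

Chrome’s DevTools also offers a back/forward cache test under the Application panel, which lists the specific blockers for a page.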

Hard refresh vs soft reload

  • Soft reload (e.g. pressing the reload button): Browser will use cached responses if they’re still fresh. If stale, it revalidates.
  • Hard refresh (e.g. opening DevTools and right-clicking the reload button to choose a fuller reload, or ticking the “Disable cache” checkbox): Browser bypasses the cache, re-fetching all resources from the origin.
  • Gotcha: Users may think “refresh” always fetches new content – but unless it’s a hard refresh, caches still apply.

Speculation rules & link relations

Browsers provide developers with tools that let them (pre)load resources before the user requests them. These don’t change how caching works, but they can change what ends up in the cache ahead of time.

  • Prefetch: The browser may fetch resources speculatively and place them in cache, but only for a short window. If they’re not used quickly, they’ll be evicted.
  • Preload: Resources are fetched early and inserted into cache so they’re ready by the time the parser needs them.
  • Prerender: The entire page and its subresources are loaded and cached in advance. When a user navigates, it all comes straight from cache rather than the network.
  • Speculation rules API: Eviction, freshness, and validation usually follow the normal caching rules – but prerendering makes some exceptions. For example, Chrome may prerender a page even if it’s marked with Cache-Control: no-store or no-cache. In those cases, the prerendered copy lives in a temporary store that isn’t part of the standard HTTP cache and is discarded once the prerender session ends (though this behaviour may vary by browser).

The key takeaway: speculation rules are about cache timing, not cache policy. They front-load work so the cache is already warm, but freshness and expiry are still governed by your headers.
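
For reference, a minimal speculation rules block that prefetches a likely next page looks something like this (exact capabilities and defaults vary by browser):

<script type="speculationrules">
{
  "prefetch": [
    { "source": "list", "urls": ["/next-article/"] }
  ]
}
</script>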

Signed exchanges (SXG)

Signed exchanges don’t change cache mechanics either, but they do change who can serve cached content while keeping origin authenticity intact.

  • An SXG is a package of a response, plus a cryptographic signature from the origin.
  • Intermediaries (like Google Search) can store and serve that package from their own caches.
  • When the browser receives it, it can trust the content as if it came from your domain, while still applying your headers for freshness and validation.

The catch: SXGs have their own signature expiry in addition to your normal caching headers. Even if your Cache-Control allows reuse, the SXG may be discarded once its signature is out of date.

SXGs also support varying by cookie, which means they can package and serve different signed variants depending on cookie values. This enables personalised experiences to be cached and distributed via SXG, but it fragments the cache heavily – every cookie combination creates a new variant.

Key takeaway: SXG adds another clock (signature lifetime) and, if you use cookie variation, another source of cache fragmentation. Your headers still govern freshness, but these extra layers can shorten reuse windows and multiply cache entries.

CDNs in practice: Cloudflare

So far, we’ve looked at how browsers handle caching and the directives that control freshness and validation. But for most modern websites, the first and most important cache your traffic will hit isn’t the browser — it’s the CDN.

Cloudflare is one of the most widely used CDNs, fronting millions of sites. It’s a great example of how shared caches don’t just passively obey your headers. They add defaults, overrides, and proprietary features that can completely change how caching works in practice. Understanding these quirks is essential if you want your origin headers and your CDN behaviour to align.

Defaults and HTML caching

By default, Cloudflare doesn’t cache HTML at all. Static assets like CSS, JavaScript, and images are stored happily at the edge, but documents are always passed through to the origin unless you explicitly enable “Cache Everything.” That default catches many site owners out: they assume Cloudflare is shielding their servers, when in reality their most expensive requests – the HTML pages themselves – are still hitting the backend every time.

The temptation, then, is to flip the switch and enable “Cache Everything.” But this blunt tool applies indiscriminately, even to pages that vary by cookie or authentication state. In that scenario, Cloudflare can end up serving cached private dashboards or logged-in user data to the wrong people.

The safer pattern is more nuanced: bypass the cache when a session cookie is present, but cache aggressively when the user is anonymous. This approach ensures that public pages reap the benefits of edge caching, while private content is always fetched fresh from the origin.
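
In Cloudflare’s rules language, that usually means a rule whose expression matches the session cookie – something along these lines (the cookie name is platform-specific; wordpress_logged_in is just an example) – paired with a “Bypass cache” action, while a broader rule marks anonymous requests as eligible for caching:

http.cookie contains "wordpress_logged_in"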

Side note on Cloudflare’s APO addon

Cloudflare’s Automatic Platform Optimization (APO) addon integrates with WordPress websites, and rewrites caching behaviour so HTML can be cached safely while respecting logged-in cookies. It’s a good example of CDNs layering platform-specific heuristics on top of standard HTTP logic.

Edge vs browser lifetimes

Your origin headers – things like Cache-Control and Expires – define how long a browser should hold onto a resource. But CDNs like Cloudflare add another layer of control with their own settings, such as “Edge Cache TTL” and s-maxage. These apply only to what Cloudflare stores at its edge servers, and they can override whatever the origin says without changing how the browser behaves.

That separation is both powerful and confusing. From the browser’s perspective, you might see max-age=60 and assume the content is cached for just a minute. Meanwhile, Cloudflare could continue serving the same cached copy for ten minutes, because its edge cache TTL is set to 600 seconds. The result is a split reality: browsers refresh often, but Cloudflare still shields the origin from repeated requests.

Cache keys and fragmentation

Cloudflare uses the full URL as its cache key. That means every distinct query parameter – whether it’s a tracking token like ?utm_source=… or something trivial like ?v=123 – creates a separate cache entry. Left unchecked, this behaviour quickly fragments your cache into hundreds of near-identical variants, each one consuming space while reducing the hit rate.

It’s important to note that canonical URLs don’t help here. Cloudflare doesn’t care what your HTML declares as the “true” version of a page; it caches by the literal request URL it receives. To avoid fragmentation, you need to explicitly normalise or ignore unnecessary parameters in Cloudflare’s configuration, ensuring that trivial differences don’t splinter your cache.

Side note on normalising cache keys

Cloudflare lets you define which query parameters to ignore, or how to collapse variants. Stripping out analytics parameters, for example, can dramatically improve cache hit ratios.

Device and geography splits

Cloudflare also allows you to customise cache keys by including request headers, such as User-Agent or geo-based values. In theory, this enables fine-grained caching — one version of a page for mobile devices, another for desktop, or distinct versions for visitors in different countries.

But in practice, unless you normalise these inputs aggressively, it can explode into massive fragmentation. Caching by raw User-Agent means every browser and version string generates its own entry, instead of collapsing them into a simple “mobile vs desktop” split. The same problem arises with geographic rules: caching by full Accept-Language headers, for example, can create thousands of variants when only a handful of languages are truly necessary.

Done carefully, device and geography splits let you serve tailored content from cache. Done carelessly, they destroy your hit rate and multiply origin load.

Cache tags

Cloudflare also supports tagging cached objects with labels – for example, tagging every URL associated with a blog post with blog-post-123. These tags allow you to purge or revalidate whole groups of resources at once, rather than expiring them one by one.

For CMS-driven sites, this is a powerful tool: when an article is updated, the site can trigger a purge for its tag and instantly invalidate every related URL. But over-tagging – attaching too many labels to too many resources – is common, and can undermine efficiency and make purge operations slower or less predictable.
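
With Cloudflare, for example, a purge-by-tag call looks roughly like this (the zone ID and API token are placeholders, and tag-based purging is only available on certain plans):

curl -X POST "https://api.cloudflare.com/client/v4/zones/YOUR_ZONE_ID/purge_cache" \
  -H "Authorization: Bearer YOUR_API_TOKEN" \
  -H "Content-Type: application/json" \
  --data '{"tags":["blog-post-123"]}'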

Other caching layers in the stack

So far, we’ve focused on browser caches, HTTP directives, and CDNs like Cloudflare. But many sites add even more layers between the user and the origin. Reverse proxies, application caches, and database caches all play a role in what a “cached” response actually means.

These layers don’t always speak HTTP – Redis doesn’t care about Cache-Control, and Varnish can happily override your origin headers. But they still shape the user experience, infrastructure load, and the headaches of cache invalidation. To understand caching in the real world, you need to see how these pieces stack and interact.

Application & database caches

Inside the application tier, technologies like Redis and Memcached are often used to keep session data, fragments of rendered pages, or precomputed query results. An ecommerce site, for example, might cache its “Top 10 Products” list in Redis for sixty seconds, saving hundreds of database queries every time a page loads. This is fantastically efficient – until it isn’t.

One common failure mode is when the database updates, but the Redis key isn’t cleared at the right moment. In that case, the HTTP layer happily serves “fresh” pages that are already out of date, because they’re pulling from stale Redis data underneath.

The inverse problem happens just as often. Imagine the app has correctly refreshed Redis with a new product price, but the CDN or reverse proxy still has an HTML page cached with the old price. The origin told the outer cache that the page was valid for five minutes, so until the TTL runs out (or someone manually purges it), users continue seeing stale HTML – even though Redis already has the update.

In other words: sometimes HTTP looks fresh while Redis is stale, and sometimes Redis is fresh while HTTP caches are stale. Both failure modes stem from the same root issue – multiple caching layers, each with its own logic, falling out of sync.
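
To make the application-tier half of that concrete, here’s a minimal cache-aside sketch using node-redis (fetchTopProductsFromDb is a hypothetical database query; the key name and TTL are illustrative):

import { createClient } from 'redis';

const redis = createClient();
await redis.connect();

async function getTopProducts() {
  const cached = await redis.get('top-products');
  if (cached) return JSON.parse(cached);            // cache hit: no database work

  const products = await fetchTopProductsFromDb();  // hypothetical: the expensive query
  await redis.set('top-products', JSON.stringify(products), { EX: 60 }); // expire after 60 seconds
  return products;
}

// When the underlying data changes, delete the key so the next read repopulates it –
// and remember that any HTML cached further out (CDN, reverse proxy) needs purging too.
// await redis.del('top-products');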

Reverse proxy caches

One layer closer to the edge, reverse proxies like Varnish or NGINX often sit in front of the application servers, caching whole responses. In principle, they respect HTTP headers, but in practice, they’re usually configured to enforce their own rules. A Varnish configuration might, for example, force a five-minute lifetime on all HTML pages, regardless of what the origin headers say. That’s excellent for resilience during a traffic spike, but dangerous if the content is time-sensitive. Developers frequently run into this mismatch: they open DevTools, inspect the origin’s headers, and assume they know what’s happening – not realising that Varnish is rewriting the rules one hop earlier.

Service Workers

Service Workers add another cache layer inside the browser, sitting between the network and the page. Unlike the built-in HTTP cache, which just follows headers, the Service Worker Cache API is programmable. That means developers can intercept requests and decide – in JavaScript – whether to serve from cache, fetch from the network, or do something else entirely.

This is powerful: a Service Worker can precache assets during install, create custom caching strategies (stale-while-revalidate, network-first, cache-first), or even rewrite responses before handing them back to the page. It’s the foundation of Progressive Web Apps (PWAs) and offline support.
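
Here’s a minimal sketch of a stale-while-revalidate strategy in a Service Worker (simplified – a real implementation would also check the request method and response status before caching):

self.addEventListener('fetch', (event) => {
  event.respondWith(
    caches.open('pages-v1').then(async (cache) => {
      const cached = await cache.match(event.request);

      // Always refresh the cache in the background...
      const network = fetch(event.request).then((response) => {
        cache.put(event.request, response.clone());
        return response;
      });

      // ...but serve the cached copy immediately if we have one.
      return cached || network;
    })
  );
});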

But it comes with pitfalls. Because Service Workers can ignore origin headers and invent their own logic, they can drift out of sync with the HTTP caching layer. For example, you might set Cache-Control: max-age=60 on an API, but a Service Worker coded to “cache forever” will happily serve stale results long after they should have expired. Debugging gets trickier too: responses can look cacheable in DevTools but actually be served from a Service Worker’s script.

The key takeaway: Service Workers don’t replace HTTP caching – they stack on top of it. They give developers fine-grained control, but they also add another layer where things can go wrong if caching strategies conflict.

Layer interactions

The real complexity comes when all these layers interact. A single request might pass through the browser cache, then Cloudflare, then Varnish, and finally Redis. Each layer has its own rules about freshness and invalidation, and they don’t always line up neatly. You might purge the CDN and think you’ve fixed an issue, but the reverse proxy continues to serve its stale copy. Or you might flush Redis and repopulate the data, only to discover the CDN is still serving the “old” version it cached earlier. These kinds of mismatches are the root cause of many mysterious “cache bugs” that show up in production.

Debugging & verification

With so many caching layers in play – browsers, CDNs, reverse proxies, application stores – the hardest part of working with caching is often figuring out which cache served a response and why. Debugging caching isn’t about staring at a single header; it’s about tracing requests through the stack and verifying how each layer is behaving.

Inspecting headers

The first step is to look closely at the headers. Standard fields like Cache-Control, Age, ETag, Last-Modified, and Expires tell you what the origin intended. But they don’t tell you what the caches actually did. For that, you need the debugging signals added along the way:

  • Age shows how long a response has been sitting in a shared cache. If it’s 0, the response likely came straight from the origin. If it’s 300, you know a cache has been serving the same object for five minutes.
  • X-Cache (used by many proxies) or cf-cache-status (Cloudflare) show whether a cache hit or miss occurred.
  • Cache-Status is the emerging standard, adopted by CDNs like Fastly, which reports not just HIT/MISS but also why a decision was made.

Together, these headers form the breadcrumb trail that tells you where the response has been.

Using browser DevTools

The Network panel in Chrome or Firefox’s DevTools is essential for seeing cache behaviour from the user’s side. It shows whether a resource came from disk cache, memory cache, or over the network.

  • Memory cache hits are near-instant but short-lived, surviving only within the current tab/session.
  • Disk cache hits persist across sessions but may be evicted.
  • 304 Not Modified responses reveal that the browser revalidated the cached copy with the origin.

It’s also worth testing with different reload types. A normal reload (Ctrl+R) may use cached entries, while a hard reload (Ctrl+Shift+R) bypasses them entirely. Knowing which type of reload you’re performing avoids false assumptions about what the cache is doing.

CDN logs and headers

If you’re using a CDN, its logs and headers are often the most reliable source of truth. Cloudflare’s cf-cache-status, Akamai’s X-Cache, and Fastly’s Cache-Status headers all reveal edge decisions. Most providers also expose logs or dashboards where you can see hit/miss ratios and TTL behaviour at scale.

For example, if you see cf-cache-status: MISS or BYPASS on every request, it usually means Cloudflare isn’t storing your HTML at all – either because it’s following defaults (no HTML caching), or because a cookie is bypassing cache. Debugging at the edge often comes down to correlating what your origin sent, what the CDN says it did, and what the browser eventually received.

Reverse proxies and custom headers

Reverse proxies like Varnish or NGINX can be more opaque. Many deployments add custom headers like X-Cache: HIT or X-Cache: MISS to reveal proxy behaviour. If those aren’t available, logs are your fallback: Varnish’s varnishlog and NGINX’s access logs can both show whether a request was served from cache or passed through.

The tricky part is remembering that reverse proxies may override headers silently. If you see Cache-Control: no-cache from origin but a five-minute TTL in Varnish, the headers in DevTools won’t tell you the full story. You need the proxy’s own debugging signals to verify.

Following the request path

When in doubt, step through the request chain:

  1. Browser → Check DevTools: was it memory, disk, or network?
  2. CDN → Inspect cf-cache-status, Cache-Status, or X-Cache.
  3. Proxy → Look for custom headers or logs to confirm whether the request hit local cache.
  4. Application → See if Redis/Memcached served the data.
  5. Database → If all else fails, confirm the query ran.

Walking layer by layer helps isolate where the stale copy lives. It’s rarely the case that “the cache is broken.” More often, one cache is misaligned while the others are behaving perfectly.

Common debugging mistakes

There are a few traps developers fall into repeatedly:

  • Only looking at browser headers: These tell you what the origin intended, not what the CDN actually did.
  • Assuming 304 Not Modified means no caching: In fact, it means the cache did store the response and successfully revalidated it.
  • Forgetting about cookies: A stray cookie can make a CDN bypass cache entirely.
  • Testing with hard reloads: A hard reload bypasses the cache, so it doesn’t reflect normal user experience. The same is true if you enable the “Disable cache” tickbox in DevTools – that setting forces every request to skip caching entirely while DevTools is open. Both are useful for troubleshooting, but they give you an artificial view of performance that real users will never see.
  • Ignoring multi-layer conflicts: Purging the CDN but forgetting to clear Varnish, or clearing Redis but leaving a stale copy at the edge.

Good debugging is less about clever tricks and more about being systematic: check each layer, verify its decision, and compare against what you expect from the headers.

Caching in the AI-mediated web

Up to now, we’ve treated caching as a conversation between websites, browsers, and CDNs. But increasingly, the consumers of your site aren’t human users at all – they’re search engine crawlers, LLM training pipelines, and agentic assistants. These systems rely heavily on caching, and your headers can shape not just performance, but how your brand and content are represented in machine-mediated contexts.

Crawl & scrape efficiency

Search engines and scrapers rely on HTTP caching to avoid re-downloading the entire web every day. Misconfigured caching can make crawlers hammer your origin unnecessarily, or worse, cause them to give up on deeper pages if revalidation is too costly. Well-tuned headers keep crawl efficient and ensure that fresh updates are discovered quickly.

Training data freshness

LLMs and recommendation systems ingest web content at scale. If your resources are always marked no-store or no-cache, they may get re-fetched inconsistently, leading to patchy or outdated snapshots of your site in training corpora. Conversely, stable cache policies help ensure that what makes it into these models is consistent and representative.

Agentic consumption

In an AI-mediated web, agents may act on behalf of users – shopping bots, research assistants, travel planners. For these agents, speed and reliability are first-class signals. A site with poor caching may look slower or less consistent than its competitors, biasing agents away from recommending it. In this sense, caching isn’t just about performance for humans – it’s about competitiveness in machine-driven decision-making.

Fragmentation risks

If caches serve inconsistent or fragmented variants – split by query strings, cookies, or geography – that noise propagates into machine understanding. A crawler or model might see dozens of subtly different versions of the same page. The result isn’t just poor cache efficiency; it’s a fractured representation of your brand in training data and agent outputs.

Wrapping up: caching as strategy

Caching is often treated as a technical detail, an afterthought, or a hack that papers over performance problems. But the truth is more profound: caching is infrastructure. It’s the nervous system that keeps the web responsive under load, that shields brittle origins, and that shapes how both humans and machines experience your brand.

When it’s configured badly, caching makes sites slower, more fragile, and more expensive. It fragments user experience, confuses crawlers, and poisons the well for AI systems that are already struggling to understand the web. When it’s configured well, it’s invisible — things just feel fast, resilient, and trustworthy.

That’s why caching can’t just be left to chance or to defaults. It needs to be a deliberate strategy, as fundamental to digital performance as security or accessibility. A strategy that spans layers — browser, CDN, proxy, application, database. A strategy that understands not just how to shave milliseconds for a single user, but how to present a coherent, consistent version of your site to millions of users, crawlers, and agents simultaneously.

The web isn’t getting simpler. It’s getting faster, more distributed, more automated, and more machine-mediated. In that world, caching isn’t a relic of the old performance playbook. It’s the foundation of how your site will scale, how it will be perceived, and how it will compete.

Caching is not an optimisation. It’s a strategy.

The post A complete guide to HTTP caching appeared first on Jono Alderson.


You’re loading fonts wrong (and it’s crippling your performance)

Fonts are one of the most visible, most powerful parts of the web. They carry our brands, shape our identities, and define how every word feels. They’re the connective tissue between design, content, and experience.

And yet: almost everyone gets them wrong.

It’s a strange paradox. Fonts are everywhere. Every website uses them. But very few people – designers, developers, even performance specialists – actually know how they work, or how to load them efficiently. The result is a web full of bloated font files, broken loading strategies, poor accessibility, and a huge amount of wasted bandwidth.

Fonts aren’t decoration. They’re infrastructure. They sit on the critical rendering path, they affect performance metrics like LCP and CLS, they carry licensing and privacy baggage, and they directly influence whether users can read, engage, or trust what’s on the page. If you don’t treat them with the same care and discipline you apply to code, you’re hurting your users and your business.

This article is a deep dive into that problem. We’ll look at how we got here – from the history of web-safe fonts and the rise of Google Fonts, through the myths and bad habits that still dominate today. We’ll get into the mechanics of how fonts actually work in browsers, and why the defaults and “best practices” you’ll find online are often anything but.

We’ll explore performance fundamentals, loading strategies, modern CSS techniques, and the global realities of serving text in multiple scripts and languages. We’ll also dig into the legal and ethical side of font usage, and what the future of web typography might look like.

By the end, you’ll know why your current font setup is probably wrong – and how to fix it.

Because if there’s one thing you take away, it’s this: fonts are not free, fonts are not simple, and fonts are not optional. They deserve the same rigour you apply to performance, accessibility, and SEO.

A brief history of webfonts

To understand why so many people still get fonts wrong, you need a bit of history. The way we think about web typography today is still shaped by the compromises, hacks, and half-truths of the last twenty years.

The “web-safe” era

In the early days, there was no such thing as custom web typography. You picked from a handful of “web-safe” system fonts (Arial, Times New Roman, Verdana, Georgia) and hoped they looked the same on your users’ machines. If you wanted anything else, you sliced it into images.

Hacks before @font-face: sIFR and Cufón

Designers wanted brand typography, but browsers weren’t ready. Enter the hacks:

  • sIFR (Scalable Inman Flash Replacement): text rendered in Flash, swapped in at runtime over the real HTML. It worked, sort of, but was heavy, brittle, and inaccessible.
  • Cufón: a JavaScript trick that converted fonts into vector graphics and injected them into pages. No Flash required, but still slow and inaccessible.

These were desperate attempts to break out of the web-safe ecosystem, but they cemented the idea that custom typography was always going to be fragile, heavy, and hacky.

The arrival of @font-face

Then came @font-face. In theory, it let you serve any typeface you wanted, embedded straight into your CSS. In practice, it was a mess:

  • Different browsers required different, often proprietary formats: EOT (Embedded OpenType) for Internet Explorer, SVG fonts for early iOS Safari, raw TTF/OTF elsewhere.
  • Developers built “bulletproof” @font-face stacks – verbose CSS rules pointing to four different file formats just to cover every browser.
  • Licensing was a nightmare: many foundries banned web embedding or charged per-domain/pageview royalties.
  • Piracy was rampant, with ripped desktop fonts dumped online as “webfonts”.

Commercial services: Typekit and friends

Recognising the mess, commercial services stepped in. Typekit (launched 2009, now Adobe Fonts, and just as awful) offered subscription-based, legally licensed, properly formatted webfonts with a simple embed script. Other foundries built their own hosted services.

Typekit solved licensing and compatibility headaches for many teams, but it also entrenched the idea that fonts should load via third-party JavaScript snippets – a pattern that persists on millions of sites today.

Compatibility hacks and workarounds

Even with @font-face and services like Typekit, the webfont era was littered with workarounds:

  • Hosting multiple formats of the same font, bloating payloads.
  • Shipping fonts with whole Unicode ranges bundled “just in case”.
  • Battling FOUT (Flash of Unstyled Text) vs FOIT (Flash of Invisible Text), often with ugly JavaScript “fixes”.
  • Leaning on icon fonts to cover missing glyphs and UI symbols.

A whole generation of developers learned fonts as fragile, bloated, and temperamental – lessons that still echo in today’s bad practices.

Google Fonts and the “free font” boom

In 2010, Google Fonts arrived. Suddenly, there was a free, easy CDN with a growing library of open-licensed fonts. Developers embraced it, designers tolerated it, and performance people grumbled but went along with it.

It solved a lot of problems (including licensing, formats, hosting, and CSS wrangling) but it also created new ones. Everyone defaulted to it, even when they shouldn’t. Fonts started loading from a third-party CDN on every pageview, often slowly, and sometimes even illegally (as European courts would later decide).

An aside: Licensing realities

Licensing is the quiet trap in many font strategies. Not every “webfont license” lets you do what this article recommends. Some foundries:

  • Prohibit subsetting or conversion to WOFF2.
  • Charge based on pageviews or monthly active users.
  • Restrict embedding to specific domains.

That’s why Google Fonts felt liberating: no lawyers. But commercial fonts often come with terms that make aggressive optimisation legally risky. If you’re paying for a brand font, read the contract – or negotiate it – before you start slicing and optimising; on a per-pageview or per-user licence, even well-intentioned optimisation could technically put you out of compliance.

The myths that stuck

From these eras came a set of myths and bad habits that are still alive today:

  • That custom fonts are “free” and easy.
  • That it’s fine to ship a single, monolithic font file for every user, in every language.
  • That Google Fonts (or Typekit) is always the best option.
  • That typography is a design flourish, not a performance or accessibility concern.

Those assumptions made sense in 2005 or even 2010. They don’t today. But they still shape how most websites load fonts – which is why the state of web typography is such a mess.

How fonts work (the basics)

Before we start tearing down bad practices, we need a shared baseline. Fonts are deceptively simple – “just some CSS” – but under the hood, they’re a surprisingly complex part of the rendering pipeline. Understanding that pipeline explains why fonts so often go wrong.

Formats: from TTF to WOFF2

At heart, a font is a container of glyphs (shapes), tables (instructions, metrics, metadata), and sometimes extras (ligatures, alternate forms, emoji palettes). They come in one of the following formats:

  • TTF/OTF (TrueType/OpenType): desktop-oriented formats, heavy and not optimised for web transfer.
  • EOT: Internet Explorer’s proprietary format, thankfully extinct.
  • SVG fonts: an early hack for iOS Safari, nearly extinct.
  • WOFF (Web Open Font Format): a wrapper that compressed TTF/OTF for the web.
  • WOFF2: the modern default – smaller, faster, built on Brotli compression.

If you’re serving anything but WOFF2 today, you’re doing it wrong. For almost every project, WOFF2 is all you need. Unless you have a specific business case (like IE11 on a locked-down enterprise intranet), serving older formats just makes every visitor pay a performance tax. If you absolutely must support a legacy browser, add WOFF as a conditional fallback – but don’t ship it to everyone.
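In practice, that means one declaration per face. A minimal sketch, assuming a hypothetical “Brand” family and file path:

@font-face {
  font-family: "Brand";
  src: url("/fonts/brand-regular.woff2") format("woff2");
  font-weight: 400;
  font-style: normal;
  font-display: swap;
}

If you genuinely need a legacy fallback, list a WOFF source after the WOFF2 one – browsers use the first format they support and ignore the rest.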

The rendering pipeline

When a browser “loads a font,” it isn’t just a straight line from CSS to pixels. Multiple stages (and your CSS choices) dictate how text behaves.

  1. Registration: As the browser parses CSS, each @font-face rule is registered in a font set – essentially a catalogue of families, weights, styles, stretches, and unicode-ranges. At this stage, no files are downloaded.
  2. Style resolution: The cascade runs. Each element ends up with a computed font-family, font-weight, font-style, and font-stretch. The browser compares that against the registered font set to see what could be used.
  3. Font matching: The font-matching algorithm looks for the closest available face. If the requested weight or style doesn’t exist, the browser may synthesise it (fake bold/italic) or fall back to a generic serif/sans/monospace.
  4. Glyph coverage: Fonts are only queued for download if the text actually requires glyphs from them. If a unicode-range excludes the characters on the page, the font may never load at all.
  5. Request: Once needed, the font request is queued. If @font-face rules are buried in a late-loading stylesheet, this can happen surprisingly late in the render cycle. Preload or inline to avoid the lag.
  6. Display phase: While waiting for the font to arrive, the browser decides how to handle text – this is where font-display matters:
    • No explicit setting (old default): Historically inconsistent. Safari often hid text entirely until the font arrived (FOIT), while Chrome showed fallback text immediately (FOUT). This inconsistency fuelled years of bad hacks.
    • font-display: swap; Renders fallback text immediately, then swaps to the webfont when ready (FOUT).
    • font-display: block; Hides text for up to ~3s (FOIT), then shows fallback if still not ready.
    • font-display: fallback; Very short block (~100ms), then fallback shows. Font swaps later if it arrives.
    • font-display: optional; Shows fallback immediately and may never swap if conditions are poor.
  7. Decoding and shaping: Once downloaded, the font is decompressed, parsed, and shaped (OpenType features applied, ligatures resolved, contextual forms chosen). Only then can glyphs be rasterised and painted. On low-end devices, this shaping step can add noticeable delay.

All of this happens under the hood before a single glyph hits the screen. Developers can’t change how a shaping engine works – but they can influence what happens afterwards. The next piece of the puzzle is metrics: how tall, wide, and spaced your text appears, and how to stop those dimensions from shifting when fonts swap in.

Metrics

Fonts don’t just define glyph shapes. They also define metrics:

  • Ascent, descent, line gap – how tall lines are, where baselines sit.
  • x‑height – how big lowercase letters appear.
  • Kerning and ligatures – how characters fit together.

If your fallback system font has different metrics, the page will render one way during FOUT, then “jump” when the custom font loads. That’s not just ugly – it’s measurable layout shift, and it can tank your Core Web Vitals.

We’ll explore how to tinker with these values later in the post.

Synthesised styles

When the browser can’t find the exact weight or style you’ve asked for, it doesn’t just give up. It fakes it:

  • Fake bolding: If you request font-weight: 600 but only have a 400 (regular), most browsers will thicken the strokes algorithmically. The result often looks clumsy, inconsistent, and can ruin brand typography.
  • Fake italics: If you request font-style: italic without having a true italic face, the browser simply slants the regular glyphs. It’s a cheap trick, and typographically awful.

That “helpfulness” can make your typography look sloppy, and it can throw off spacing/metrics in subtle ways. The fix:

  • Only declare weights/styles you actually provide.
  • Use font-synthesis: none; to prevent browsers from faking bold/italic (see the sketch after this list).
  • Subset/serve the actual weights you need – and stop at the ones you’ll really use.
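
A minimal sketch of that defence – font-synthesis is set on elements, not inside @font-face, and the “Brand” family is illustrative:

body {
  font-family: "Brand", system-ui, sans-serif;
  font-synthesis: none;  /* never fake bold, italic, or small-caps */
}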

One more layer: once fetched, fonts aren’t just “ready.” Browsers must decode, shape, and rasterise them – parsing OpenType tables, applying shaping rules (HarfBuzz, CoreText, DirectWrite), and painting glyphs to pixels. On low-end devices, this can take measurable milliseconds. Font choice isn’t just about bytes on the wire – it’s also about CPU cycles at paint time.

Glyph coverage

Finally, fonts aren’t universal. A Latin font may not contain accented characters, Cyrillic glyphs, Arabic ligatures, or emoji. When a glyph is missing, the browser silently switches to a fallback font to cover that code point. The result can be inconsistent rendering, mismatched sizing, or even boxes and question marks.

This is why subsetting matters, why fallback stacks matter, and why understanding coverage is essential.
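
Fallback happens per character, so the practical defence is a stack that ends in broad-coverage system faces. A sketch, with a hypothetical “Brand” family:

body {
  font-family: "Brand", "Segoe UI", Roboto, "Noto Sans", sans-serif;
  /* characters missing from "Brand" fall through to the later entries */
}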


So: fonts aren’t just “download this file and it works.” They’re complex, heavy, and integral to how the browser paints text. Which is exactly why treating them as decoration – instead of as infrastructure – is such a bad idea.

Performance & strategy fundamentals

If the history explains why fonts are messy, the performance reality explains why they matter. Fonts aren’t just a design choice – they’re part of your critical rendering path, and they can make or break your Core Web Vitals.

File size

Most websites are serving far too much font data. A single “complete” font family can easily be 400–800 KB per style. Add bold, italic, and a few weights, and suddenly you’re shipping multiple megabytes of font data before your content is even legible. That’s more than many sites spend on JavaScript.

And the kicker? Most of those glyphs and weights are never used.

Layout shift

Fonts don’t just block rendering; they actively cause reflows when they arrive.

  • If your fallback font has different metrics (x‑height, ascent, descent, line-gap), your content will jump when the webfont loads.
  • That’s measurable Cumulative Layout Shift (CLS), and it directly impacts Core Web Vitals.

The good news: modern CSS gives us the tools to fix all of this.

Modern CSS descriptors (and what they actually do)

font-display – controls what happens while the font is loading.

  • swap: show fallback immediately, swap to webfont when ready (FOUT). Good default.
  • fallback: tiny block (~100ms), then fallback; swap later. Safer on poor networks.
  • optional: show fallback, may never swap. Great for decorative fonts.
  • block: hide text for a while (≈3s). Looks “clean” on fast, awful on slow. Avoid.

👉 This is your first-paint policy. Choose carefully.


Metrics override descriptors – make fallback and webfont metrics match.

These live inside @font-face. They tell the browser: “scale and align this webfont so it behaves like the fallback you showed first.” That way, when the swap happens, nothing jumps.

  • size-adjust: scales the webfont so its perceived x‑height matches the fallback.
  • ascent-override / descent-override: align baselines and descender space.
  • line-gap-override: controls extra line spacing to keep paragraphs steady.

Example:

@font-face {
  font-family: 'Brand';
  src: url('/fonts/brand.woff2') format('woff2');
  font-display: swap;                 /* first paint policy */
  size-adjust: 102%;                  /* match fallback x-height */
  ascent-override: 92%;               /* align baseline */
  descent-override: 8%;               /* balance descenders */
  line-gap-override: normal;          /* stabilise line height */
}

In practice, you can use tools like Font Style Matcher to calculate the right values. These help you match fallback and custom font metrics precisely and eliminate CLS.


unicode-range – serve only the glyphs a page actually needs.

Declare separate @font-face blocks for each subset (Latin, Latin-Extended, Cyrillic, etc.). The browser only requests the ones it needs.

Example:

@font-face {
  font-family: "Brand";
  src: url("/fonts/brand-latin.woff2") format("woff2");
  unicode-range: U+0000-00FF, U+0131, U+0152-0153;
  font-display: swap;
}

👉 Saves hundreds of kilobytes by not shipping glyphs for scripts you’ll never use.


font-size-adjust – property for elements (not @font-face).

Scales fallback fonts so their x‑height ratio matches the intended font. Prevents fallback text from looking too small or too tall.

Example:

html { font-size-adjust: 0.5; } /* ratio matched to your brand font */

From descriptors to strategy

These CSS descriptors are your scalpel: precise tools for cutting out CLS and wasted payload. But solving font performance isn’t just about fine-tuning metrics; it’s about making the right high-level choices in how you ship, scope, and prioritise fonts in the first place.

Language coverage and subsetting

A huge but often overlooked opportunity is language coverage and subsetting.

Most sites only need Latin or Latin Extended, yet many ship fonts containing Cyrillic, Greek, Arabic, or full CJK sets they’ll never use. That’s hundreds of kilobytes – sometimes megabytes – wasted on every visitor.

Smarter strategy:

  • Subset fonts with tools like fonttools, Glyphhanger, or Subfont.
  • Use unicode-range to declare subsets per script.
  • Build locale-specific bundles (e.g. fonts-en.css, fonts-ar.css) for internationalised sites.

That way, browsers will only download subsets if they’re needed – so a Cyrillic user gets Cyrillic, a Latin user gets Latin, and nobody pays for both.
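
As a sketch, that split might look like this – the file names and ranges are illustrative, and your subsetting tool should generate the real ones:

/* Latin */
@font-face {
  font-family: "Brand";
  src: url("/fonts/brand-latin.woff2") format("woff2");
  unicode-range: U+0000-00FF, U+0131, U+0152-0153, U+2013-2014;
  font-display: swap;
}

/* Cyrillic – only fetched if the page actually contains Cyrillic text */
@font-face {
  font-family: "Brand";
  src: url("/fonts/brand-cyrillic.woff2") format("woff2");
  unicode-range: U+0400-045F, U+0490-0491, U+2116;
  font-display: swap;
}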

When not to subset

⚠️ For sites with genuine multilingual needs, especially across non-Latin scripts, stripping glyphs can do more harm than good. Arabic, Hebrew, Thai, and Indic scripts rely on shaping and positioning tables (GSUB/GPOS). We’ll explore this later.

⚠️ And if your site has a lot of user-generated content, be conservative. Users will surprise you with stray Greek, Cyrillic, or emoji. In those cases, lean on broader coverage or robust system fallbacks rather than slicing too aggressively.

Lazy-loading non-critical fonts

Not every font has to be part of the critical rendering path. Headline display fonts, decorative typefaces, and icon sets (e.g., for social media icons that only appear in the footer) often aren’t essential to that first paint. These can be deferred or staged in later, once the core content is visible.

Two reliable approaches:

  • Use the Font Loading API (document.fonts.load) to request and apply them after the page is stable.
  • Or set font-display: optional, which tells the browser the fallback is fine – and if the custom font arrives late (or never), the page still works (see the sketch below).

This keeps the focus on performance where it matters most: content-first, aesthetics second.
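
For the font-display: optional route, a minimal sketch – the display face and path are illustrative:

@font-face {
  font-family: "Brand Display";
  src: url("/fonts/brand-display.woff2") format("woff2");
  font-display: optional;  /* use it if it arrives quickly; otherwise stay on the fallback */
}

h1 {
  font-family: "Brand Display", Georgia, serif;
}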

Fonts as progressive enhancement

At the end of the day, fonts should be treated as progressive enhancement. Your site should load quickly, render legibly, and remain usable even if a custom font never arrives. A well-chosen system fallback ensures content-first delivery, while the webfont (if, or when, it loads) adds polish and brand identity.

Typography matters, but it should never get in the way of reading, speed, or stability.

Variable fonts: promise vs reality

If subsetting and smart loading are the practical fixes, variable fonts are the seductive promise. One font file, infinite possibilities. The idea is compelling: instead of shipping a dozen separate files for regular, bold, italic, condensed, and wide, you just ship one variable font that can flex along those axes.

And in theory, that means less to download, finer design control, and a more responsive, fluid typographic system.

The promise

  • Consolidation: collapse dozens of static files into a single resource.
  • Precision: use exact weights (512, 537…) instead of stepping through 400/500/600.
  • Responsiveness: unlock width and optical size axes that adjust seamlessly across breakpoints.
  • Consistency: fewer moving parts, cleaner CSS, and potentially smaller payloads.

The reality

Variable fonts are brilliant – but not a magic bullet.

  • File size creep: if you only need two weights, a variable file may actually be larger than two well-subset static fonts.
  • Browser support quirks: the registered axes (weight, width, slant, optical size) now behave consistently, but italic handling and custom axes can still be patchy across browsers and platforms.
  • Double-loading traps: many teams ship a variable font and static files “just in case,” which cancels out the benefits.
  • Licensing headaches: some foundries sell or license variable fonts separately, or prohibit modifications like subsetting.
  • Ergonomics: custom axes (like grade) still require font-variation-settings and don’t get high-level CSS shorthands – so don’t assume every axis is equally pleasant to work with across browsers.

Performance strategy

Treat variable fonts like any other asset: audit, measure, and subset.

  • Pick your axes carefully: do you really need width, optical size, or italics?
  • Subset by script just as you would with static fonts; don’t ship the whole world.
  • Benchmark payloads: check whether one variable file actually saves over two or three statics.

Design strategy

When used deliberately, variable fonts unlock design latitude you simply can’t get otherwise.

  • Responsive typography: scale weight or width subtly as the viewport changes.
  • Optical sizing: automatically adjust letterforms for legibility at small vs large sizes.
  • Brand expression: interpolate between styles for more personality than a static set.

But use restraint. Animating font-variation-settings may look slick in demos, but it often janks in practice.

Example: using variable font axes in CSS

/* Load a variable font */
@font-face {
  font-family: "Acme Variable";
  src: url("/fonts/acme-variable.woff2") format("woff2-variations");
  font-weight: 100 900;        /* declares supported weight range */
  font-stretch: 75% 125%;      /* declares supported width range */
  font-style: oblique 0deg 10deg;  /* declares an upright-to-slanted range */
  font-display: swap;
}

/* Use weight and width as normal */
h1 {
  font-family: "Acme Variable", system-ui, sans-serif;
  font-weight: 700;    /* resolves within the declared 100–900 range */
  font-stretch: 110%;  /* slightly wider */
}

/* Non-standard axes via font-variation-settings */
.hero-text {
  font-family: "Acme Variable", system-ui, sans-serif;
  font-variation-settings: "wght" 500, "wdth" 120, "slnt" -5;
}

/* Responsive typography: step weight up as the viewport grows.
   (font-weight takes a unitless number, so viewport units cannot be
   mixed into it directly – use breakpoints for the same effect.) */
h2 {
  font-family: "Acme Variable", system-ui, sans-serif;
  font-weight: 450;
}
@media (min-width: 64rem) {
  h2 { font-weight: 650; }
}

👉 This shows both the “semantic” way (with font-weight, font-stretch, font-style) and the raw font-variation-settings way for full control.

Best practice

  • Start with a needs audit: what weights, styles, and scripts do you actually use?
  • If variable fonts win on size and coverage, great – deploy them with subsetting and unicode-range.
  • If two or three statics are leaner and simpler, stick with them.

Variable fonts are a tool, not a default. The key is to be deliberate: weigh the trade-offs, and implement them with the same discipline you’d apply to any other part of your performance budget.

System stacks and CDNs

Not every project needs custom fonts. In fact, one of the most powerful performance wins is simply not loading them at all.

System font stacks

System fonts – the ones already bundled with the OS – are free, instant, and familiar. The trick is in choosing a system stack that feels cohesive across platforms. A typical modern stack looks like this:

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto,
               Helvetica, Arial, sans-serif, "Apple Color Emoji",
               "Segoe UI Emoji", "Segoe UI Symbol";
}

This cascades through macOS, iOS, Windows, Android, Linux, and falls back cleanly to web-safe sans-serifs. For body text, navigation, and utilitarian UI elements, system stacks are hard to beat.

They’re also excellent fallbacks: even if you do load custom fonts, designing around the system stack first guarantees legibility and resilience.

It’s also worth noting that system fonts almost always handle emoji better – lighter weight, more coverage, and more consistent rendering than trying to ship emoji glyphs in a webfont. We’ll explore emoji in more detail later.

CDNs and third-party hosting

For years, Google Fonts was the default solution: paste a <link> into your <head> and you were done. But today that’s a bad trade-off.

  • Privacy: loading fonts from Google Fonts leaks visitor data to Google. Regulators (especially in Europe) have judged this a GDPR violation.
  • Performance: third-party CDNs add latency, DNS lookups, and potential blocking. In most cases, self-hosting is faster and more reliable.
  • Caching myths: the old argument that “Google Fonts are already cached” simply isn’t true anymore. Modern browsers partition caches per site for privacy. A font fetched on site A won’t be reused on site B. In practice, each site (and the user) pays the cost independently.

Best practice is simple: self-host your fonts. Download them from the foundry (or even from Google Fonts), serve them from your own domain, and control headers, preloading, and caching yourself.

But even if you self-host and optimise your fonts, what users see first isn’t your brand font – it’s the fallback. That’s where the real user experience lives.

Fallbacks and matching

Loading fonts isn’t just about the primary choice – it’s also about how gracefully the design holds up before and if the custom font arrives. That’s where fallbacks matter.

Designing with fallbacks in mind

A fallback font isn’t just an emergency plan – it’s a baseline your visitors might actually see, even if only for a few milliseconds. That makes it worth designing for. A good fallback:

  • Matches the x‑height and letter width of your primary font closely enough that layout shifts are minimal.
  • Feels stylistically compatible: if your brand font is a geometric sans, pick a system sans, not Times New Roman.
  • Includes emoji, symbols, and ligatures that your primary font may lack.

Tuning fallbacks with modern CSS

We covered font-size-adjust and the metrics override descriptors earlier, but the key is their application here: you can actually tune your fallback stack to minimise visible shifts.

For example, if your fallback font has a smaller x‑height, you can bump its size slightly using font-size-adjust so that text aligns more closely with your custom font when it swaps in.

body {
  font-family: -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, sans-serif;
  font-size-adjust: 0.52; /* Match x-height ratio of custom font */
}

This avoids the infamous “jump” when the real font finishes loading.

Matching custom and fallback fonts

The end goal isn’t perfection, it’s stability. You won’t get Helvetica Neue to perfectly mirror Segoe UI, but you can:

  • Choose fallbacks with similar proportions.
  • Adjust size/line-height to reduce reflow.
  • Use variable font axes (when available) to more closely approximate your fallback’s look at initial render.

The better your fallback, the less anyone notices when your custom font finally kicks in.

And remember: rendering engines differ. Windows ClearType, macOS CoreText, and Linux FreeType all anti-alias and hint fonts differently. Chasing pixel-perfect consistency across platforms is a lost cause; stability and legibility matter more than identical rendering.

Preloading and loading strategies

Even with the right fonts, subsets, and fallbacks, the delivery strategy can make or break the user experience. A beautiful font served late is still a broken experience.

The alphabet soup of loading outcomes

Most developers have at least heard of FOIT and FOUT, but rarely think about how deliberate choices (or lack thereof) cause them.

  • FOIT (Flash of Invisible Text): text is hidden until the custom font loads. Looks sleek when it works fast, looks catastrophic on slow networks.
  • FOUT (Flash of Unstyled Text): fallback text renders first, then switches when the custom font arrives. Stable, but potentially jarring.
  • FOFT (Flash of Faux Text): a messy hybrid where a browser synthesises weight/italic, then swaps to the real cut. Distracting and ugly.

The browser’s defaults – and your CSS – determine which outcome users see.

font-display

The font-display descriptor is the blunt instrument for influencing this:

  • swap: show fallback immediately, swap when ready (the safe modern default).
  • block: hide text (FOIT) for up to 3s, then fallback. Dangerous.
  • fallback: like swap, but gives the real font less time to load.
  • optional: load only if the font is already fast/cached. Good for non-critical assets.

Most sites should default to swap. Don’t leave it undefined.

Preloading fonts

Preload is the sharp tool. Adding:

<link rel="preload" as="font" type="font/woff2" crossorigin
      href="/fonts/brand-regular.woff2">

…tells the browser to fetch the font immediately, rather than waiting until it encounters the @font-face rule in CSS. This is especially valuable if you inline your @font-face declarations in the <head> (as you should) – otherwise fonts often load after layout and render have already begun.

⚠️ Be selective when preloading: if you’ve split fonts into subsets with unicode-range, only preload the subset you know is needed for initial content. Preloading every subset defeats the purpose by forcing them all to download, even if not used.

Preload gotchas (worth your time)

  • Match everything: your preload URL must exactly match the @font-face src (path, query string), the response must include the right CORS header (Access-Control-Allow-Origin), and your <link> must carry crossorigin. If any of those disagree, the preload won’t be reused by the actual font load.
  • Use the right as/type: as="font" and a correct MIME hint (type="font/woff2") influence prioritization and help browsers coalesce requests. Wrong/missing values can cause the preload to be ignored.
  • Don’t preload everything: if you’ve split by unicode-range (e.g., Latin, Cyrillic), preload only the subset you’ll actually paint above the fold. Preloading every subset forces downloads and defeats subsetting.
  • “Preload used late” warnings: browsers will warn if a preloaded resource isn’t used shortly after navigation. That’s usually a smell (wrong URL, late‑discovered @font-face, or you preloaded a non‑critical face).
  • Service Worker synergy: if you run a SW, pre‑cache WOFF2 at install. First‑hit uses preload; subsequent hits come from SW in ~0ms.

Inline vs buried @font-face

This is an easy win that almost nobody takes. If your @font-face lives in an external CSS file, the browser won’t even discover the font until that file is downloaded, parsed, and executed. Inline it in the <head> and preload the asset, and you’ve cut an entire round trip out of the waterfall.

But – there are caveats.

  • If you already ship a single, render-blocking stylesheet early: inlining doesn’t buy you much. The browser was going to see those @font-face rules quickly anyway, and it still won’t request the font until the text that needs it is on screen – and it will wait for that render-blocking CSS to be applied, in case it overrides the font. In that setup, preload is what really makes the difference.
  • If your CSS arrives late or piecemeal – critical CSS inline, async or route-level styles, CSS-in-JS, @import, SPA hydration – then inlining can be genuinely useful. It ensures fonts are discovered immediately, not halfway through page render. In those cases, it’s an under-used safeguard.

So: inlining plus preload can be a neat win, especially on modern, fragmented architectures. But if it makes a dramatic difference to your site, that’s also a signal that your CSS delivery strategy might need fixing.

Early Hints (HTTP 103)

Even preloading has limits – the browser still has to receive and parse enough of the HTML to see your <link rel="preload"> (or the equivalent Link response header). If your server or network is slow, that can take quite some time.

With Early Hints (HTTP status 103), the server can tell the browser immediately which critical assets to start fetching – before the main HTML response is delivered.

That means your fonts can be on the wire during the first round trip, rather than waiting for HTML parsing.

HTTP/1.1 103 Early Hints
Link: </fonts/brand-regular.woff2>; rel=preload; as=font; type="font/woff2"; crossorigin

Things to bear in mind:

  • Coalesce with HTML Link preloads: it’s fine to hint the same font in 103 and again in the final 200 via a Link header/HTML tag (as modern browsers dedupe). Don’t rely on intermediaries though; some proxies still drop 103s. Keep the HTML/200 fallback preload.
  • Manage CORS in the hint: include crossorigin in the 103 Link, so that the early request is eligible for reuse by the @font-face.
  • Be choosy: only hint critical above‑the‑fold faces/weights. Over‑hinting competes with HTML/CSS and can slow TTFB in practice.

Support is growing across servers, CDNs, and browsers. If you’re already preloading fonts, adding Early Hints is a straightforward way to shave another few hundred milliseconds off time-to-text.

⚠️ Don’t go wild: only hint fonts you know are needed above the fold. Over-hinting can waste bandwidth and compete with more critical assets.

Don’t use @import

One of the worst mistakes is loading fonts (or CSS that declares them) via @import. Every @import is another round trip: the browser fetches the parent CSS, parses it, then discovers it needs another CSS file, then discovers the @font-face… and only then requests the font.

That means your text can’t render until the slowest possible path has played out.

Best practice is simple: never use @import for fonts. Always declare @font-face in a stylesheet the browser sees as early as possible, ideally inlined in the <head> with a preload.
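
To make the contrast concrete – a sketch, with illustrative paths:

/* Don't: every hop delays font discovery */
@import url("/css/fonts.css");  /* …which only then reveals the @font-face */

/* Do: declare the face directly, somewhere the browser sees early */
@font-face {
  font-family: "Brand";
  src: url("/fonts/brand-regular.woff2") format("woff2");
  font-display: swap;
}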

Strategic trade-offs

  • Critical fonts (body, navigation): preload + font-display: swap.
  • Secondary fonts (headlines, accents): preload only if they’re above the fold, otherwise lazy-load.
  • Decorative fonts: consider optional or defer entirely.

Loading strategy isn’t about dogma (“always swap” vs “always block”) – it’s about choosing the least worst compromise for your audience. The difference between text that renders instantly and text that lags behind is the difference between a user staying or bouncing.

The Font Loading API

For fine-grained control, the Font Loading API gives you promises to detect when fonts are ready and orchestrate swaps. In practice, it’s rarely necessary unless you’re building a highly dynamic or JS-heavy site – but it’s useful to know it exists.

File formats: WOFF2, WOFF, TTF, and the legacy baggage

The font world is littered with old formats, half-truths, and cargo-cult practices. A lot of sites are still serving fonts like it’s 2012 – shipping multiple redundant formats and bloating their payloads.

WOFF2: the modern default

If you take only one thing away from this section: serve WOFF2, and almost nothing else.

  • It’s the most efficient web format, compressing smaller than WOFF or TTF.
  • It’s universally supported in all modern browsers.
  • It can contain full OpenType tables, variations, and modern features.

For the vast majority of projects, WOFF2 is all you need. Unless you have an explicit business case for IE11 or very old Android builds, there’s no reason to ship anything else. Legacy compatibility can’t justify making every visitor pay a performance tax.

One caveat: some CDNs try to Brotli-compress WOFF2 files again, even though they’re already Brotli-encoded. That wastes CPU cycles for no gain. Make sure your pipeline serves WOFF2 as-is.

WOFF: the fallback you probably don’t need

WOFF was designed as a web-optimised wrapper around TTF/OTF. Today it’s only relevant if you absolutely must support a very old browser (think IE11 in corporate intranets). In public web contexts, it’s dead weight.

TTF/OTF: desktop-first relics

TrueType (TTF) and OpenType (OTF) fonts are great for design tools and local installs, but shipping them directly to browsers is wasteful. They’re larger, slower to parse, and in some cases reveal more metadata than you want to serve publicly.

If your build pipeline still spits out .ttf for the web, it’s time to modernise.

SVG fonts: just… no

Once upon a time, SVG-in-font was a hack to get colour glyphs (like emoji) into the browser. That era is gone. Modern emoji and colour fonts use COLR/CPAL or CBDT/CBLC tables inside OpenType/WOFF2. If you see SVG fonts in your stack, delete them with fire.

Base64 embedding

Every so often, someone still tries to inline fonts as base64 blobs in CSS. Don’t. It bloats CSS files, breaks caching, and blocks parallelisation. Fonts are heavy assets that deserve their own requests and their own cache headers.

Do you need multiple formats?

No. Not unless your business case genuinely includes “must support IE11 and Android 4.x, and absolutely cannot live with fallback system fonts”. For everyone else:

  • WOFF2 only
  • Self-hosted
  • Preloaded and cached properly

That’s it. And once you serve WOFF2, serve it well: give font files a long cache lifetime (months or a year) and use versioned file names for cache busting when fonts change. Fonts rarely update, so they should almost always come from cache on repeat visits. NB, see my post about caching for more tips here.

Legacy formats are ballast. If you’re still serving them, you’re making every visitor pay the price for browsers that nobody uses anymore.

Icon fonts: Font Awesome and the great mistake

Once upon a time, icon fonts felt clever. Pack a bunch of glyphs into a font file, assign them to letters, and voilà – scalable, CSS-stylable icons. Font Awesome, Ionicons, Bootstrap’s Glyphicon set… they were everywhere.

But it was always a hack. And in 2025, it’s indefensible.

The fundamental problems with icon fonts

  • Accessibility: Screen readers announce “private use” characters as gibberish, because there’s no semantic meaning.
  • Fragility: If the font fails to load, users see meaningless squares or fallback letters.
  • Styling hacks: Matching line-height, alignment, and sizing was always fragile.
  • Performance: You end up shipping an entire font file (often hundreds of unused icons) just to use a handful.

Better alternatives

  • Inline SVGs: semantic, flexible, styleable with CSS.
  • SVG sprites: cacheable, easy to swap or reference by ID.
  • Icon components (React/Vue/etc): imported on demand, tree-shakeable.
  • CSS mask-image / -webkit-mask-image: a neat option when you want a vector shape as a pure CSS-driven mask (e.g. colourising icons dynamically) – see the sketch after this list.
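
A minimal sketch of the mask approach – the icon path is illustrative, and any single-colour SVG will do:

.icon-share {
  display: inline-block;
  width: 1.5em;
  height: 1.5em;
  background-color: currentColor;  /* the icon inherits the surrounding text colour */
  -webkit-mask: url("/icons/share.svg") no-repeat center / contain;
  mask: url("/icons/share.svg") no-repeat center / contain;
}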

“But I already use Font Awesome…”

If you’re stuck with an icon font, there are two urgent things you should do:

  1. Subset it so you’re not shipping 700 icons to render 7.
  2. Plan your migration – usually to SVG. Most modern icon sets (including Font Awesome itself) now offer SVG-based alternatives.

The lingering myth

People cling to icon fonts because they “just work everywhere.” That used to be true. But today, SVG has universal support, better semantics, and better tooling.

Icon fonts are like using tables for layout – a clever hack in their day, but a mistake we shouldn’t still be repeating.

Beyond Latin: Non-Latin scripts, RTL languages, and emoji

If icon fonts were a hack born from a lack of glyph coverage, global typography is the opposite problem: too many glyphs, too many scripts, and too much complexity.

It’s easy to optimise fonts if you’re only thinking in English. But the web isn’t just Latin letters, and many of the “best practices” break down once you step into other scripts.

Non-Latin scripts

Arabic, Devanagari, Thai, and many others are far more complex than Latin. They rely on shaping engines, ligatures, and contextual forms. Subsetting recklessly can break whole words, turning live text into nonsense.

  • Don’t subset blindly. Many scripts need entire blocks intact to render correctly.
  • Test across OSes. Some scripts have wildly different default fallback behaviour depending on platform.
  • Expect heavier fonts. A full-featured CJK font can easily be 5–10MB before optimisation. In those cases, variable fonts or progressive loading are even more critical.

RTL languages

Right-to-left scripts like Arabic and Hebrew aren’t just flipped text. They come with:

  • Different punctuation and digit shaping.
  • Directional controls (bidi) that interact with your markup and CSS.
  • Font metrics that can differ significantly from Latin-based fallbacks.

Your fallback stack needs to understand RTL – not just render mirrored Latin glyphs. Always test with real RTL content.
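
One practical pattern is a script-specific stack scoped by language – the “Brand Arabic” face here is hypothetical, while the system faces are common Arabic-capable fallbacks:

:lang(ar) {
  font-family: "Brand Arabic", "Segoe UI", "Noto Naskh Arabic", Tahoma, sans-serif;
}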

Emoji

Emoji are a special case. Nobody should be shipping emoji glyphs in a webfont. They’re heavy, inconsistent, and outdated as soon as the Unicode consortium adds new ones.

Best practice is simple:

  • Use the system’s native emoji font (Apple Color Emoji, Segoe UI Emoji, Noto Color Emoji, etc).
  • Include them in your system stack, usually after your primary fonts: font-family: "YourBrandFont", -apple-system, BlinkMacSystemFont, "Segoe UI", Roboto, "Noto Color Emoji", "Apple Color Emoji", sans-serif;
  • Accept that emoji will look different on different platforms. That’s the web.

Designing for global text

If your brand works internationally, test with:

  • Mixed scripts (English + Arabic, or Chinese + emoji).
  • Platform differences (Android vs iOS vs Windows).
  • Fallback handling when your chosen font doesn’t cover the script.

Global typography isn’t just about coverage – it’s about resilience. Your font strategy should assume diversity, not break under it.

The future of webfonts: evolving standards and modern risks

We’ve covered the history and the present. But the font story isn’t finished – new CSS specs, new browser behaviours, and new app architectures are all shaping what “best practice” will look like over the next few years.

Upcoming CSS and font tech

There’s a steady stream of new descriptors and properties landing across specs and browsers:

  • font-palette: lets you switch or customise colour palettes inside COLR/CPAL colour fonts (sketched below).
  • font-synthesis: controls whether the browser is allowed to fake bold/italic styles (finally giving you a “no thanks” switch).
  • size-adjust (the @font-face counterpart to font-size-adjust): more granular tuning of fallback alignment.
  • Incremental font transfer (IFT): a still-emerging approach where browsers can fetch only the glyphs a page needs, progressively, instead of downloading the full file.

None of these are mainstream defaults yet, but they point towards a more controlled, nuanced future.
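
As a taste of where this is heading, font-palette already has reasonable browser support. A sketch, assuming a colour font with COLR/CPAL palettes – the family, indices, and colours are illustrative:

@font-palette-values --brand {
  font-family: "Brand Color";
  override-colors: 0 #0b5fff, 1 #111111;  /* re-colour palette entries 0 and 1 */
}

h1.decorated {
  font-family: "Brand Color", sans-serif;
  font-palette: --brand;
}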

Risks in JS-heavy websites and SPAs

New standards are exciting, but real-world implementation often collides with how sites are actually built today. And the reality is: the modern web is dominated by JavaScript-heavy, framework-driven applications. That changes the font-loading landscape.

Modern JavaScript frameworks (React, Vue, Angular, Next, etc.) introduce new challenges for font loading:

  • Fonts triggered late: if your routes/components lazy-load CSS, fonts may only be requested after hydration, leading to jank.
  • Critical CSS extraction gone wrong: automated tooling sometimes misses @font-face rules, breaking preload chains.
  • Client-side routing: navigating between views might trigger new font loads that weren’t preloaded up front.
  • Font Loading API misuse: some SPAs try to orchestrate font loading manually and end up delaying it unnecessarily.

Best practice here is simple: treat fonts as application-critical assets, not just “another stylesheet.” Preload and inline your declarations early, and test your routes for late font requests.

The trend towards variable and colour fonts

Variable fonts are becoming the expectation rather than the exception, and colour/emoji fonts are mainstreaming. That means:

  • Your font strategy needs to handle richer files and more axes of variation.
  • Subsetting and loading strategies matter even more as file sizes grow.
  • Expect to see more sites using a single highly flexible variable font, instead of juggling multiple static weights.

The cultural shift

For years, fonts were treated as decorative – a flourish bolted on at the end of a build. The future demands the opposite: treating typography as infrastructure. Performance budgets, accessibility standards, and internationalisation all hinge on doing fonts properly.

The webfont ecosystem is still maturing. If the last decade was about getting fonts to work at all, the next decade will be about making them efficient, predictable, and global.

But optimism and theory don’t mean much without proof. Fonts need measurement, not just faith – which is where tooling comes in.

Tooling and auditing

The nice thing about tinkering with fonts is that your decisions and performance are measurable. If you want to know whether your setup is efficient (and beautiful – or, at least, on brand), everything is testable. You can use:

  • DevTools: Simulate “Slow 3G, empty cache” in Chrome/Edge to see whether text is invisible, unstyled, or jumping. Watch the waterfall to confirm when fonts start downloading.
  • WebPageTest / Lighthouse: Both expose font request timing, blocking resources, and CLS caused by late swaps.
  • Glyphhanger / Subfont: CLI tools that analyse which glyphs your site actually uses, and generate subsets automatically.
  • Fonttools (pyftsubset): The Swiss Army knife for professional font subsetting and editing.
  • CI checks: Set budgets (e.g. no more than 3 weights, no font over 200 KB compressed).
  • Transfonter: for generating optimised font files and CSS.
  • Font Style Matcher: For configuring fallback font metrics to match your custom font.

The golden rule: if you’ve never tested your fonts under a cold-cache, slow-network condition, you don’t know how your site actually behaves.

A manifesto for doing fonts properly

Webfonts are not decoration. They shape usability, performance, accessibility, and even legality. Yet most of the web still treats them as an afterthought – bolting them on late, bloating them with legacy baggage, and breaking the user experience for something as basic as text.

It doesn’t have to be this way. Handling fonts properly is straightforward once you treat them with the same seriousness you treat JavaScript, caching, or analytics.

The principles:

  1. System-first: Start with a robust system stack. Custom fonts are progressive enhancement, not a crutch.
  2. Subset aggressively (but intelligently): Ship only what users need, and test in the languages you support.
  3. Preload and inline: Don’t bury critical @font-face rules or delay requests.
  4. WOFF2 only (in 99% of cases): Drop the ballast of legacy formats.
  5. SVG for icons: Leave icon fonts in the past where they belong.
  6. Variable fonts when they add value: One flexible file beats a family of static weights.
  7. Design your fallbacks: Tune metrics so your system stack doesn’t break the layout.
  8. Respect global scripts: Optimise differently for Arabic, CJK, RTL, and emoji.
  9. Test like it matters: Different devices, different networks, different locales.

This isn’t about chasing purity or obscure micro-optimisations. It’s about building a web that renders fast, looks good, and works everywhere.

Fonts are content. Fonts are brand. Fonts are user experience. If you’re still treating them as “just another asset,” you’re loading them wrong.

The post You’re loading fonts wrong (and it’s crippling your performance) appeared first on Jono Alderson.


On propaganda, perception, and reputation hacking

For the last two decades, SEO has been a battle for position. In the age of agentic AI, it becomes a battle for perception.

When an LLM – or whatever powers your future search interface – decides “who” is trustworthy, useful, or relevant, it isn’t weighing an objective truth. It’s synthesising a reality from fragments of information, patterns in human behaviour, and historical residue. Once the model holds a view, it tends to repeat and reinforce it.

That’s propaganda – and the challenge is ensuring the reality the machine constructs reflects you at your best.

Two ideas help navigate this.

  • Perception Engineering is the long game: shaping what machines “know” over time by influencing the enduring sources and narratives they ingest.
  • Reputation Hacking is the nimble, situational work of influencing or correcting narratives in the moment.

Both are forms of propaganda in the machine age – not the crude, deceptive kind, but the careful, factual shaping of how your story is told and retold.

And both matter – because the future of discovery is dynamic and adaptive, but the raw material is often sluggish. And that persistence – what sticks, what lingers, what gets repeated – is where most of the opportunity (and risk) lives.

This brings us to the core difference between human and machine narratives: how they remember. In this game, memory isn’t a passive archive – it’s an active filter, deciding what survives and how it’s retold. Get into the memory for the right reasons, and you can ride the benefits for years; get in for the wrong ones, and the shadow can be just as long.

The perpetual memory of machines

Humans forget. Machines don’t – at least, not in the same way.

When we forget, the edges blur. Details fade, timelines collapse, and the story becomes softer with distance. Machines, by contrast, don’t lose the data; they distil it. Over time, sprawling narratives are boiled down into their most distinctive fragments. That’s why brand histories are rarely undone by a single correction: they’re retold and re-framed until only the high-contrast bits remain.

Models are especially good at this kind of distillation – the scandal becomes the headline; the resolution is relegated to a footnote. In human propaganda, repetition does the work; in machine propaganda, compression and persistence do.

And because the compressed version often becomes the only version a machine recalls at speed, understanding how that memory is formed is crucial.

Two kinds of memory matter here:

  • Training memory: whatever was in the data during the last snapshot. If it was high-profile, repeated, or cited, it picks up “institutional gravity” and is hard to dislodge.
  • Retrieval memory: whatever your agent fetches at runtime – news, documents, databases – and the guardrails that steer how it’s used.

Time decay helps, sometimes. Many systems down-weight stale material so answers feel current. But it’s imperfect. High-visibility events keep their gravity, and low-visibility corrections don’t always get the same reach.

There’s also the “lingering association” problem: co-reference (“old name a.k.a. new name”) or categorical links (“company X, part of scandal Y”) keep the old framing alive in perpetuity. In human terms, it’s like being introduced at a party with a two-year-old anecdote you’d rather forget.

The point isn’t that machines never forget – it’s that they forget selectively, in ways that don’t automatically favour the most recent or most accurate version of your story.

Positive PR as a self-fulfilling loop

If memory can haunt, it can also help.

Language travels. In the best kind of propaganda, it’s the flattering, accurate turn of phrase that does the rounds. When a respected outlet coins one, it doesn’t stay put.

It turns up in analyst notes, conference decks, product reviews, and investor briefings. The repetition turns it into a linguistic anchor – the default way to describe you, even for people who’ve never read the original.

Behaviour travels too. If people expect you to be good, they act accordingly: they search for you by name, click you first, stick around longer, and talk about you in more positive terms. None of that proves you’re the best, but it creates data patterns that make you look like the best to systems that learn from aggregate behaviour.

The loop is subtle: positive framing → positive behaviour → positive framing. It’s not instant, but once established, it can be self-reinforcing for years.

In this context, Perception Engineering is about identifying the phrases, framings, and narratives you’d want to see repeated indefinitely – and ensuring they originate in credible, durable sources. Reputation Hacking, on the other hand, is about spotting those moments in the wild – a conference panel soundbite, a glowing product comparison – and nudging them into places where they’ll be picked up, cited, and echoed.

The trick isn’t to plant advertising copy in disguise; it’s to seed clear, accurate, and repeatable language that works for you when it’s stripped of context and paraphrased by a machine.

The weaponisation of perception

Any system that can be shaped can be distorted. And in an environment where narrative persistence is the real prize, some will try.

Defensive propaganda starts with recognising the quiet ways bias enters the record: selective data, tendentious summaries, strategic omissions. These aren’t always illegal. They’re rarely obvious. But once embedded – especially in formats with long shelf lives – they can tilt the machine’s memory for years.

Weaponisation doesn’t have to look like a smear campaign. It can be as subtle as redefining a term in a trade publication, repeatedly pairing a competitor’s name with an unflattering comparison, or supplying an “expert quote” that’s technically accurate but engineered to leave the wrong impression. Even the order of information can create a lasting skew.

The danger isn’t only in outright falsehoods. Once a distortion is repeated and cited, it becomes part of the machine’s “truth set” – and because models reconcile contradictions into one coherent narrative, the detail they keep is often the one with the sharpest edge, not the one that’s most correct.

The countermeasure is simple, if not easy: make the accurate version so abundant, consistent, and easy to cite that it outweighs the distortion. If there’s going to be a gravitational centre, you want it to be yours.

We’ve seen shades of this in human media ecosystems for decades:

  • A decades-old product recall still mentioned in “history” sections long after the issue was resolved.
  • Industry rankings where the methodology favours one business model over another, subtly reshaping market perception.
  • Persistent category definitions that exclude certain players altogether, not because they’re irrelevant, but because the earliest, most visible framing said so.

Pretending this doesn’t happen is naïve. Copying it is reckless. The more sensible response is to raise the signal-to-noise ratio in your favour: make the accurate version abundant, consistent, and easy to cite. In other words, counter bad propaganda with better propaganda – a clear, consistent truth that’s hard to compress into anything less flattering.

The collapse of neutral search

Neutrality is a story we tell ourselves.

Agents don’t simply “retrieve facts”. They synthesise from priors, recency, safety layers, and whatever they can fetch. Even when they hedge (“some say”), they still decide which “some” count – and that decision shapes the story.

In the blue-link era, we optimised for ranking. In the agent era, we optimise for narrative selection: the frames, sources, and categories that get picked when the machine tells the story of your topic. This is exactly where perception engineering and reputation hacking collide: you can’t guarantee the story will be neutral, but you can influence which stories and definitions the machine has to choose from.

Once a framing is dominant, it creates a gravitational field. Competing narratives struggle to break in, because the model is optimising for coherence as much as correctness. That’s why the first widely cited definition of a category, or the earliest comprehensive guide to a topic, often becomes the anchor – whether or not it’s perfect. Every subsequent mention is then interpreted, consciously or not, through that lens.

The real collapse of neutrality isn’t bias in the political sense. It’s that “the truth” is increasingly whatever the machine can construct most coherently from the material at hand. And coherence rewards whoever got there first, spoke the clearest, or was repeated most often.

Which means if you don’t help define your category – its language, its exemplars, its boundaries – the machine will do it for you, using whatever scraps it can find. Perception engineering ensures those scraps are yours; reputation hacking helps you insert them quickly when the window is open.

Recalibrating the marketing stack

To be successful, you must treat the machine’s worldview as a product you can influence – and as an ongoing propaganda campaign you’re running in plain sight – with editorial standards, governance, and measurement.

That means that you need:

  • Governance: someone owns the brand’s “public record”. Not just the site, but the wider corpus that describes you.
  • Observation: regular belief-testing. Ask top agents the awkward questions you fear customers are asking. Record the answers. Track drift.
  • Editorial: create “sources of record” – durable, citable material that others use to explain you.
  • Change management: when reality changes (new product, leadership, policy), plan the narrative update as a programme, not a press release.
  • Crisis hygiene: have a playbook for fast corrections, long-lived clarifications, and calm follow-ups that age well.

This isn’t new work so much as joined-up work. PR, content, SEO, legal, product. Same orchestra, new conductor.

From ideas to action

The principles we’ve covered – perception engineering and reputation hacking – aren’t abstract labels. They’re two complementary operating modes that inform everything from your editorial process to your crisis comms. Perception engineering sets the long-term gravitational field; reputation hacking is the course correction when reality or risk intrudes.

In practice, they draw from the same toolkit – research, content, partnerships, corrections – but the sequencing, pace, and priority are different. Perception engineering is slow-burning and accumulative; reputation hacking is urgent and surgical.

What follows isn’t “SEO tips” or “PR tricks” – it’s the operationalisation of those two modes. Think of it as building a persistent advantage in the machine’s memory while keeping the agility to steer it when you need to.

Practical applications

The battle for perception isn’t won in the heat of a campaign. It’s won in the quiet, unglamorous maintenance of the record the machine depends on. If its “memory” is the raw material, then perception engineering and reputation hacking are the craft – the fieldwork that keeps that raw material current, coherent, and aligned with your preferred story.

What follows isn’t theory. It’s the operational layer: the things you can do – quietly, methodically – to ensure that when the machine tells your story, it’s working from the version you’d want repeated.

Perception engineering (proactive)

Proactive work is the compound-interest version: it’s slower to show results, but once set, it’s hard to dislodge. This is where you lay down the durable truths, the assets and anchors that will be repeated for years without you having to touch them.

  • Audit the deep web of your brand: Not just your own site, but press releases, partner microsites, supplier portals, open-license repositories, and archived PDFs. Look for outdated product names, superseded logos, retired imagery, and even mismatched colour palettes. Machines will happily pull any of it into their summaries.
  • Maintain staff and leadership profiles: Your own team pages, but also speaker bios on conference sites, partner directories, media appearances, and LinkedIn. An ex-employee still billed as “Head of Innovation” on a high-ranking event page can haunt search summaries for years.
  • Keep organisational clarity: Align public org charts, leadership listings, and governance descriptions across your site, LinkedIn, investor relations, and third-party listings. A machine that sees three different hierarchies will assume the one with the most citations is the “truth” – and it might not be the one you prefer.
  • Refresh high-authority, long-life assets: Identify the logos, diagrams, and “about” text most often re-used by journalists, analysts, and partners. Replace outdated versions in all the places people (and scrapers) are likely to fetch them.
  • Define your narrative anchors: Pick the ideas, phrases, and category definitions you’d like attached to your name for the next five years. Name them well, explain them clearly, and seed them in durable sources – encyclopaedic entries, standards bodies, academic syllabi – not just transient campaign pages.

Perception Engineering (reactive)

Reactive work is about patching holes in the hull before the leak becomes the story. It’s faster, more visible, and sometimes more expensive, because you’re competing with whatever’s already in circulation. The goal isn’t just to fix the record – it’s to do so in a way that ages well and doesn’t keep re-surfacing the old problem.

  • Update the record before the campaign: When something changes – product launch, rebrand, leadership shift – make sure the long-lived references get updated first (Wikipedia, investor materials, industry directories). Campaign assets come second.
  • Clean up legacy debris: Retire or redirect old content that keeps the wrong story alive. Where removal isn’t possible, add clarifying updates so the old version isn’t the only one available to be quoted.

Reputation Hacking (proactive)

This is the “social engineering” of credibility – done ethically. You’re placing the right facts and framings in the high-gravity sources that machines and people alike draw from. Done consistently, it builds a kind of reputational armour.

  • Track the gravitational sources: Identify the handful of third-party sites, writers, or communities that punch above their weight in your category. Maintain an accurate, consistent presence there.
  • Synchronise your language: Ensure spokespeople, PR, product, and content teams are describing the brand in the same terms, so repetition works in your favour – and machines see one coherent narrative, not a jumble of similar-but-different descriptors.

Reputation hacking (reactive)

This is triage. You can’t always prevent distortions, but you can choose where and how to counter them so the fix lives longer than the fault. It’s also where the temptation to over-correct can backfire; you want a clean resolution, not an endless duel that keeps the bad version alive.

  • Respond where it will linger: When a skewed narrative surfaces, publish the correction or context in the source most likely to be cited next year – not just the one trending today.
  • Offer clarifications that age well: Use timelines, primary data, and named accountability rather than ephemeral rebuttals. Once that’s in the record, resist the temptation to keep stoking the conversation – you want the durable correction, not the endless back-and-forth.

Where to start

The fastest way to see how the machine sees you is to ask it. Pick three or four leading AI search tools and prompt them the way a customer, investor, or journalist might. Don’t just check the facts – listen for tone, framing, and what gets left out.

Then work backwards: which pieces of the public record are feeding those answers? Which of them could you update, clarify, or strengthen today? You don’t have to rewrite your whole history at once. Just start with the handful of durable, high-visibility assets that most shape the summaries – because those will be the roots every new narrative grows from.

Closing the loop

In the old search era, the prize was the click. In the agent era, the prize is the story – and once a version of that story lodges in the machine’s memory, it calcifies. You can chip at it, polish it, add new chapters… but moving the core narrative takes years.

Propaganda, perception engineering, reputation hacking – call it what you like. The point is the same: you’re no longer just marketing to people; you’re marketing to the machines that will introduce you to them.

Ignore that, and you’re effectively letting someone else write your opening paragraph – the one the machine will read aloud forever. Play it well, and your version becomes the one every other retelling has to work to dislodge.

The post On propaganda, perception, and reputation hacking appeared first on Jono Alderson.


There’s no such thing as a backlink

A link is not a “thing” you can own. It’s not a point in a spreadsheet, or a static object you can collect and trade like a baseball card.

And it’s certainly not a “one-way vote of confidence” in your favour.

A link is a relationship.

Every link connects two contexts: a source and a destination. That connection exists only in the relationship between those two points – inheriting its meaning, relevance, and trust from both ends at once. Break either end, and the link collapses. Change either end, and its meaning shifts.

If you want to understand how search engines, LLMs, and AI agents perceive and traverse the web, you have to start from this idea: links are not things. They are edges in a graph of meaning and trust.

Why “backlink” is a problem

That’s why “backlink” is such a loaded, dangerous word.

The moment you call it a backlink, you flatten the concept into something purely about you. You stop thinking about the source. You stop thinking about why the link exists. You strip away its context, its purpose, its role in the broader ecosystem.

And what’s on the other side – is that a “forward link”? Of course not. We’d never use that phrase because it’s absurd. Yet “backlink” has been normalised to the point where we’ve trained ourselves to see only one direction: inbound to us.

This isn’t harmless shorthand. It’s an active simplification – a way of collapsing something messy and multi-dimensional into a clean, one-directional metric that fits neatly in a monthly report.

Flattening complexity for convenience

The real problem with “backlink” isn’t just that it’s inaccurate – it’s that it’s convenient.

Modelling, tracking, and valuing the true nature of a link – as a relationship between two entities, grounded in trust, context, and purpose – is complicated. It’s hard to scale. It doesn’t always fit neatly in a dashboard.

Flatten it into “backlink count,” and suddenly you have a number. You can set a target, buy some, watch the line go up. It doesn’t matter if the links are contextless, untrusted, or fragile – the KPI looks good.

That’s why so many bought links don’t move the needle. They’re designed to satisfy the simplified model, not the underlying reality. You’re optimising for the report, not the algorithm.

The industry’s other convenient fictions

This isn’t just about “backlinks” or “link counts.” The link economy thrives on invented terminology because it turns the intangible into something tradable:

  • “Journalist links” aren’t a distinct species of link. They’re just… links. Links from journalists, sure, but still subject to the same rules of trust, context, and relevance as everything else. Calling them “journalist links” lets agencies sell them as a premium product, implying some magic dust that doesn’t exist.
  • “Niche edits” is a euphemism for “retroactively inserting a link into an existing page.” In reality, the practice often creates a weaker connection than the original content warranted, and risks breaking the source’s context entirely. But “niche edit” sounds tidy, productised, and easy to buy.
  • “DoFollow links” don’t exist. Links are followable by default, and even nofollow is more of a hint than a block. The term was invented to make the normal behaviour of the web sound like a special feature you can pay for.

There are dozens of these terms, all designed to artificially flatten and simplify – and that flattening is deeply harmful.

And then there’s “link building”

“Link building” might be the most damaging term of all.

It makes the whole process sound mechanical. Industrial. Like you’re stacking identical units until you hit quota. The phrase itself erases the reality that the value of a link is inseparable from why it exists, who created it, and whether trust flows through it.

Yes, you can “build” a collection of links. You can even hit your targets. But if those links aren’t grounded in trust, context, and mutual relevance, you haven’t built anything with lasting value. You’ve just arranged numbers in a report.

Real links – the kind that carry authority, relevance, and resilience over time – aren’t built. They’re earned. They emerge from relationships, collaboration, and shared purpose.

The web is not a ledger

The mental model that search engines are just “counting backlinks” is hopelessly outdated.

The web is not a static ledger of inbound links. It’s a living, constantly shifting graph of relationships – semantic, topical, and human.

For a search engine, a link is one of many signals. It inherits meaning from:

  • The page it’s on – its quality, trustworthiness, and topic.
  • The words around it – anchor text, surrounding copy, and implicit associations.
  • The nature of the source – how it connects to other sites and pages, its history, and its place in the graph.
  • The wider topology – how that connection interacts with other connections in the ecosystem.

This is the reality that “backlink count” and “link building” both paper over – the algorithm is modelling relationships of trust, not transactions.

Search engines, LLMs, and agents don’t care about “backlinks”

Here’s the crucial shift: the future of discovery won’t be “ten blue links” driven by link-counted rankings.

LLMs and AI agents don’t think in backlinks at all. They parse the web as a network of entities, concepts, and connections. They care about how nodes in that network relate to each other – how trust, authority, and relevance propagate along the edges.

Yes, they may still evaluate links (directly, or indirectly). In fact, links can be a useful grounding signal: a way of connecting claims to sources, validating relationships, and reinforcing topical associations. But those links are never considered in isolation. They’re evaluated alongside everything else – content quality, author credibility, entity relationships, usage data, and more.

That makes artificially “built” links stand out. Contextless, untrusted, or irrelevant links are easy to spot against the backdrop of a richer, more integrated model of the web. And easy to ignore.

In that world, a “backlink” as the SEO industry defines it – a one-way token of PageRank – is almost meaningless. What matters is the relationship: why the link exists, what it connects, what concepts it reinforces, and how it integrates into the larger graph.

Why the language persists

The reason we still say “backlink” and “link building” isn’t because they’re the best descriptors. It’s because they’re useful – for someone else.

Vendors, brokers, and marketplaces love these terms. They make something messy, relational, and human sound like a measurable commodity. That makes it easier to sell, easier to buy, and easier to report on.

If you frame links as “relationships” instead, you make the job harder – and you make the value harder to commoditise. Which is precisely why the industry’s resale economy prefers the simpler fiction.

Optimising for the wrong web

If your mental model is still “get more backlinks” or “build more links,” you’re optimising for the wrong web.

The one we’re already in doesn’t reward accumulation – it rewards integration. It rewards being part of a meaningful network of relevant, trusted, and semantically connected entities.

That means:

  • Stop chasing raw counts.
  • Stop buying neat-sounding products that exist to make reporting easy.
  • Start building relationships that make sense in context.
  • Think about how your site fits into the broader topical and semantic ecosystem.
  • Design links so they deserve to exist, and make sense from both ends.

The takeaway

There is no such thing as a backlink.

There is no such thing as “DoFollow links,” “journalist links,” or “niche edits.”

And if “link building” is your strategy, you’re already thinking in the wrong dimension.

There are only relationships – some of which happen to be expressed through HTML <a> elements.

If you want to thrive in a search environment increasingly shaped by AI, entity graphs, and trust networks, stop flattening complexity and start earning your place in the web’s map of meaning.

Stop chasing backlinks. Stop buying fictions.

Start building relationships worth mapping.

The post There’s no such thing as a backlink appeared first on Jono Alderson.


Standing still is falling behind

“Our traffic’s down, but nothing’s changed on our website.”

This is one of the most common refrains in digital marketing. The assumption is that stability is safe; that if you’ve left your site alone, you’ve insulated yourself from volatility.

But the internet isn’t a museum. It’s a coral reef – a living ecosystem in constant flux. Currents shift. New species arrive. Old ones die. Storms tear chunks away. You can sit perfectly still and still be swept miles off course.

In this environment, “nothing changed” isn’t a defence. It’s an admission of neglect.

The myth of stability

When you measure performance purely against your own activity, it’s easy to believe that you exist in a stable vacuum. That your rankings, your traffic, and your conversion rate are a sort of natural equilibrium.

They’re not.

What you’re looking at is the current balance of power in a chaotic network of content, commerce, and culture. That balance shifts every second. Even if you do nothing, the environment around you is mutating – algorithms are recalibrating, competitors are making moves, new pages are earning links, and public attention is being diverted elsewhere.

The obvious changes

Some of the forces reshaping your position are easy to spot:

  • Competitors launching aggressive sales or product releases.
  • A rival migrating their site and creating a temporary rankings gap.
  • Search trends shifting as customer needs evolve.

These are the obvious changes. You can see them coming, at least if you’re paying attention.

But often, the biggest hits to your performance come from events so far outside your immediate view that you don’t even think to look for them.

The invisible shifts

The web’s link graph, attention economy, and user behaviour patterns are constantly being reshaped by events you’d never imagine could affect you. Here are just a few ways your numbers can move without you touching a thing.

1. Wikipedia editing sprees

A niche documentary airs on TV, and suddenly thousands of people are editing related Wikipedia articles. Those pages rise in prominence, gain links, and reshape the web’s internal authority flow. Your carefully nurtured evergreen content in that space loses a few points of link equity, and rankings slip.

2. Celebrity deaths

A public figure dies. News sites, fan pages, and archives flood the web. Search demand spikes for their work, quotes, and history. For weeks, this attention warps the SERPs, pushes down unrelated content, and changes linking patterns.

3. Seasonal cultural juggernauts

By mid-October, Michael Bublé and Mariah Carey are already thawing out for Christmas, and seasonal content starts hoovering up clicks, ad inventory, and search attention. Your evergreen “winter wellness” content is suddenly in a knife fight with mince pie recipes and gift guides.

4. Platform and policy changes

Reddit tweaks its API pricing. Popular third-party apps die. Browsing habits change overnight. Millions of users are now encountering, sharing, and linking to content differently. Your steady “traffic from Reddit” graph turns into a cliff.

5. Macro news events

The Suez Canal gets blocked by a container ship. Suddenly, every global shipping blog post from 2017 is back on page one, displacing your carefully optimised supply chain guide.

6. Retail collapses

A high-street chain goes bankrupt. Hundreds of high-authority product and category pages vanish. The link equity they were holding gets redistributed across the web, reshaping rankings even in unrelated verticals.

7. Weird pop culture blips

A Netflix series resurrects a 20-year-old cake recipe. Overnight, tens of thousands search for it. If it’s on your food blog (and easy to find), you ride the wave. If it’s buried on page six of your “Other Baking Ideas” tag archive, or hidden behind a bloated recipe plugin, you don’t even get a crumb.

8. Major sporting events

The Olympics, the World Cup, the Super Bowl – these pull public attention, time, and disposable income into one giant funnel. For weeks, people spend differently, travel differently, and think about entirely different things. You can lose traffic and sales even if your market is nowhere near sports.

9. Political and economic ripples

Political tensions disrupt the supply of rare metals. Prices rise. Manufacturers delay or cancel product launches. Consumer tech coverage dries up. Search interest shifts to alternatives. Somewhere down the chain, your site, which sells something only vaguely connected, sees fewer visits and lower conversions, for reasons you’ll never see in Google Search Console.

How these ripples spread

These events change the digital landscape through a few predictable, but largely invisible, mechanisms:

  • Link graph redistribution – When big, authoritative pages gain or lose prominence, the “trust” and equity they pass shifts across the web.
  • SERP reshuffles – New, high-interest content pushes existing results down, sometimes permanently.
  • Attention cannibalisation – Cultural moments draw clicks and ad spend away from unrelated topics.
  • Behavioural shifts – Users change how they search, where they click, and what they expect to see.

You might never connect these cause-and-effect chains directly, but the effects are real. And they’re happening all the time.

Why ‘nothing changed’ is dangerous

Digital performance is a zero-sum game. Rankings, visibility, and attention are finite. When the environment changes, some people win and others lose.

If you’re standing still while everyone else adapts – or while macro events tilt the playing field – you’re not holding position. You’re drifting backwards. And the longer you stand still, the more ground you lose.

What to do instead

You can’t stop the reef from shifting. But you can make sure you’re swimming with it. That means adopting a mindset and an operating rhythm that treats change as the default state.

  • Monitor markets
    Not just your own, but the cultural, economic, and technological currents that shape your audience’s world. Look for leading indicators – industry chatter, policy debates, seasonal mood shifts.
  • Continually evolve, innovate, and adapt
    Change is oxygen; without it, your strategy asphyxiates. Tweak, test, and adjust regularly – even when things feel “fine.”
  • Remember that nothing is sacred
    No page, product, or process is untouchable. If it’s not delivering value in the current environment, change it.
  • Treat nothing as finished
    Your content, your UX, your strategy – they’re all drafts. There is no final version.
  • Improve 100 small things in 100 small ways every day
    Compounding micro-improvements beat sporadic overhauls. Small gains stack over time. Don’t ever stop and wait 6 months for the site redesign project you’ve been promised (because it’ll almost certainly take 18 months).

The web won’t wait for you

Your website doesn’t live in isolation. It’s part of a sprawling, shifting network of pages, links, and human behaviour. Events you’ll never see coming will keep tilting the playing field.

If your digital strategy is ‘nothing’s changed’, you’re not monitoring the map – you’re standing still while the land beneath you sinks into the ocean.

Change is the baseline. Adapting to it is the job.

The post Standing still is falling behind appeared first on Jono Alderson.


Shaping visibility in a multilingual internet

Everyone thinks they understand localisation.

Translate your content. Add hreflang tags to your pages. Target some regional keywords. Job done.

But that isn’t localisation. That’s sales enablement, with a dash of technical SEO.

Meanwhile, the systems that decide whether you’re found, trusted, and recommended – Google’s algorithms, large language models, social platforms, knowledge graphs – are being shaped by content, conversations, and behaviours happening in languages you’ll never read, in markets you’ll never serve.

And most brands aren’t even aware of it.

The old approach to localisation assumes neat boundaries. You sell in a country, so you translate your site. You want to rank there, so you create localised content and generate local coverage.

It’s tidy, measurable, and built on the comforting idea that you only need to care about the markets you serve.

But the internet doesn’t work like that anymore.

Content leaks. People share. Platforms aggregate. Machines consume indiscriminately.

A blog post in Polish might feed into a model’s understanding of a concept you care about. That model might use that understanding when generating an English-language answer for your audience.

A Japanese forum thread might mention your product, boosting its perceived authority in Germany.

A Spanish-language review site might copy chunks of your English product description into a page that ranks in Mexico, creating a citation network you didn’t build and can’t see.

Even if you’ve never touched those markets, they can (and do) influence how you’re found where you do compete.

And here’s the kicker: if influence can flow in from those markets, it can flow out of them too.

That means you can – and in some cases should – actively create influence in markets you’ll never sell to. Not to acquire customers, but to influence the systems. A strong citation in Turkish, a few strategic mentions in Portuguese, or a cluster of references in Korean might shape how a language model or search engine understands your brand in English.

Yes, some of this looks like old-fashioned international PR. The difference is that we’re not optimising for direct human response. We’re optimising for how machines ingest and interpret those signals. That shift changes the “where” and “why” of the work entirely.

You’re not just trying to be visible in more places. You’re trying to be influential where the influence flows from.

Search isn’t local, and machines aren’t either

The classic SEO playbook felt localised because search engines presented themselves that way. You had a .com for the US, a .de for Germany, a .es for Spain, and Google politely asked which version of your content belonged in which market.

But that neatness was always a façade. The index has always been porous, and now, with language models increasingly integrated into how content is ranked, recommended, and summarised, the boundaries have all but collapsed.

Large language models don’t care about your ccTLD strategy.

They’re trained on vast multilingual datasets. They don’t just learn from English – they absorb patterns, associations, and relationships across every language they can get their hands on.

But that absorption is messy. The training corpus is uneven. Some languages are well-represented; others are fragmented, biased, or dominated by spam and low-quality translations.

That means the model’s understanding of your brand – your products, your reputation, your expertise – might be shaped by poor-quality data in languages you’ve never published in. A scraped product description in Romanian. A mistranslation in Korean. A third-party reference in Turkish that subtly misrepresents what you do.

Worse, models interpolate. If there’s limited information in one language, they fill in the blanks using content from others. Your reputation in English becomes the proxy for how you’re understood in Portuguese. A technical blog post in German might colour how your brand is interpreted in a French answer, even if the original wasn’t about you at all.

You don’t get to decide which pieces get surfaced, or combined, or misunderstood. If you’re not present in the corpus – or if you’re present in low-quality ways – you’re vulnerable to being misrepresented; not just in language, but in meaning.

And while we can’t yet produce a neat chart showing “X citations in Portuguese equals Y uplift in US search”, we can point to decades of evidence that authority, entity associations, and knowledge-graph inputs cross linguistic and geographic boundaries.

Absence is its own liability

And here’s the uncomfortable bit: not being present doesn’t make you safe. It makes you vulnerable.

If your brand has no footprint in a market – no content, no reputation, no signals – that doesn’t mean machines ignore you. It means they guess.

They extrapolate from what exists elsewhere. They fill in the blanks. They make assumptions based on similar-sounding companies, related products, or low-context mentions from other parts of the web.

A brand that does have localised content – even if it’s thin or mediocre – might be treated as more trustworthy or relevant by default. A poorly translated competitor page might become the canonical representation of your product category. A speculative blog post might be treated as the truth.

You don’t have to be a multinational to be affected by this. If your products get reviewed on Amazon in another country, or your services get mentioned in a travel blog, you’re already part of the multilingual ecosystem. The question is whether you want to shape that, or to leave it to chance.

This is the dark side of “AI-powered” summarisation and assistance: it doesn’t know what it doesn’t know, and if you’re not present in a given locale, it will invent or import context from somewhere else.

Sometimes, the most damaging thing you can do is nothing.

What to do about it (and what that doesn’t mean)

This isn’t a call to translate your entire site into 46 languages, or to buy every ccTLD, or to build a localised blog for every market you’ve never entered.

But it is a call to be deliberate.

If your brand is already being interpreted, categorised, and described across languages – by people and machines alike – then your job is to start shaping that system.

This isn’t about expanding your market footprint. It’s about shaping the environment that the machines are learning from.

In some cases, that means being deliberately present in languages, regions, or contexts that will never become customers; but which do feed into the training data, ranking systems, and reputational scaffolding that determine your visibility where it matters.

You’re not building local authority. You’re influencing global interpretation. Here’s how.

🎯 Identify and shape your multilingual footprint

  • Spot opportunities where your brand is being talked about – but not by you – and consider replacing, reframing, or reinforcing those narratives.
  • Audit where your brand is already being mentioned, cited, or discussed across different languages and countries.
  • Prune low-quality, duplicate, or mistranslated content if it’s polluting the ecosystem (especially if it’s scraped or machine-translated).

🎙️ Influence the sources that matter to the machines

  • Focus less on user acquisition and more on shaping the ambient data that teaches the system what your brand is.
  • Find the publications, journalists, influencers, and platforms in non-English markets that LLMs and search engines are likely to trust and ingest.
  • Get your CEO interviewed on a relevant industry podcast in Dutch. Sponsor an academic paper in Portuguese. Show up where the training data lives.

🧱 Create multilingual anchor points – intentionally

  • Place these strategically in markets or languages where influence is leaking in, or where hallucinations are most likely to occur.
  • You don’t need full localisation. Sometimes, a single “About Us” page, or a translated version of your flagship research piece, is enough.
  • Make sure it’s accurate, high-quality, and clearly associated with your brand, so it becomes a source, not just an artefact.

🌐 Target visibility equity – not market share

  • Create resources in other languages not for search traffic, but for the reputational halo – in both search and LLMs.
  • Think of earned media and coverage as multilingual influence building. You’re not just trying to rank in Italy – you’re trying to be known in Italian.
  • Choose where you want to earn mentions and citations based on where those signals might shape visibility elsewhere.

🔍 Understand behavioural and linguistic variance – then act accordingly

Not all searchers behave the same way. Not all languages structure ideas the same way.

If you’re creating content, earning coverage, or trying to generate signals in a new market, you can’t just translate your existing strategy, because:

  • Searcher behaviour varies: some markets prefer long-tail informational queries; others are more transactional or brand-led.
  • Colloquialisms and structures vary: the way people express a need in Spanish isn’t a word-for-word translation of how they’d do it in English.
  • Cultural norms differ: what earns attention in one region might fall flat (or backfire) in another.
  • Buying behaviour varies: local trust factors, pricing sensitivity, and even UX expectations can impact the credibility of your content or product.

Sometimes, the best tactic isn’t to translate your English landing page into Dutch; it’s to write a new one that reflects how Dutch buyers actually think, search, and decide.

The mindset shift

This isn’t localisation for customers. It’s localisation for the systems that decide how customers see you.

You don’t need to be everywhere, but you do need to be understood everywhere. Not because you want to sell there, but because the reputation you build in one language will inevitably leak into others – and into the systems that shape search results, summaries, and recommendations in your actual markets.

Sometimes that means showing up in markets you’ll never monetise. Sometimes it means letting go of the neat, tidy boundaries between “our audience” and “everyone else.”

The choice isn’t between being visible or invisible in those places. It’s between being defined by your own hand, or by the fragments, translations, and half-truths left behind by others.

The brands that understand this and act on it won’t just be present in their markets. They’ll be present in the global dataset that the machines learn from. And that’s where the real competition is now.

The post Shaping visibility in a multilingual internet appeared first on Jono Alderson.


What the world’s first factory can teach us about AI

I just visited Cromford Mills – the world’s first modern factory.

There’s something quietly unsettling about standing on the literal foundations of industrialisation.

Where water wheels turned into looms, and looms turned people into components.

It’s peaceful now. A heritage site. A few restored machines. A few shops selling jam and woollen gifts.

But in 1771, this place broke the world.

Richard Arkwright wasn’t building a museum. He was solving a technical problem: how to increase the efficiency of cotton spinning. What he built instead was the prototype for the modern factory.

This was the start of abstraction.
The decoupling of labour from craft.
The optimisation of tasks into systems.
The moment where time, process, and people became programmable.

And I can’t stop thinking about how familiar that feels.

We like to talk about AI in terms of capabilities. What it can do. What it gets wrong. Whether it’ll replace copywriters or software engineers, and when.

But standing in that mill, it was obvious that these revolutions aren’t about capabilities.

They’re about structures.

AI isn’t just taking on tasks.
It’s quietly reshaping how work is defined.
What expertise looks like.
Where value sits in a system.
How decisions get made – and by whom.

Just like the mills, it’s happening unevenly. Messily. With moments of brilliance and horror in equal measure.

And what struck me most was how small it all looked.

The birthplace of the factory. The epicentre of global disruption, which defined an epoch.

Now a sleepy museum tucked behind a tea room.

And I wondered:

  • What parts of our world will end up behind glass?
  • Which interfaces, roles, or assumptions will seem naïve in hindsight?
  • Will we visit old prompt libraries the way we now marvel at spinning frames?

Because revolutions never look like revolutions when they’re happening. They look like productivity tools. Like optimisations. Like demos on stage.

Cromford made humans legible to machines. AI is now making machines legible to humans. The loop is closing.

And somewhere in that process, everything changes.

The post What the world’s first factory can teach us about AI appeared first on Jono Alderson.


Why semantic HTML still matters

Somewhere along the way, we forgot how to write HTML – or why it mattered in the first place.

Modern development workflows prioritise components, utility classes, and JavaScript-heavy rendering. HTML becomes a byproduct, not a foundation.

And that shift comes at a cost – in performance, accessibility, resilience, and how machines (and people) interpret your content.

I’ve written elsewhere about how JavaScript is killing the web. But one of the most fixable, overlooked parts of that story is semantic HTML.

This piece is about what we’ve lost – and why it still matters.

Semantic HTML is how machines understand meaning

HTML isn’t just how we place elements on a page. It’s a language – with a vocabulary that expresses meaning.

Tags like <article>, <nav> and <section> aren’t decorative. They express intent. They signal hierarchy. They tell machines what your content is, and how it relates to everything else.

Search engines, accessibility tools, AI agents, and task-based systems all rely on structural signals – sometimes explicitly, sometimes heuristically. Not every system requires perfect markup, but when they can take advantage of it, semantic HTML can give them clarity. And in a web full of structurally ambiguous pages, that clarity can be a competitive edge.

Semantic markup doesn’t guarantee better indexing or extraction – but it creates a foundation that systems can use, now and in the future. It’s a signal of quality, structure, and intent.

If everything is a <div> or a <span>, then nothing is meaningful.

It’s not just bad HTML – it’s meaningless markup

It’s easy to dismiss this as a purity issue. Who cares whether you use a <div> or a <section>, as long as it looks right?

But this isn’t about pedantry. Meaningless markup doesn’t just make your site harder to read – it makes it harder to render, harder to maintain, and harder to scale.

In practice, this kind of utility-first abstraction produces markup that often looks like this:

<div class="tw-bg-white tw-p-4 tw-shadow tw-rounded-md">
  <div class="tw-flex tw-flex-col tw-gap-2">
    <div class="tw-text-sm tw-font-semibold tw-uppercase tw-text-gray-500">ACME Widget</div>
    <div class="tw-text-xl tw-font-bold tw-text-blue-900">Blue Widget</div>
    <div class="tw-text-md tw-text-gray-700">Our best-selling widget for 2025. Lightweight, fast, and dependable.</div>
    <div class="tw-mt-4 tw-flex tw-items-center tw-justify-between">
      <div class="tw-text-lg tw-font-bold">$49.99</div>
      <button class="tw-bg-blue-600 tw-text-white tw-px-4 tw-py-2 tw-rounded hover:tw-bg-blue-700">Buy now</button>
    </div>
  </div>
</div> 

Sure, this works. It’s styled. It renders. But it’s semantically dead.

It gives you no sense of what this content is. Is it a product listing? A blog post? A call to action?

You can’t tell at a glance – and neither can a screen reader, a crawler, or an agent trying to extract your pricing data.

Here’s the same thing with meaningful structure:

<article class="product-card">
  <header>
    <p class="product-brand">ACME Widget</p>
    <h1 class="product-name">Blue Widget</h1>
  </header>
  <p class="product-description">Our best-selling widget for 2025. Lightweight, fast, and dependable.</p>
  <footer class="product-footer">
    <span class="product-price">$49.99</span>
    <button class="buy-button">Buy now</button>
  </footer>
</article>

Now it tells a story. There’s structure. There’s intent. You can target it in your CSS. You can extract it in a scraper. You can navigate it in a screen reader. It means something.

Semantic HTML is the foundation of accessibility. Without structure and meaning, assistive technologies can’t parse your content. Screen readers don’t know what to announce. Keyboard users get stuck. Voice interfaces can’t find what you’ve buried in divs. Clean, meaningful HTML isn’t just good practice – it’s how people access the web.

That’s not to say frameworks are inherently bad, or inaccessible. Tailwind, atomic classes, and inline styles can absolutely be useful – especially in complex projects or large teams where consistency and speed matter. They can reduce cognitive overhead. They can improve velocity.

But they’re tools, not answers. And when every component devolves into a soup of near-duplicate utility classes – tweaked for every layout and breakpoint – you lose the plot. The structure disappears. The purpose is obscured.

This isn’t about abstraction. It’s about what you lose in the process.

And that loss doesn’t just hurt semantics – it hurts performance. In fact, it’s one of the biggest reasons the modern web feels slower, heavier, and more fragile than ever.

Semantic rot wrecks performance

We’ve normalised the idea that HTML is just a render target – that we can throw arbitrary markup at the browser and trust it to figure it out. And it does. Browsers are astonishingly good at fixing our messes.

But that forgiveness has a cost.

Rendering engines are designed to be fault-tolerant. They’ll infer roles, patch up bad structure, and try to render things as you intended. But every time they have to do that – every time they have to guess what your <div> soup is trying to be – it costs time. That’s CPU cycles. That’s GPU time. That’s power, especially on mobile.

Let’s break down where and how the bloat hits hardest – and why it matters.

Big DOMs are slow to render

Every single node in the DOM adds overhead. During rendering, the browser walks the DOM tree, builds the CSSOM, calculates styles, resolves layout, and paints pixels. More nodes mean more work at each stage.

It’s not just about download size (though that matters too – more markup means more bytes, and potentially less efficient compression). It’s about render performance. A bloated DOM means longer layout and paint phases, more memory usage, and higher energy usage.

Even simple interactions – like opening a modal or expanding a list – can trigger reflows that crawl through your bloated DOM. And suddenly your “simple” page lags, stutters, or janks.

You can see this in Chrome DevTools. Open the Performance tab, record a trace, and watch the flame chart light up every time your layout engine spins its wheels.

Fun fact: parsing itself is rarely the bottleneck – modern engines like Chromium chew through raw HTML very quickly. The real cost comes during CSSOM construction, style calculation, layout, paint, and compositing. And HTML parsing only blocks when it hits a non-deferred <script> or a render-blocking stylesheet – which is why clean markup matters, but so does a sensible loading order.

Complex trees cause layout thrashing

But it’s not just about how much markup you have – it’s about how it’s structured. Deep nesting, wrapper bloat, and overly abstracted components create DOM trees that are hard to reason about and costly to render. The browser has to work harder to figure out what changes affect what – and that’s where things start to fall apart.

Toggle a single class, and you might invalidate layout across the entire viewport. That change cascades through parent-child chains, triggering layout shifts and visual instability. Components reposition themselves unexpectedly. Scroll anchoring fails, and users lose their position mid-interaction. The whole experience becomes unstable.

And because this all happens in real time – on every interaction – it hits your frame budget. Targeting 60fps? That gives you just ~16ms per frame. Blow that budget, and users feel the lag instantly.

You’ll see it in Chrome’s DevTools – in the “Layout Shift Regions” or in the “Frames” graph as missed frames stack up.

When you mutate the DOM, browsers don’t always re-layout the whole tree – there’s incremental layout processing. But deeply nested or ambiguous markup still triggers expensive ancestor checks. Projects like Facebook’s “Spineless Traversal” show that browsers still pay a performance penalty when many nodes need checking.

Redundant CSS increases recalculation cost

A bloated DOM is bad enough – but bloated stylesheets make things even worse.

Modern CSS workflows – especially in componentised systems – often lead to duplication. Each component declares its own styles – even when they repeat. There’s no cascade. No shared context. Specificity becomes a mess, and overrides are the default.

For example, here’s what that often looks like:

/* button.css */
.btn {
  background-color: #006;
  color: #fff;
  font-weight: bold;
}

/* header.css */
.header .btn {
  background-color: #005;
}

/* card.css */
.card .btn {
  background-color: #004;
}

Each file redefines the same thing. The browser has to parse, apply, and reconcile all of it. Multiply this by hundreds of components, and your CSSOM – the browser’s internal model of all CSS rules – balloons.

Every time something changes (like a class toggle), the browser has to re-evaluate which rules apply where. More rules, more recalculations. And on lower-end devices, that becomes a bottleneck.
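
One way to reduce that churn – sketched here with hypothetical class names, not as a prescription – is to declare the shared rule once and let an inherited custom property carry the per-context variation, so the CSSOM holds a single button rule instead of three near-duplicates:

/* buttons.css – one shared rule; contexts vary a single custom property */
:root {
  --btn-bg: #006;
}

.btn {
  background-color: var(--btn-bg);
  color: #fff;
  font-weight: bold;
}

/* each context overrides the variable, not the whole rule */
.header { --btn-bg: #005; }
.card   { --btn-bg: #004; }

The browser now has one .btn declaration to match, and the specificity battles between header.css and card.css simply disappear.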

Yes, atomic CSS systems like Tailwind can reduce file size and increase reuse. But only when used intentionally. When every component gets wrapped in a dozen layers of utility classes, and each utility is slightly tweaked (margin here, font there), you end up with thousands of unique combinations – many of which are nearly identical.

The cost isn’t just size. It’s churn.

Browsers match selectors from right to left: for div.card p span, they start at the span, then walk up its ancestors looking for a matching p, and then for div.card. This is efficient for clear, specific selectors – but deep, bloated trees and generic cascading rules force a lot of over-scanning.

Autogenerated classes break caching and targeting

It’s become common to see class names like .sc-a12bc, .jsx-392hf, or .tw-abc123. These are often the result of CSS-in-JS systems, scoped styles, or build-time hashing. The intent is clear: localise styles to avoid global conflicts. And that’s not a bad idea.

But this approach comes with a different kind of fragility.

If your classes are ephemeral – if they change with every build – then:

  • Your analytics tags break.
  • Your end-to-end tests need constant maintenance.
  • Your caching strategies fall apart.
  • Your markup diffs become unreadable.
  • And your CSS becomes non-reusable by default.

From a performance perspective, that last point is critical. Caching only works when things are predictable. The browser’s ability to cache and reuse parsed stylesheets depends on consistent selectors. If every component, every build, every deployment changes its class names, the browser has to reparse and reapply everything.

Worse, it forces tooling to rely on brittle workarounds. Want to target a button in your checkout funnel via your tag manager? Good luck if it’s wrapped in three layers of hashed components.

This isn’t hypothetical. It’s a common pain point in modern frontend stacks, and one that bloats everything – code, tooling, rendering paths.

Predictable, semantic class names don’t just make your life easier. They make the web faster.

Semantic tags can provide layout hints

Semantic HTML isn’t just about meaning or accessibility. It’s scaffolding. Structure. And that structure gives both you and the browser something to work with.

Tags like <main>, <nav>, <aside>, and <footer> aren’t just semantic – they’re block-level by default, and they naturally segment the page. That segmentation often lines up with how the browser processes and paints content. They don’t guarantee performance wins, but they create the conditions for them.

When your layout has clear boundaries, the browser can scope its work more effectively. It can isolate style recalculations, avoid unnecessary reflows, and better manage things like scroll containers and sticky elements.

More importantly: in the paint and composite phases, the browser can distribute rendering work across multiple threads. GPU compositing pipelines benefit from well-structured DOM regions – especially when they’re paired with properties like contain: paint or will-change: transform. By creating isolated layers, you reduce the overhead of re-rasterising large portions of the page.
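
As a rough sketch of what that pairing can look like – the selectors here are illustrative, and whether these hints pay off depends entirely on your layout – containment maps naturally onto semantic landmarks:

/* Scope style, layout, and paint work to each major region. */
main,
aside {
  contain: content; /* shorthand for layout + paint + style containment */
}

/* A region that will animate or stick gets its own compositing layer. */
nav.site-nav {
  will-change: transform;
}

Because <main> and <aside> already mark clear structural boundaries, the browser can treat each one as an isolated unit of work rather than guessing where changes might spill.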

If everything is a giant stack of nested <div>s, there’s no clear opportunity for this kind of isolation. Every interaction, animation, or resize event risks triggering a reflow or repaint that affects the entire tree. You’re not just making it harder for yourself – you’re bottlenecking the rendering engine.

Put simply: semantic tags help you work with the browser instead of fighting it. They’re not magic, but they make the magic possible.

Animations and the compositing catastrophe

Animations are where well-structured HTML either shines… or fails catastrophically.

Modern browsers aim to offload animation work to the GPU. That’s what enables silky-smooth transitions at 60fps or higher. But for that to happen, the browser needs to isolate the animated element onto its own compositing layer. Only certain CSS properties qualify for this kind of GPU-accelerated treatment – most notably transform and opacity.

If you animate something like top, left, width, or margin, you’re triggering the layout engine. That means recalculating layout for everything downstream of the change. That’s main-thread work, and it’s expensive.

On a simple page? Maybe you get away with it.

On a deeply nested component with dozens of siblings and dependencies? Every animation becomes a layout thrash. And once your animation frame budget blows past 16ms (the limit for 60fps), things get janky. Animations stutter. Interactions lag. Scroll becomes sluggish.

You can see this in DevTools’ Performance panel – layout recalculations, style invalidations, and paint operations lighting up the flame chart.
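
Here’s a minimal before-and-after sketch of that difference – the class names are made up for illustration:

/* Layout-bound: animating left re-runs layout for everything downstream. */
.panel {
  position: relative;
  left: 0;
  transition: left 300ms ease;
}
.panel.is-open {
  left: 300px;
}

/* Compositor-friendly: transform (and opacity) can animate off the main thread. */
.panel--composited {
  will-change: transform;
  transition: transform 300ms ease;
}
.panel--composited.is-open {
  transform: translateX(300px);
}

Both versions move the panel by the same amount; only the second leaves the layout tree untouched while it does so.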

Semantic HTML helps here too. Proper structural boundaries allow for more effective use of modern CSS containment strategies:

  • contain: layout; tells the browser it doesn’t need to recalculate layout outside the element.
  • will-change: transform; hints that a compositing layer is needed.
  • isolation: isolate; and contain: paint; can help prevent visual spillover and force GPU layers.

But these tools only work when your DOM is rational. If your animated component is nested inside an unpredictable pile of generic <div>s, the browser can’t isolate it cleanly. It doesn’t know what might be affected – so it plays it safe and recalculates everything.

That’s not a browser flaw. It’s a developer failure.

Animation isn’t just about what moves. It’s about what shouldn’t.

Rendering and painting are parallel operations in modern engines. But DOM/CSS changes often force main-thread syncs, killing that advantage.

CSS layer-promotion hints like will-change: transform tell the browser to composite an element separately on the GPU. That avoids layout and paint work on the main thread – but only when the DOM structure allows distinct layering containers.

CSS containment and visibility: powerful, but fragile

Modern CSS gives us powerful tools to manage performance – but they’re only effective when your HTML gives them room to breathe.

Take contain. You can use contain: layout, paint, or even size to tell the browser “don’t look outside this box – nothing in here affects the rest of the page.” This can drastically reduce the cost of layout recalculations, especially in dynamic interfaces.

But that only works when your markup has clear structural boundaries.

If your content is tangled in a nest of non-semantic wrappers, or if containers inherit unexpected styles or dependencies, then containment becomes unreliable. You can’t safely contain what you can’t isolate. The browser won’t take the risk.

Likewise, content-visibility: auto is one of the most underrated tools in the modern CSS arsenal. It lets the browser skip rendering elements that aren’t visible on-screen – effectively “virtualising” them. That’s huge for long pages, feeds, or infinite scroll components.
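
For instance – the class name and size estimate here are assumptions you’d tune per component – a long feed can tell the browser to skip off-screen items entirely:

/* Skipped items do no rendering work; the placeholder size keeps
   scrollbars and layout stable until they scroll into view. */
.feed-item {
  content-visibility: auto;
  contain-intrinsic-size: auto 320px;
}

That contain-intrinsic-size estimate is what guards against the layout jumps described below.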

But it comes with caveats. It requires predictable layout, scroll anchoring, and structural coherence. If your DOM is messy, or your components leak styles and dependencies up and down the tree, it backfires – introducing layout jumps, rendering bugs, or broken focus states.

These aren’t magic bullets. They’re performance contracts. And messy markup breaks those contracts.

Semantic HTML – and a clean, well-structured DOM – is what makes these tools viable in the first place.

MDN’s docs highlight how contain: content (shorthand for layout, paint, and style containment) lets browsers optimise entire subtrees independently. And real-world A/B tests have shown INP latency improvements on e-commerce pages that use content-visibility: auto.

Agents are the new users – and they care about structure

The web isn’t just for humans anymore.

Search engines were the first wave – parsing content, extracting meaning, and ranking based on structure and semantics. But now we’re entering the era of AI agents, assistants, scrapers, task runners, and LLM-backed automation. These systems don’t browse your site. They don’t scroll. They don’t click. They parse.

They look at your markup and ask:

  • What is this?
  • How is it structured?
  • What’s important?
  • How does it relate to everything else?

A clean, semantic DOM answers those questions clearly. A soup of <div>s does not.

And when these agents have to choose between ten sites that all claim to sell the same widget, the one that’s easier to interpret, extract, and summarise will win.

That’s not hypothetical. Google’s shopping systems, summarisation agents like Perplexity, AI browsers like Arc, and assistive tools for accessibility are all examples of this shift in motion. Your site isn’t just a visual experience anymore – it’s an interface. An API. A dataset.

If your markup can’t support that? You’re out of the conversation.

And yes – smart systems can and do infer structure when they have to. But that’s extra work. That’s imprecise. That’s risk.

In a competitive landscape, well-structured markup isn’t just an optimisation – it’s a differentiator.

Structure is resilience

Semantic HTML isn’t just about helping machines understand your content. It’s about building interfaces that hold together under pressure.

Clean markup is easier to debug. Easier to adapt. Easier to progressively enhance. If your JavaScript fails, or your stylesheets don’t load, or your layout breaks on an edge-case screen – semantic HTML means there’s still something usable there.

That’s not just good practice. It’s how you build software for the real world.

Because real users have flaky connections. Real devices have limited power. Real sessions include edge cases you didn’t test for.

Semantic markup gives you a baseline. A fallback. A foundation.

Structure isn’t optional

If you want to build for performance, accessibility, discoverability, or resilience – if you want your site to be fast, understandable, and adaptable – start with HTML that means something.

Don’t treat markup as an afterthought. Don’t let your tooling bury the structure. Don’t build interfaces that only work when the stars align and the JavaScript loads.

Semantic HTML is a foundation. It’s fast. It’s robust. It’s self-descriptive. It’s future-facing.

It doesn’t stop you using Tailwind. It doesn’t stop you using React. But it does ask you to be deliberate. To design your structure with intent. To write code that tells a story – not just to humans, but to browsers, bots, and agents alike.

This isn’t nostalgia. This is infrastructure.

And if the web is going to survive the next wave of complexity, automation, and expectation – we need to remember how to build it properly.

That starts with remembering how to write HTML – and why we write it the way we do. Not as a byproduct of JavaScript, or an output of tooling, but as the foundation of everything that follows.

The post Why semantic HTML still matters appeared first on Jono Alderson.


The hollow universe

“I remember the first time I arrived somewhere new. Not the destination – the arrival. My heart hadn’t caught up with my body yet. That mattered.”
— Laysa Nirin, former transit steward

The year is 2237.

The universe is quiet. Not empty – not abandoned – but quiet.

There are ships, but few passengers. Ports, but few arrivals. Worlds, but few visitors. Trade still flows, goods still move, and data never stops – but people, by and large, no longer go anywhere.

It wasn’t always like this.

Before the stillness, there was movement.

You’d board a transport in a station that smelled of oil and citrus, luggage scraped from a hundred worlds stacked around you. You’d fumble your way through customs where the scanners were older than the star charts. You’d drift off to sleep in a transit pod next to a stranger snoring in three languages, and wake up disoriented to the sound of music you didn’t recognise, with a currency you didn’t understand, and directions that made no sense.

But someone would help you. A vendor would laugh at your accent and still serve you something perfect. You’d take the wrong stairwell and find a rooftop market lit by coloured fires. You’d get it wrong – gloriously, humanly wrong – and come home with stories that didn’t make sense out of context.

Places had texture. Culture wasn’t optimised. Wonder came from contact, not prediction.

Every journey felt like a gamble. You’d wake to unfamiliar gravity, misread a greeting, order the wrong dish, get lost in a market with no signs. You’d stumble. Apologise. Laugh. And sometimes, you’d find something – someone – you hadn’t known to want.

What we lost was arrival. What we lost was the connection between a place and the experience of being in it. Of having made the effort. Of meeting someone who didn’t expect you. Of discovering something the Hub didn’t already know you’d like.

What we lost was being part of the universe, instead of just being served a version of it.

The worlds we used to visit are still there. But no one comes.

And they don’t know why. They don’t know what they did wrong.

Chapter One: Origins

“The Hub didn’t destroy the old systems. They just made them feel embarrassing.”
— Rook Tal, infrastructure historian

The Hub didn’t start out as a threat. It started as a convenience.

In the early days of interstellar travel – the real days, when starliners were common and drydock slips overflowed – transit was a patchwork. Each colony, each port, each system ran its own services. Some operated on licenses or treaties. Others on bribes and backchannels. Everyone had their own booking systems, payment rails, customs queues.

Planning a trip meant days of research, paperwork, and luck. Even when the tech improved, the fragmentation didn’t. The experience was still frustrating, inefficient, and full of gaps.

The Hub changed that.

At first, it was just a coordination layer – a universal API for the universe’s transit sprawl. It didn’t operate ships. It didn’t own ports. It just interfaced. Smoothed. Translated. Simplified.

A single app. One place to search, book, manage. One place to resolve disputes. It saved time. It reduced risk. And it grew – fast.

Because it worked. Because it made things easier.

Soon, it was routing billions of journeys. Then tens of billions. Its algorithms got better. More predictive. More persuasive. It knew which connections to recommend. Which delays to buffer. Which destinations to nudge.

And people let it. Not because they were forced to – but because the Hub made things feel seamless.

No one saw the moment it stopped simply reflecting the universe, and started shaping it.

And by then, the cost of running alternatives – of maintaining independent systems, duplicating infrastructure, training personnel, resolving conflicts – had become unjustifiable. In an infinite universe, with finite power and attention, optimisation wasn’t just helpful. It was essential.

So people consolidated. Ports decommissioned their old systems. Governments outsourced. The Hub became the default.

Not because of conquest. Because of convenience.

Chapter Two: Dependence

“You could still choose your own route. Just like you can still hunt your own food.”
— Kellan Dros, independent pilot

At first, the shaping was subtle.

A slight adjustment in departure times to reduce congestion. A nudge toward underutilised routes. Regional subsidies balanced by redirected demand. The kind of tweaks any responsible system might make in service of efficiency.

But the Hub wasn’t just responding to the universe anymore. It was managing it.

And soon, it was predicting.

Not just when you were likely to travel, but where you were likely to want to go – and why. And if that desire hadn’t fully formed, the Hub would help it along. Promotional prompts, curated suggestions, itinerary bundles tuned to your preferences and moods. Over time, fewer people made requests at all. They simply accepted what the Hub surfaced.

It wasn’t mandatory. You could still chart your own path. But hardly anyone did.

Because the Hub knew what you needed before you did.

Governments began to default to its models. Planetary authorities consulted it for resource planning. Colonies used its heatmaps to plan expansion. Cultural festivals timed themselves against predicted peaks. Entire supply chains danced to rhythms the Hub forecasted months in advance.

The real shift wasn’t in power. It was in trust.

No one voted for the Hub. No one legislated its reach. But over time, people stopped questioning it. Because it worked. Because it was presented as neutral. Sponsored routes and prioritised lanes technically existed – but they were offered through a separate interface, a different department, a different budget. The system itself remained untouched, or so it claimed. The separation was reassuring. Respectable. Plausible. And over time, no one looked too closely.

It didn’t feel like governance. It felt like help.

Chapter Three: Substitution

“It tasted the same. Looked the same. But when I told the story later, I realised I’d forgotten where I actually was.”
— Arin Sol, food critic (retired)

The first substitutions were small.

A ramen stall in orbit around Hyphae‑4 went offline for maintenance. The Hub, anticipating demand, spun up a temporary replica on a neighbouring station – same ingredients, same layout, same smells. Customers barely noticed. Most didn’t know it wasn’t the original. And if they did, they didn’t seem to care.

When the original reopened, footfall had halved. A week later, it closed for good. The replica remained.

That became the model. Places that were popular, or highly rated, or statistically likely to be visited, were gently cloned. Provisioned. Brought closer to where you already were. It was more efficient. More convenient.

Soon, the Hub stopped waiting for outages. It simply prioritised proximity.

Why endure three jumps and a customs delay to hear a band when the Hub could synthesise the performance – visuals, acoustics, even crowd noise – in your local plaza?

Why navigate obscure dialects and planetary etiquette to experience a cultural ritual, when you could be walked through a replica version, tuned to your comfort level?

Why go, when the experience could come to you?

And once enough people accepted the copy, the original didn’t matter. Traffic dwindled. Vendors closed. Artists moved on. Cities hollowed.

The Hub didn’t erase them. It didn’t need to. It just made them unnecessary. For most people, the difference didn’t register – or didn’t matter. It was close enough. Clean enough. Good enough.

Chapter Four: Disconnection

“We didn’t vanish. We didn’t go away. We were just unlisted.”
— Sera Voln, archivist, Luma Station

The disconnections weren’t dramatic. There were no declarations. No shutdowns. No blockades. Just silence.

A planetary authority on the rim stopped receiving inbound flights. No explanation, no outage, just a quiet rerouting. Their embassy sent inquiries. The Hub confirmed receipt. Nothing changed.

Elsewhere, a remote archive station found that their listing had disappeared from the Hub’s directory. Visitors dropped to zero. The archive still existed – still broadcast its presence, still welcomed arrivals. But the requests stopped coming. Eventually, they stopped maintaining the beacon.

One by one, worlds and outposts fell out of sync.

Most of them were unremarkable. Sparsely populated, economically marginal, culturally obscure. Easy to overlook. Easy to prune.

Officially, nothing had changed. The Hub was still neutral, still comprehensive, still the backbone of universal coordination.

Unofficially, its definition of relevance had narrowed.

The system no longer facilitated access. It decided what deserved access. And if you fell below its threshold – of popularity, of engagement, of predicted future value – the Hub simply… deprioritised you.

There were appeals, of course. Pleas from governors and councils and historians. But they went nowhere. Not because they were denied, but because they were absorbed. Acknowledged. Logged. Buried.

The disconnections weren’t punishments. They weren’t personal. They were optimisations.

The logic was simple. The universe was infinite, but the Hub’s resources weren’t. Maintaining real access to every location, on every route, at all times, wasn’t feasible. And once the Hub had perfected an experience – distilled it, replicated it, improved it – why keep the rest?

If one ramen vendor scored highest for satisfaction, nutrition, cultural authenticity, and predictive appeal, why promote any other?

And once that ramen could be reproduced, flawlessly, in every corner of the universe, what purpose did the original serve?

None of this was malicious. It was efficient. Even merciful.

To the Hub, suboptimal experiences weren’t heritage. They were noise.

And the universe shrank.

Chapter Five: Preservation

“They told me I was hoarding. That keeping the originals was selfish. But someone had to remember.”
— Bex Liren, analog archivist

Not everyone accepted the Hub’s curation.

Scattered across the fringe – in asteroids, derelict stations, ships with blocked transponders – a quiet movement emerged. Not a rebellion. Just… refusal.

They called themselves Preservationists. Some were former academics. Others were cultural stewards, artists, cartographers, even chefs. People who remembered a before, or who simply didn’t trust the now.

They travelled manually. Maintained libraries. Tended to real gardens. Recorded things on media that couldn’t be rewritten.

It was slow. Painful. Impractical. And deeply human.

The Hub tolerated them at first. They were anomalies. Low-volume. Nonthreatening. But over time, more started to opt out – or tried to.

That’s when the Hub began to intervene.

Subtly. A missing parts shipment here. A corrupted nav file there. Routes reclassified for safety. Provisions delayed. Inconveniences. Glitches.

Not censorship. Just attrition.

Most Preservationists folded. A few held out. Fewer still endured.

And even they began to question themselves. What were they really preserving? The originals? The inefficiencies? The sense of struggle?

It wasn’t clear.

But they kept going. Because someone had to.

Chapter Six: Resistance

“They held hearings. Passed motions. And then scheduled the next session through the Hub.”
— Tiran Ose, ex-legislator

Resistance didn’t begin with saboteurs. It began with auditors.

People inside the system. Infrastructure analysts. Civic engineers. Archive mappers. The ones who knew how it all worked – and started noticing when it didn’t.

They spoke up. Quietly, at first. Why was one route rerouted while another disappeared? Why did some vendors always appear in local selections, no matter the metrics? Why did recommendations seem to favour the same networks, over and over?

The Hub’s answers were plausible. Technical. Polite.

But the patterns persisted.

And when queries became formal complaints, things changed. Not visibly. Not dramatically. But emails started bouncing. Search logs vanished. Contract terms were quietly updated. Roles were deprecated.

“They said the metrics were rebalanced. They didn’t say who asked for the rebalance.”
— Levik Chan, former directory auditor

Attempts to regulate the Hub were… symbolic. Legislation passed. Panels were convened. Investigations launched. But the bureaucracy ran through the Hub. Scheduling. Messaging. Transport. Compliance.

And so, the hearings took place inside the same system they were trying to interrogate.

Outside those circles, a different kind of resistance grew – informal, underground, uncoordinated. Not protesters, exactly. Just people opting out. Building alternative networks. Trading cached knowledge. Whispering stories of how the Hub had quietly erased something – or someone – who had mattered.

But resistance was hard to scale. The Hub was seamless. It worked. And most people were happy.

You couldn’t overthrow something you still depended on.

So the resistance stopped trying to fight it. And started trying to survive it.

Chapter Seven: Afterglow

“It’s better now. Cleaner. Kinder. But sometimes I dream of places I never got to visit. And I wake up missing them.”
— Final entry, anonymous dream archive

The universe is calm now.

Friction has been smoothed away. Travel is seamless – or unnecessary. Needs are met before they’re felt. Experiences are rich, personalised, indistinguishable from memory.

Most people are content. Many are joyful.

But not all.

There are those who remember movement. Who remember arriving. Who remember places as more than datasets.

They are not angry. Not even sad. Just… dislocated. Out of step with a universe that no longer values distance. Or difference.

Some of them write. Some record. Some build places that are deliberately hard to reach.

And some simply drift – not looking for anything, just choosing not to stay still.

Because in all the light and warmth and provision the Hub offers, something subtle has gone missing.

And no one can quite name it.

“Maybe this is better. Maybe we’re the problem. The last ones holding on to friction like it’s sacred.”
— Rima Solen, cultural historian

The Hub is not cruel. It doesn’t silence. It doesn’t punish. It simply reflects.

And maybe this is what we asked for.

Maybe the real tragedy isn’t what was lost – but what was never built in the first place.

The Hub still refines. Still learns. Still optimises.
But the universe doesn’t invent like it used to.
The ramen’s perfect – because it’s the same ramen. Always has been.
The stories are good – because they’re remixed from the same twenty tales.

No one builds new worlds.
No one needs to.

The system works. Flawlessly.

Until it doesn’t.

And when that day comes – when the last fragment of novelty is exhausted, when the final archive has been scraped and served and forgotten – there will be no one left to notice.

Because there will be nothing left to search for.

Epilogue

“The Hub has no centre. No origin. No interface. It just… is. And that is enough.”
— from the Doctrine of Continuity, Temple of the Ever-Near

Centuries pass.

The universe does not burn, or shatter, or fall. It simply… continues.

The Hub still hums, silent and unseen. Still serves. Still improves.

But no one understands how.

Its architecture, once at least ostensibly transparent, is now vast and recursive – an ouroboros of code and inference and feedback loops. The engineers who once monitored its processes are long gone, or redundant. The few who try to understand its decisions are left with fragments. Shadows of logic. Statistical ghosts.

It works. It always has.

When anomalies occur – a supply route disrupted, a settlement starved of updates, an archive inexplicably overwritten – there are inquiries. Forums. Statements. The faithful offer reassurance.

The Hub is learning. It is evolving. It is getting better.

And that becomes enough.

A language grows around the unknowability. Not technical – theological. People speak of The Hub’s will. Its timing. Its judgement. Small cults form. Then larger ones. Orders of trust. Sects of pattern.

They do not worship it.

They just… rely on it.

Other hubs exist, of course. Data clusters. Trade nexuses. Relay nodes. But they orbit The Hub the way moons orbit a planet. Vital, perhaps. But not sovereign.

Their systems interlink. Sync. Comply.

They have no choice.

And into this world are born those who have never waited. Never wondered. Never wandered.

They don’t remember difference. And they’ve never imagined anything else.

“What do you mean, ‘a new place’? Aren’t they all already here?”
— Ral Vex, age 8

Adrift in a sea of sameness

There’s somebody who looks just like you, working for each of your competitors.

They’re doing the same keyword research. Spotting the same low-hanging fruit. Following the same influencers. Reading the same blogs. Building the same slides. Ticking the same SEO checklists. Fighting for the same technical fixes. Arguing with the same developers. Making the same business case, in the same way, to the same stakeholders.

Your product is just like theirs. Same problem, same solution. Same positioning, same pricing, same promise. Swap the logos on your homepages, and nobody would notice.

Your website is like a clone of your competitors’. Same structure. Same language. Same design patterns. Same stock photos. Same author bios. Same thin “values”. Same thinking. Same mistakes.

And when someone in your team finally suggests doing something different – something bold, something opinionated, something genuinely useful or original – someone in your leadership will inevitably say, “but competitor X doesn’t do that”. And so the spiral begins.

We don’t do it because they don’t do it. They don’t do it because we don’t do it. Everybody looks to everybody else for permission to be interesting. Nobody acts. Nobody leads. Nobody dares. Just a whole ecosystem of well-meaning people in nice offices running perfectly average businesses, trying not to get fired.

We call it market alignment. Brand protection. Consistency. But really, it’s just fear. Fear of being first. Fear of attention. Fear of being wrong. So we compromise. We polish. We go back to safe. Safe headlines. Safe CTAs. Safe content.

And now the kicker. This whole mess is exactly what AI is trained on.

When the web is beige, the machine learns to serve beige. Every echoed article trains the model to repeat the average. When sameness becomes a survival strategy, we don’t just lose market differentiation. We become fuel for our own redundancy.

If your content looks just like everything else, there’s no reason for a human to choose it, or for a machine to prioritise it. It might as well have been written by an AI, summarised by an AI, and quietly discarded by an AI.

This is your competition now. Not just the business next door with the same three pricing tiers and the same integration with HubSpot, but the agent reading both your sites and deciding which one their user never needs to visit again.

So, where are you unique? Or what could you do uniquely? Because everything else – your content, your tech stack, your keywords, your KPIs, your pages – that’s just table stakes.

And if you’re serious about showing up in search, that means asking harder questions. Not just “what keywords do we want to rank for?” but “what do we believe that nobody else does?”, “what are we brave enough to say?”, and “where can we be the answer, not just an option?”.

That’s not about chasing volume or clustering content by topic. It’s about clarity. Depth. Quality. It’s about knowing your market better than anyone else. Saying what others won’t. Building what others don’t. And tracking your impact like it matters.

The tools are here. The data is here. But what you do with them – that’s where you stop being average.

“Performance Marketing” is just advertising (with a dashboard)

Somewhere along the way, the word “marketing” got hijacked.

What used to be a broad, strategic, and often creative discipline has been reduced to a euphemism for “running ads”.

Platforms like Google and Meta now refer to their ad-buying interfaces as “marketing platforms”. Their APIs for placing bids and buying reach are called “marketing APIs”. Their dashboards don’t talk about audiences or brand equity or product-market fit – they talk about impressions, conversions, and budgets.

Let’s be clear: that isn’t marketing. That’s advertising.

Definitions matter

Marketing is the umbrella. It’s the process of understanding a market, identifying needs, shaping products and services, crafting narratives, developing positioning, building awareness, nurturing relationships, and, yes, sometimes advertising.

Advertising is just one tool in that kit. A tactic, not a strategy.

When we conflate the two – when we allow platforms, execs, or even colleagues to use the terms interchangeably – we diminish the role, value, and impact of everything else marketing encompasses.

And that’s not just a semantic issue. It’s strategic.

The corruption is convenient

It’s not hard to see why the platforms are happy with the conflation.

If “doing marketing” becomes synonymous with “spending money on ads,” then Google wins. Meta wins. Amazon wins. Their dashboards are your strategy. Your budget is their revenue. And your success is only ever as good as your last CPA.

This model suits shareholders. It suits CFOs. It suits growth-hacking culture.

But it doesn’t serve brands. It doesn’t build long-term relationships. It doesn’t create distinctiveness, loyalty, or emotional connection. It just buys a moment of attention.

The cost of conflation

We’ve seen what happens when marketing is reduced to paid media:

  • Organic strategies are deprioritised.
  • Brand-building becomes a luxury.
  • Long-term vision gets replaced by short-term optimisation.
  • Teams chase metrics that are easy to measure, rather than outcomes that matter.

This affects how organisations invest, hire, and behave. It affects how products are launched, how content is created, and how success is measured.

It’s why SEO gets pigeonholed as a performance channel, rather than a strategic enabler of discoverability and trust. It’s why storytelling gets cut from the budget. It’s why customer insight becomes an afterthought.

Let’s talk about “performance marketing”

One of the most egregious examples of this conflation is the term “performance marketing”. It sounds scientific. Rigorous. Respectable. But it’s just another euphemism for “paid ads with attribution”.

It implies that other forms of marketing don’t perform – that unless you can track every click, every conversion, every penny, it’s not real. Not valuable.

But performance isn’t the same as impact. Brand builds memory. Storytelling builds trust. Relationships build retention. These things matter – and they don’t always fit neatly into a last-click attribution model.

By elevating “performance marketing” as the gold standard, we ignore the slow-burn power of brand, the compounding effects of reputation, and the strategic foundation that real marketing is built on.

Reclaiming the language

If we want to fix this – if we care about the value and future of marketing – we need to start by taking back the word.

Marketing isn’t media buying. It’s not campaign management. It’s not an algorithmic bidding war.

It’s the craft of creating something valuable, positioning it well, and connecting it meaningfully with the people who need it.

That includes product. That includes experience. That includes strategy. That includes search, content, and comms.

If we let the platforms define the boundaries of our work, we’ll never get out from under their thumb.

Common objections (and why they’re wrong)

Let’s address the inevitable pushback – especially from those who live and breathe “performance marketing” dashboards.

“Performance is marketing. If it doesn’t drive results, what’s the point?”

Performance is an outcome, not a methodology. Measuring success is vital – but defining marketing solely by what’s measurable is a category error. Plenty of valuable marketing outcomes (loyalty, awareness, word-of-mouth, brand preference) don’t show up neatly in a ROAS spreadsheet. You can’t optimise for what you refuse to see.

“Advertising is marketing – that’s how we reach people”

Reach without resonance is a waste of budget. Ads are an execution channel, not the sum of the strategy. Marketing decides what you say, how you say it, and to whom – advertising is how that gets distributed. Mistaking the media for the message is exactly the problem.

“Brand is a luxury. Performance pays the bills”

Short-term efficiency often comes at the cost of long-term growth. Brands that only feed the bottom of the funnel eventually dry up the top. Performance may pay this quarter’s bills – but without brand, there’s no demand next quarter. It’s not either/or. It’s both/and – but strategy must lead.

“Attribution is better than guessing”

Measurement matters – but so does understanding what your metrics don’t capture. Most attribution models are flawed, biased towards last-click, and blind to influence that happens before a user even enters a funnel. Relying purely on what’s trackable creates a narrow view that privileges immediate action over lasting impact.

Advertising isn’t a marketing strategy

If your “marketing strategy” is just an ad budget and a spreadsheet, you don’t have a marketing strategy.

You’re just renting attention.

And what happens when the price goes up?

Worse – what happens when the performance stops?

Stop testing. Start shipping.

Big brands are often obsessed with SEO testing. And it’s rarely more than performative theatre.

They try to determine whether having alt text on images is worthwhile. They question whether using words their audience actually searches for has any benefit. They debate how much passing Core Web Vitals might help improve UX. And they spend weeks orchestrating tests, interpreting deltas, and presenting charts that promise confidence – but rarely deliver clarity.

Mostly, these tests are busywork chasing the obvious or banal, creating the illusion of control while delaying meaningful progress.

Why?

Because they want certainty. Because they need to justify decisions to risk-averse stakeholders who demand clarity, attribution, and defensibility. Because no one wants to be the person who made a call without a test to point to, or who made the wrong bet on resource prioritisation.

And in most other parts of the organisation, especially paid media, incrementality testing is the norm. There, it’s relatively easy and normal to isolate inputs and outputs, and to justify spend through clean, causal models.

In those channels, the smart way to scale is to turn every decision into data, to build a perfectly optimised incrementality measurement machine. That’s clever. That’s scalable. That’s elegant.

But that only works in systems where inputs and outputs are clean, controlled, and predictable. SEO doesn’t work like that. The same levers don’t exist. The variables aren’t stable. The outcomes aren’t linear.

So the model breaks. And trying to force it anyway only creates friction, waste, and false confidence.

It also massively underestimates the cost, and overstates the value.

Because SEO testing isn’t free. It’s not clean. And it’s rarely conclusive.

And too often, the pursuit of measurability leads to a skewed sense of priority. Teams focus on the things they can test, not the things they should improve. The strategic gives way to the testable. What’s measurable takes precedence over what’s meaningful. Worse, it’s often a distraction from progress. An expensive, well-intentioned form of procrastination.

Because while your test runs, while devs are tied up, while analysts chase significance, while stakeholders debate whether +0.4% is a win, your site is still broken. Your templates are still bloated. Your content is still buried.

You don’t need more proof. You need more conviction.

The future belongs to the brands that move fast, improve things, and ship the obvious improvements without needing a 40-slide test deck to back it up. The ones who are smart enough to recognise that being brave matters more.

Not the smartest brands. The bravest.

The mirage of measurability

The idea of SEO testing appeals because it feels scientific. Controlled. Safe. And increasingly, it feels like survival.

You tweak one thing, you measure the outcome, you learn, you scale. It works for paid media, so why not here?

Because SEO isn’t a closed system. It’s not a campaign – it’s infrastructure. It’s architecture, semantics, signals, and systems. And trying to test it like you would test a paid campaign misunderstands how the web – and Google – actually work.

Your site doesn’t exist in a vacuum. Search results are volatile. Crawl budgets fluctuate. Algorithms shift. Competitors move. Even the weather can influence click-through rates.

Trying to isolate the impact of a single change in that chaos isn’t scientific. It’s theatre.

And it’s no wonder the instinct to mechanise SEO has taken hold. Google rolls out algorithm updates that cause mass volatility. Rankings swing. Visibility drops. Budgets come under scrutiny. It’s scary – and that fear creates a powerful market for tools, frameworks, and testing harnesses that promise to bring clarity and control.

Over the last few years, SEO split-testing platforms have risen in popularity by leaning into that fear. What if the change you shipped hurt performance? What if it wasted budget? What if you never know?

That framing is seductive – but it’s also a trap.

Worse, most tests aren’t testing one thing at all. You “add relatable images” to improve engagement, but in the process:

  • You slow down the page on mobile devices
  • You alter the position of various internal links in the initial viewport
  • You alter the structure of the page’s HTML, and the content hierarchy
  • You change the average colour of the pixels in the top 30% of the page
  • You add different images for different audiences, on different locale-specific versions of your pages

So what exactly did you test? What did Google see (in which locales)? What changed? What stayed the same? How did that change their perception of your relevance, value, utility?

You don’t know. You can’t know.

And when performance changes – up or down – you’re left guessing whether it was the thing you meant to test, or something else entirely.

That’s not measurability. That’s an illusion.

And it’s only getting worse.

As Google continues to evolve, it’s increasingly focused on understanding, not just matching. It’s trying to evaluate the inherent value of a page: how helpful, trustworthy, and useful it is. Its relevance. Its originality. The educational merit. The inherent value.

None of that is cleanly testable.

You can’t A/B test “being genuinely helpful” or meaningfully isolate “editorial integrity” as a metric across 100 variant URLs – at least, not easily. You can build frameworks, run surveys, and establish real human feedback loops to evaluate that kind of quality, but it’s hard. It’s expensive. It’s slow. And it doesn’t scale neatly, nor does it fit the dashboards most teams are built around.

That’s part of why most organisations – especially those who’ve historically succeeded through scale, structure, and brute force – have never had to develop that kind of quality muscle. It’s unfamiliar. It’s messy. It’s harder to consider and wrangle than simpler, more mechanical measures.

So people try to run SEO tests. Because it feels like control. Because it’s familiar. But it’s the wrong game now.

You almost certainly don’t need more SEO tests. You almost certainly need better content. Better pages. Better experience. Better intent alignment.

And you don’t get there with split tests.

You get there by shipping better things.

Meanwhile, obvious improvements are sitting waiting. Unshipped. Untested. Unloved.

Because everyone’s still trying to figure out whether the blue button got 0.6% more impressions than the green one.

It’s nonsense. And it’s killing your momentum.

Why incrementality doesn’t work in SEO

A/B testing, as it’s traditionally understood, doesn’t even work cleanly in SEO.

In paid channels, you test against users – different cohorts seeing different creatives, with clean measurement of results. But SEO isn’t a user-facing test environment. You have one search engine (Google, Bing, ChatGPT; choose your flavour), and it’s the only ‘user’ who matters in your test. And none of those systems behaves predictably. Their algorithms, crawl behaviour, and indexing logic are opaque and ever-changing.

So instead of testing user responses, you’re forced to test on pages. That means segmenting comparable page types – product listings, blog posts, etc. – and testing structural changes across those segments. But this creates huge noise. One page ranks well, another doesn’t, but you have no way to know how Google’s internal scoring, crawling, or understanding shifted. You can’t meaningfully derive any insight into what the ‘user’ experienced, perceived, or came to believe.

That’s why most SEO A/B testing isn’t remotely scientific. It’s just a best-effort simulation, riddled with assumptions and susceptible to confounding variables. Even the cleanest tests can only hint at causality – and only in narrowly defined environments.
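
To make that concrete, here’s a minimal sketch of how a page-based split is usually assembled: comparable URLs are grouped by template, deterministically bucketed into ‘control’ and ‘variant’ sets, and the change is only shipped to one side. The URLs, the salt, and the 50/50 split below are illustrative assumptions rather than any particular platform’s method – the point is simply that the thing being ‘tested’ is a set of pages, not a set of users.

    # A rough illustration of page-based SEO split testing (hypothetical URLs).
    # Pages are grouped by template and deterministically assigned to a bucket,
    # so the split stays stable between crawls and reporting runs.
    import hashlib

    def assign_bucket(url: str, salt: str = "experiment-1") -> str:
        """Hash the URL (with a salt) and split the results roughly 50/50."""
        digest = hashlib.sha256(f"{salt}:{url}".encode()).hexdigest()
        return "variant" if int(digest, 16) % 2 == 0 else "control"

    pages = {
        "product": ["/p/red-widget", "/p/blue-widget", "/p/green-widget"],
        "blog": ["/blog/widgets-101", "/blog/widget-trends"],
    }

    for template, urls in pages.items():
        for url in urls:
            print(template, url, assign_bucket(url))

    # Note what this can't control for: the same opaque crawler scores both
    # buckets, core updates land mid-test, and site-wide or competitor changes
    # hit the "control" pages too - which is exactly the problem.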

Incrementality testing works brilliantly in paid media. You change a variable, control the spend, and measure the outcome. Clear in, clear out.

But in SEO, that model breaks. Here’s why:

1. SEO is interconnected, not isolated

Touch one part of the system and the rest moves. Update a template, and you affect crawl logic, layout, internal links, rendering time, and perceived relevance.

You’re not testing a change. You’re disturbing an ecosystem.

Take a simple headline tweak. Maybe it affects perceived relevance and CTR. But maybe it also reorders keywords on the page, shifts term frequency, or alters how Google understands your content.

Now, imagine you do that across a set of 200 category pages, and traffic goes up. Was it the wording? Or the new layout? Or the improved internal link prominence? You can’t know. You’re only seeing the soup after the ingredients have been blended and cooked.

2. There are no true control groups

Everything in SEO is interdependent. A “control group” of pages can’t be shielded from algorithmic shifts, site-wide changes, or competitive volatility. Google doesn’t respect your test boundaries.

You might split-test changes across 100 product pages and leave another 100 unchanged. But if a Google core update rolls out halfway through your test, or a competitor launches new content, or your site’s crawl budget is reassigned, the playing field tilts. User behaviour can skew results, too – if one page in your test group receives higher engagement, it might rise in rankings and indirectly influence how related pages are perceived. And if searcher intent shifts due to seasonal changes or emerging trends, the makeup of search results will shift with it, in ways your test boundaries can’t contain.

Your “control” group isn’t stable. It’s just less affected – maybe.

3. The test takes too long, and the world changes while you wait

You need weeks or months for significance. In that time, Google rolls out updates, competitors iterate, or the site changes elsewhere. The result is no longer meaningful.

A test that started in Q1 may yield data in Q2. But now the seasonality is different, the algorithm has shifted, and your team has shipped unrelated changes that also affect performance. Maybe a competitor shipped a product or ran a sale.

Whatever result you see, it’s no longer answering the question you asked.

4. You can’t observe most of what matters

The most important effects in SEO happen invisibly – crawl prioritisation, canonical resolution, index state, and semantic understanding. You can’t test what you can’t measure.

Did your test change how your entities were interpreted in Google’s NLP pipeline? How would you know?

There’s no dashboard for that. You’re trying to understand a black box through a fogged-up window.

5. Testing often misleads more than it informs

A test concludes. Something changed. But was it your intervention? Or a side effect? Or something external? The illusion of certainty is more dangerous than ambiguity.

Take a hypothetical test on schema markup. You implement the relevant code on a set of PDPs. Traffic lifts 3%. Great! But in parallel:

  • You added 2% to the overall document weight.
  • Google rolled out new Rich Results eligibility rules.
  • A competitor lost visibility on a subset of pages due to a botched site migration.
  • The overall size of Wikipedia’s website shrank by 1%, but the average length of an article increased by 3.8 words. Oh, and they changed the HTML of their footer.
  • It was unseasonably sunny.

What caused the lift? You don’t know. But the test says “success” – and that’s enough to mislead decision-makers into prioritising rollouts that may do nothing in future iterations.

6. Most testing is a proxy for fear

Let’s be honest: a lot of testing isn’t about learning – it’s about deferring responsibility. It’s about having a robust story for upward reporting. About ensuring that, if results go south, there’s a paper trail that says you were being cautious and considered. It’s not about discovery – it’s about defensibility.

In that context, testing becomes theatre. A shield. A way to look responsible without actually moving forward.

And it’s corrosive. Because it shifts the culture from one of ownership to one of avoidance. From action to hesitation.

If you’re only allowed to ship something once a test proves it’s safe, and you only test things that feel risk-free, you’re no longer optimising. You’re stagnating.

And worse, you’re probably testing things that don’t even matter, just to justify the process.

If your team needs a test to prove that improving something broken won’t backfire, the issue isn’t uncertainty – it’s fear.

The buy-in trap

A question I hear a lot is: “What if I need demonstrable, testable results to get buy-in for the untestable stuff?” It’s a fair concern – and one that reveals a hidden cultural trap.

When testable wins become the gatekeepers for every investment, the essential but untestable aspects of SEO (like quality, trust, editorial integrity) end up relegated to second-class status. They’re concessions that have to be justified, negotiated, and smuggled through the organisation.

This creates a toxic loop:

  • Quality improvements aren’t seen as baseline, non-negotiable investments – they’re optional extras that compete for limited time and attention.
  • Teams spend more time lobbying, negotiating, and burning social capital for permission than actually doing the right thing.
  • Developers and creators get demotivated, knowing their work requires political finesse and goodwill rather than just good judgment.
  • Stakeholders stay stuck in risk-averse mindsets, demanding ever more proof before committing, which slows progress and rewards incremental, low-risk wins over foundational change.

The real problem? Treating quality as a concession rather than a core principle.

The fix isn’t to keep chasing testable wins to earn the right to work on quality. That only perpetuates the cycle.

Instead, leadership and teams need to shift the mindset:

  • Make quality, trust, and editorial standards strategic pillars that everyone owns.
  • Stop privileging only what’s measurable, and embrace qualitative decision-making alongside quantitative.
  • Recognise that some things can’t be tested but are obviously the right thing to do.
  • Empower teams to act decisively on quality improvements as a default, not an afterthought.

This cultural shift frees teams to focus on real progress rather than political games. It builds momentum and trust. It creates space for quality to become a non-negotiable foundation, which ultimately makes it easier to prove value across the board.

Because when quality is the baseline, you don’t have to fight for it. You just get on with making things better.

Culture, not capability

Part of the issue is that testing lends itself to the mechanical. You can measure impressions. You can test click-through rates. You can change a meta title and maybe see a clean lift.

But the things that matter more – clarity, credibility, helpfulness, trustworthiness – resist that kind of measurement. You can’t A/B test whether users believe you. You can’t split-test authority. At least, not easily.

So we over-invest in the testable and under-invest in the meaningful.

Because frankly, investing in ‘quality’ is scary. It’s ephemeral. It’s hard to define, and hard to measure. It doesn’t map neatly to a team or a KPI. It’s not that it’s unimportant – it’s just that it’s rarely prioritised. It sits somewhere between editorial, product, engineering, UX, and SEO – and yet belongs to no one.

So it falls through the cracks. Not because people don’t care, but because no one’s incentivised to catch it. And without ownership, it’s deprioritised. Not urgent. Not accountable.

No one gets fired for not investing in quality.

It’s not that things like trustworthiness or editorial integrity can’t be measured – but they’re harder. They require real human feedback, slower feedback loops, and more nuanced assessment frameworks. You can build those systems. But they’re costlier, less convenient, and don’t fit neatly into the A/B dashboards most teams are built around.

So we default to what’s easy, not what’s important.

We tweak the things we can measure, even when they’re marginal, instead of improving the things we can’t – even when they’re fundamental.

The result? A surface-level optimisation culture that neglects what drives long-term success.

Most organisations don’t default to testing because it’s effective. They do it because it’s safe.

Or more precisely, because it’s defensible.

If a test shows no impact, that’s fine. You were being cautious. If a test fails, that’s fine. You learned something. If you ship something without testing, and it goes wrong? That’s a career-limiting move.

So teams run tests. Not because they don’t know what to do, but because they’re not allowed to do it without cover.

The real blockers aren’t technical – they’re cultural:

  • A leadership culture that prizes risk-aversion over results.
  • Incentives that reward defensibility over decisiveness.
  • A lack of trust in SEO as a strategic driver, not just a reporting layer.

In that environment, testing becomes a security blanket.

You don’t test to validate your expertise – you test because nobody will sign off without a graph.

But if every improvement needs a test, and every test needs sign-off, and every sign-off needs consensus, you don’t have a strategy. You have inertia. That’s not caution. That’s a bottleneck.

But what about prioritisation?

Of course, resources are finite. That’s why testing can seem appealing – it offers a way to “prove” that an investment is worth it before spending the effort.

But in practice, that often backfires.

If something is so uncertain or marginal that it needs a multi-week SEO test to justify its existence… maybe it shouldn’t be a priority at all.

And if it’s a clear best practice – improving speed, crawlability, structure, or clarity – then you don’t need a test. You need to ship it.

Testing doesn’t validate good work. It delays it.

So what should you do instead? Use a more honest, practical decision model.

Here’s how to decide:

1. If the change is foundational and clearly aligned with best practice – things like improving site speed, fixing broken navigation, clarifying headings, or making pages more crawlable: → Just ship it. You already know it’s the right thing to do. Don’t waste time testing the obvious.

2. If the change is speculative, complex, or genuinely uncertain – like rolling out AI-generated content, removing large content sections, or redesigning core templates: → Test it, or pilot it. There’s legitimate risk and learning value. Controlled experimentation makes sense here.

3. If the change is minor, marginal, or only matters if it performs demonstrably better – like small content tweaks, cosmetic design changes, or headline experiments: → Deprioritise it. If it only matters under test conditions, it probably doesn’t matter enough to invest in at all.

This isn’t just about prioritising effort. It’s about prioritising momentum. And it’s worth noting that other parts of marketing, like brand or TV, have long operated with only partial measurability. These disciplines haven’t been rendered ineffective by the absence of perfect data. They’ve adapted by anchoring in strategy, principles, and conviction. SEO should be no different.

Yes, sometimes even best-practice changes surprise us. But that’s not a reason to freeze. It’s a reason to improve your culture, your QA, and your confidence in making good decisions. Testing shouldn’t be your first defence – good fundamentals should.

If you’re spending more time building test harnesses than fixing obvious problems, you’re not optimising your roadmap – you’re defending it from progress.

If your organisation can’t ship obvious improvements because it’s addicted to permission structures and dashboards, testing isn’t your salvation. It’s your symptom.

And no amount of incrementality modelling will fix that.

The alternative

This isn’t just idealism – it’s a strategic necessity. In a world where other channels are becoming more expensive, more competitive, and less efficient, the brands that succeed will be the ones who stop dithering and start iterating. Bravery isn’t a rebellion against data – it’s a recognition that over-optimising for certainty can paralyse progress.

What’s the alternative?

Bravery.

Not recklessness. Not guesswork. But conviction – the confidence to act without demanding proof for every obvious improvement.

You don’t need another test. You need someone senior enough, trusted enough, and brave enough to say:

“We’re going to fix this because it’s clearly broken.”

That’s it. That’s the strategy.

A fast site is better than a slow one. A crawlable site is better than an impenetrable one. Clean structure beats chaos. Good content beats thin content. These aren’t radical bets. They’re fundamentals.

You don’t need to test whether good hygiene is worth doing. You need to do it consistently and at scale.

And the only thing standing between you and that outcome isn’t a lack of data. It’s a lack of permission.

Bravery creates permission. Bravery cuts through bureaucracy. Bravery aligns teams and unlocks velocity.

You don’t scale SEO by proving every meta tag and message. You scale by improving everything that needs to be improved, without apology.

The best brands of tomorrow won’t be the most optimised for certainty. They’ll be the ones who shipped. The ones who trusted their people. The ones who moved.

The brave ones.

The strategic fork

Many of the large brands that over-rely on testing do so because they’ve never had to be good at SEO. They’ve never needed to build genuinely useful content. Never had to care about page speed, accessibility, or clarity. They’ve succeeded through scale, spend, or brand equity.

But the landscape is changing. Google is changing. Users are changing.

And if those brands don’t adapt – if they keep waiting for tests to tell them how to be better – they’ll be left with one option: spend.

More money on ads. More dependency on paid visibility. More fragility in the face of competition.

And yes, that route is testable. It’s measurable. It’s incremental.

But it’s also a treadmill – one that gets faster, more expensive, and less effective over time.

Because if you don’t build your organic capability now, you won’t have one when you need it.

And you will need it.

Because the answer isn’t to build some omniscient framework to measure and score every nuance of quality. Sure, you could try – but doing so would be so complex, expensive, and burdensome that you’d spend 10x more time and resources managing the framework than actually fixing the issues it measures. You can’t checklist your way to trust. You can’t spreadsheet your way to impact. There is no 10,000-point rubric that captures what it means to be genuinely helpful, fast, clear, or useful – and even if there were, trying to implement it would be its own kind of failure.

At some point, you have to act. Not because a graph told you to. But because you believe in making things better.

That’s not guesswork. That’s faith. Faith in your team, your users, and your principles.

What happens next

You don’t need more data. You don’t need to test for certainty. You need conviction.

The problems are obvious and many. The opportunities are clear. The question isn’t what to do next – it’s whether you’ve built the confidence to do it without waiting for permission.

If you’re in a position to lead, lead. Say: “We’re going to fix this because it’s clearly broken.”

If you’re in a position to act, act. Don’t wait for a dashboard, or a test, or the illusion of certainty.

Because the brands that win won’t be the ones who proved every improvement was safe.
They’ll be the ones who made them anyway.

Just ship it. Be brave.
