The open web has always run on a simple bargain: publishers let search engines crawl their pages so that humans get sent back in return. Those humans see display ads on the publishers’ sites. The more humans who see ads, the more money the publishers make. Search engines honoured that bargain for 20 years. (And publishers, for their part, often treated those referrers with irrational disrespect and lobbied for dumb legislation.)
But now, those same search engines and new AI agents increasingly do not. Tollbit’s latest State of the Bots report, covering Q1 2025, lays bare how far the numbers have tipped and why the web as we know it is on life support.
Ratio’d to Death
Nothing puts the problem into starker relief than the crawl‑to‑visit ratios. Last quarter, Microsoft Bing needed 11 scrapes to deliver a single human click. That’s bad, but it’s charity compared to the newcomers:
- OpenAI: 179 scrapes per visit
- Perplexity: 369 scrapes per visit
- Anthropic: a mind‑numbing 8,692 scrapes per visit
Those gaps are widening fast. OpenAI’s ratio is the best of the three, yet it is still roughly 16× worse than Bing’s; Perplexity’s more than doubled in three months, according to Tollbit; Anthropic’s jumped by 48%. Content is being hoovered off publishers’ servers and straight into large language model memory, with almost nothing given in return.
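To make those multiples concrete, here’s a quick back-of-the-envelope comparison using the report’s Q1 figures (the little script and its variable names are mine, not Tollbit’s):

```python
# Crawl-to-visit ratios from Tollbit's Q1-2025 report: scrapes per human referral.
ratios = {
    "Bing": 11,
    "OpenAI": 179,
    "Perplexity": 369,
    "Anthropic": 8_692,
}

baseline = ratios["Bing"]
for source, ratio in ratios.items():
    # How many times worse than Bing's already lopsided exchange?
    print(f"{source}: {ratio} scrapes per visit ({ratio / baseline:.0f}x Bing)")

# Bing: 11 scrapes per visit (1x Bing)
# OpenAI: 179 scrapes per visit (16x Bing)
# Perplexity: 369 scrapes per visit (34x Bing)
# Anthropic: 8692 scrapes per visit (790x Bing)
```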
A Broken Value Exchange
Publishers absorb the bandwidth bills, the content delivery network (CDN) costs and the ad‑fraud headaches created by bot hits. They put up with it because Google returned up to 85% of all external referrals. But AI agents now account for roughly 0.04% of referral traffic, a rounding error, while Google’s own share is sliding each quarter. The old compact of “let me crawl and I’ll send eyeballs” has been torched. The crawlers keep coming, but real people do not.
Robots.txt? EFF You!
Publishers have fought back over the past year by ramping up the number of AI bots disallowed in robots.txt. It hasn’t mattered: the LLMs just find shifty ways in. The report shows disallowed agents continuing to scrape, often through third‑party fetchers or residential IPs. Some vendors now explicitly state that their bots will ignore robots.txt when fetching “on behalf of a user.” Translation: Eff you, publishers.
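For the record, honouring robots.txt is not hard; Python’s standard library has done it for decades. Here is a minimal sketch of what a compliant crawler does before fetching a page (the publisher URL and bot name are hypothetical):

```python
from urllib import robotparser

# A well-behaved crawler downloads and parses robots.txt first...
rp = robotparser.RobotFileParser()
rp.set_url("https://example-publisher.com/robots.txt")
rp.read()

# ...then asks whether its user agent is allowed to fetch the page.
# If the publisher's file says "User-agent: HypotheticalAIBot / Disallow: /",
# can_fetch() returns False and the crawler is supposed to walk away.
url = "https://example-publisher.com/news/some-story"
if rp.can_fetch("HypotheticalAIBot", url):
    print("allowed: fetch the page")
else:
    print("disallowed: stay out")
```

In other words, ignoring the file is a policy decision, not a technical limitation.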
A Vicious Circle Gets More Vicious
Every non‑paywalled publisher is effectively subsidizing LLMs. Server logs swell with automated hits that generate zero ad impressions, zero subscriptions and zero first‑party data. Infrastructure isn’t free. When bot traffic doubles in a quarter, a mid‑size news site can watch its CDN bill spike while its revenue line sags. For already cash‑strapped local publishers, that’s a death sentence.
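As a rough illustration of the infrastructure point, consider the sketch below; every number in it is an assumption of mine (page weight, CDN price, request volume), not a figure from the report:

```python
# Hypothetical back-of-envelope: what doubling bot traffic does to a CDN bill.
AVG_PAGE_WEIGHT_MB = 2.0             # assumed average transfer per crawled page
COST_PER_GB = 0.05                   # assumed CDN egress price in USD
BOT_REQUESTS_PER_MONTH = 50_000_000  # assumed mid-size site, before the doubling

def monthly_cdn_cost(requests: int) -> float:
    gigabytes = requests * AVG_PAGE_WEIGHT_MB / 1024
    return gigabytes * COST_PER_GB

before = monthly_cdn_cost(BOT_REQUESTS_PER_MONTH)
after = monthly_cdn_cost(BOT_REQUESTS_PER_MONTH * 2)
print(f"bot bandwidth cost: ${before:,.0f} -> ${after:,.0f} per month")
# bot bandwidth cost: $4,883 -> $9,766 per month
```

None of that spend generates a single impression the publisher can sell.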
At the same time, AI answer engines make it less likely that a user ever needs to click through. Fewer visits mean lower ad revenue, which forces ever more annoying ad formats, which in turn drives human traffic down further. It’s a vicious loop.
Google Itself Is Changing
Google’s own AI‑generated “Overviews” are already siphoning clicks on high‑value queries. If Bing’s ratios look ugly today, imagine what happens when Google flips more of its search results into zero‑click answers. The last dependable firehose of traffic will run dry, and the open‑link architecture that built the web will collapse into a series of walled gardens.
How the Web Ends
If you’re a publisher, here are the hard truths:
- Traffic will keep decoupling from crawling. Scrapes will grow because LLMs need fresh data; visits will not
- Ad revenue will crater. Programmatic ad revenue relies on traffic, and traffic disappears when users don’t click through
- Paywalls and licensing will surge. Either LLMs pay, or content retreats behind registration walls. So much for net neutrality
- The public web shrinks. The logical outcome is a smaller, poorer information commons and a lot more AI hallucinations trained on stale or blocked data. (We already saw this happen in Canada when Trudeau tried to force Facebook to subsidize media conglomerates.)
So the web doesn’t die with a bang. It just bleeds out through a widening visits‑to‑scrapes deficit. Unless regulators enforce meaningful consent (won’t happen), or AI firms start paying fair market rates for content (won’t happen), publishers will be forced to pull up the drawbridge. Quality content on the web will only be for the few who can afford it.
And what do the LLMs do then? Honest question.