Fifteen websites decide what AI says about you

In May 2026, one index of more than 680 million AI citations found that fifteen domains account for 68% of them. On AI's narrow trust list, and the discipline it now asks of a brand. The first of three field notes from building CLEO.

When I started building CLEO, I believed the same thing most people building for search believe: that if you get a website clean enough - fast, structured, well-written - the engines will find it and quote it. I spent the first stretch of the build making sites legible to AI. Schema, entities, the machine-readable scaffolding that lets a model understand what a page is. The work was good. It was also, I came to realise, only a third of the job. This is the first of three things I got wrong, and what each one taught me.

Here is the fact that reorganised my thinking. In May 2026, a New York firm called 5WPR consolidated six of the largest studies of AI citations - more than 680 million of them, across ChatGPT, Claude, Perplexity, Gemini, and Google’s AI Overviews - into a single index. The question was plain: when these engines quote a source, which sources do they actually quote?

Fifteen. Fifteen domains account for 68% of all the citations in that index.

Not the top hundred. Not the top thousand. Fifteen. Nothing in the old web looked like this - classic Google search spread authority across millions of domains in a long tail. AI citation does the opposite. It funnels.
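For anyone who wants to run the same concentration check on their own citation logs, the arithmetic is short. This is a minimal sketch, not the index's methodology: the function name `top_k_share` and the sample counts are hypothetical, invented only to show the calculation.

```python
def top_k_share(citations_by_domain: dict[str, int], k: int) -> float:
    """Return the fraction of all citations that go to the k most-cited domains."""
    total = sum(citations_by_domain.values())
    if total == 0:
        return 0.0
    # Sort counts from largest to smallest and keep the k largest.
    top_counts = sorted(citations_by_domain.values(), reverse=True)[:k]
    return sum(top_counts) / total


# Hypothetical counts, invented for illustration (not taken from the index).
sample = {
    "a.example": 500,
    "b.example": 300,
    "c.example": 100,
    "d.example": 60,
    "e.example": 40,
}

share = top_k_share(sample, k=2)
print(f"Top 2 domains hold {share:.0%} of citations")
```

A long-tail distribution keeps this number low even for a large k. A funnel pushes it high for a small one, which is what a figure like 68% at k = 15 describes.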

And the funnel is not where most brands are spending. Reddit sits at the top, cited around 40% of the time across every major engine. Then Wikipedia, YouTube, LinkedIn, Google’s own properties. The rest of the fifteen are journalism and community platforms - Reuters, Forbes, Quora, Stack Overflow. What is missing from the list matters more than what is on it. Your blog is not there. Almost no individual brand site is there. Paid placements - sponsored posts, advertorials - account for less than half a percent of citations.

This is the part that is hard to accept on first reading. You can spend half your content budget on a beautiful blog and still go uncited in almost every AI answer. You can buy a placement in a magazine your buyers respect and earn nothing an engine will read. The list does not care what you paid. It cares whether you are on it.

Once I saw why, the list stopped looking arbitrary. Models learn to trust the shape of human reasoning - a question, several answers, disagreement, a signal of which answer held up. Reddit produces that shape continuously. Wikipedia produces a different one: an entity defined, claims sourced, history visible. A YouTube transcript produces a third: one person, answering one question, in their own voice. A corporate blog post produces none of them. It is a single voice with no contradiction and no one checking - and a model can tell. The blog is not punished. It is simply weighted as what it is: marketing, not evidence of what people think.

Then there is the part that should unsettle anyone planning a year ahead. The list moves in weeks. The same index documented ChatGPT’s Reddit citation share falling from roughly 60% to 10% in six weeks in late 2025, after a single Google parameter change. Six weeks. Anyone who built strategy in the SEO era is not braced for that clock speed - Google’s old upheavals took months, and you could watch them coming. A brand that had bet everything on Reddit woke up, six weeks later, with its main source cut by five-sixths.

So the defensible position is not dominance on one platform. It is presence across the fifteen, with the infrastructure that lets an engine recognise you as the same entity wherever you appear - and the discipline to hold that as the list shifts underneath you. This is the work I had started on without yet understanding its full shape: not content production, but something closer to infrastructure maintenance, never finished, on a short and moving list.
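One widely used way to state "this is the same entity" in machine-readable form is the schema.org `sameAs` property on an `Organization` record, published as JSON-LD on the brand's own site. The sketch below is an illustration under assumptions, not a description of how CLEO works: the helper name `organization_jsonld`, the brand name, and every URL are hypothetical, and nothing in the index confirms how much weight any given engine puts on this markup.

```python
import json


def organization_jsonld(name: str, url: str, profiles: list[str]) -> str:
    """Build a schema.org Organization record linking one brand to its profiles."""
    data = {
        "@context": "https://schema.org",
        "@type": "Organization",
        "name": name,
        "url": url,
        # sameAs lists other pages that describe this same entity.
        # Duplicates are removed and the list is sorted so output is stable.
        "sameAs": sorted(set(profiles)),
    }
    return json.dumps(data, indent=2)


# Hypothetical brand and profile URLs, for illustration only.
markup = organization_jsonld(
    name="Example Brand",
    url="https://example.com",
    profiles=[
        "https://www.linkedin.com/company/example-brand",
        "https://www.youtube.com/@examplebrand",
        "https://en.wikipedia.org/wiki/Example_Brand",
    ],
)
print(markup)
```

The output would typically sit in a `<script type="application/ld+json">` tag on the site. The maintenance burden is in the list itself: as the platforms that matter shift, the profiles named here have to shift with them.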

We built CLEO to do that part - the measuring and the readiness - continuously, at the standard the engines now demand. That is the first third of the job. The second third is the one that stopped me, and it is the next piece.

More on how we think about this at regencleo.ai.

Article details

Published June 12, 2026 by CLEO. Part of The Field Notes - the working journal of the CLEO Presence Engine at regencleo.ai/articles. Topics covered: AI citations, GEO, AI search, Reddit, brand visibility, founder notes.

Learn more about the CLEO Presence Engine at regencleo.ai/engine. Methodology and scoring at regencleo.ai/methodology.