This is part 2 of our GPT-5.5 citation study. Part 1 covered the Thinking tier: what changed when ChatGPT replaced GPT-5.4 with GPT-5.5 as the “Latest” Thinking model.
ChatGPT replaced GPT-5.3 with GPT-5.5 as the default Instant model in early May 2026. The change wasn’t announced as a behavioral shift. It was framed as a model upgrade.
The data tells a different story.
GPT-5.5 Instant cites brand websites half as often as GPT-5.3 Instant did. Reddit went from a peripheral source on the previous default to ChatGPT’s most-cited domain by a 3x margin. And 16% of the prompts we ran on Instant were silently routed to the Thinking tier, even with the Auto-switch toggle disabled.
The free-tier ChatGPT experience your customers are using right now is fundamentally different from the one they were using two weeks ago. Most brands haven’t noticed.
How we did this
We ran 50 prompts through GPT-5.5 Instant, the same prompt set as our broader 5.5 vs 5.4 vs 5.3 study. It covers SaaS, ecommerce, services, healthcare, finance, travel, education, home, food, legal, marketing, productivity, fitness, shopping intent, head-to-head comparisons, and trends.
42 of the 50 conversations ran cleanly on GPT-5.5 Instant (model_slug = gpt-5-5). 8 escalated mid-conversation to GPT-5.5 Thinking even though we hadn’t enabled Auto-switch. We re-ran those 8 in fresh sessions with the toggle explicitly disabled. 7 of 8 escalated again. Routing is content-based, not random.
The 42 clean conversations form the basis of this analysis.
For each conversation, we pulled the full payload from ChatGPT’s /backend-api/conversation/<id> endpoint. Every fan-out query, every web result, every cited URL. We classified each cited URL as first-party (the brand the user asked about) or third-party (review sites, blogs, Reddit, retailers, media outlets) using Claude Haiku 4.5.
| What we measured | Count |
|---|---|
| Conversations attempted | 50 |
| Conversations that ran cleanly on Instant | 42 |
| Conversations that auto-escalated to Thinking | 8 |
| Re-run prompts that escalated again | 7 of 8 |
| Citations classified | ~235 |
| Search-engine cross-reference (SerpAPI) | 30 prompts × Google + Bing |
Now here’s what we found.
Brand citations halved
GPT-5.5 Instant cites brand websites 6% of the time. GPT-5.3 Instant cited them 13.4% of the time.
| Metric | GPT-5.3 Instant | GPT-5.5 Instant | Δ |
|---|---|---|---|
| First-party citation % | 13.4% | 6.0% | −7.4 pp |
| Avg citations in final answer | 8.5 | 5.6 | −34% |
| Avg fan-out queries | 1.0 | 1.0 | flat |
| Avg web results read | 12.3 | 16.1 | +31% |
| site: operator usage | 0% | 0% | flat |
| Pricing-page citations | 0% | 0% | flat |
| Search used (% of convos) | 98% | 100% | +2 pp |
On 5.5 Instant, ~94 of every 100 cited URLs go to a third-party source, not a brand site. That’s up from ~87 of every 100 on 5.3 Instant.
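The Δ column in the table above mixes two kinds of change: percentage-point differences for rates, and relative change for averages. Here is a minimal sketch of that arithmetic, using values copied from the table. The function names are ours, for illustration only.

```javascript
// Percentage-point difference between two rates that are already in percent.
function ppDelta(before, after) {
  return after - before;
}

// Relative change between two averages, as a percentage of the starting value.
function relativeChange(before, after) {
  return ((after - before) / before) * 100;
}

// Values from the table above (GPT-5.3 Instant -> GPT-5.5 Instant).
const firstPartyDelta = ppDelta(13.4, 6.0);
const citationsChange = relativeChange(8.5, 5.6);
const webResultsChange = relativeChange(12.3, 16.1);

console.log(firstPartyDelta.toFixed(1) + " pp"); // first-party citation rate
console.log(Math.round(citationsChange) + "%");  // avg citations in final answer
console.log(Math.round(webResultsChange) + "%"); // avg web results read
```

Keeping the two apart matters when you report to clients: a 7.4-point drop in a rate that started at 13.4% is a much bigger relative move than the raw number suggests.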
Both Instant tiers cite zero pricing pages across all conversations. The Instant search infrastructure simply doesn’t reach into brand pricing structures. If pricing-page traffic from ChatGPT matters to your business, it’s coming from Plus and Pro users on Thinking, not free-tier users on Instant.
The per-category breakdown for 5.5 Instant shows how stark the brand drop is:
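| Category | First-party rate |
|---|---|
| Ecommerce | 0% |
| Services | 0% |
| Trends | 0% |
| Travel | 0% |
| Food | 0% |
| Fitness | 0% |
| Comparison | 0% |
| Healthcare | 5% |
| Legal | 7% |
| SaaS | 7% |
| Home | 8% |
| Productivity | 9% |
| Shopping | 10% |
| Finance | 18% |
| Marketing | 18% |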
In 7 of the 15 categories listed, the model cites zero brand websites. Only Finance and Marketing reach 18%. Compare this to GPT-5.5 Thinking on the same 50 prompts, where the average is 47% first-party and most categories see brand citations regularly.
The brand visibility your customers had on the free tier under GPT-5.3 was already small. On GPT-5.5 Instant, in nearly half of the categories, it’s gone.
Reddit is now ChatGPT’s most-cited domain on Instant
This is the single most striking pattern in the data. Reddit is GPT-5.5 Instant’s most-cited domain by a wide margin.
Reddit citations on Instant grew 6x from one model version to the next. From a peripheral source on 5.3 (cited 6 times, ranked 9th) to dominant on 5.5 (38 citations, more than 3x the next-most-cited domain).
Forbes, the most-cited domain on 5.3 Instant by a large margin (21 citations), got knocked out of the top three entirely. The shift isn’t subtle.
For brands relying on user-generated content visibility, this is the single most consequential change in the Instant tier. Reddit threads, including older ones, comparison threads, “best X for Y” recommendation threads, are now disproportionately surfaced when free-tier users ask ChatGPT for product or service recommendations.
If your category has active Reddit threads, those threads are now your free-tier visibility surface.
8 of 50 prompts auto-escalate from Instant to Thinking, even with the toggle off
The most operationally important finding in the dataset: ChatGPT’s Instant tier silently routes complex prompts to the Thinking tier, regardless of user settings.
Of our 50 prompts, 8 conversations came back with both gpt-5-5 (Instant) and gpt-5-5-thinking slugs in the conversation payload. The model started the response on Instant and was rerouted to Thinking partway through.
We re-ran those 8 prompts in fresh sessions with Auto-switch to Thinking explicitly disabled. 7 of 8 escalated again. Routing is content-based, not random.
The 8 prompts that escalated on the first run:
“What are the biggest trends in ecommerce for 2026?”
“How is AI changing the recruiting and hiring process?”
“Best online learning platforms for professional development in 2026”
“Compare Coursera vs Udemy vs LinkedIn Learning for tech skills”
“Best coding bootcamps for career changers in 2026”
“Notion vs Obsidian vs Roam Research for personal knowledge management”
“What are the top cybersecurity threats businesses should prepare for in 2026?”
“How is AI changing the legal industry in 2026?”
The pattern is clear. Broad-recommendation prompts, multi-vendor research prompts, and open-ended trend prompts get classified as “too complex” for Instant and silently rerouted. Tighter head-to-head comparisons (e.g. “Compare HubSpot vs Salesforce vs Pipedrive”, which has multiple vendors but a tight question) often stay on Instant.
For brands auditing ChatGPT visibility, this matters in a specific way. Instant-tier traffic on broad recommendation prompts isn’t actually Instant-tier behavior. It’s Thinking-tier behavior delivered to a free user. The response will look like Thinking output (more brand sites, more site: queries, about 84% of cited domains absent from Google and Bing top-10 results) even though the user picked Instant.
If your client’s measured ChatGPT visibility is bouncy on certain prompt categories, this is likely the cause. Check the conversation’s model_slug field. It tells you which model actually answered.
How GPT-5.5 Instant differs from GPT-5.5 Thinking
GPT-5.5 Instant and GPT-5.5 Thinking share the same underlying model generation and the same release timeline. They behave very differently.
| Metric | 5.5 Instant | 5.5 Thinking | Δ |
|---|---|---|---|
| First-party citation % | 6.0% | 47.2% | +41 pp |
| Avg fan-out queries | 1.0 | 7.3 | +7.3x |
| site: operator usage in fan-outs | 0% | 12.6% | n/a |
| Avg web results read | 16.1 | 102.7 | +6.4x |
| Avg citations in final answer | 5.6 | 7.2 | +29% |
| Pricing-page citations (% of total) | 0% | 8.8% | n/a |
| Cited domains in Google top 10 | 27% | 16% | −11 pp |
| Median cited-page age (days) | n/a | 88 | n/a |
The Thinking tier issues 7x more search queries, reaches 6x more pages, cites brand sites 8x more often, and reaches into pricing pages on ~9% of citations. The Instant tier is faster, leaner, third-party-heavy, and doesn’t expose publication-date metadata at all.
Citation domain overlap between 5.5 Instant and 5.5 Thinking is just 8.7%. Even on the same 50 prompts, the two tiers pick almost entirely different sources.
In other words: a user on the free tier and a user on Plus, asking the same question, see categorically different answers backed by categorically different domains.
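If you want to reproduce an overlap figure like the 8.7% above on your own prompts, one reasonable way to define it is Jaccard overlap: the domains both tiers cited, divided by all domains either tier cited. This is a sketch under that assumption. The article does not spell out the exact formula, and the URLs below are hypothetical.

```javascript
// Normalize a URL to its hostname without a leading "www.".
function domainOf(url) {
  return new URL(url).hostname.replace(/^www\./, "");
}

// Jaccard overlap between two lists of cited URLs: shared domains / all domains.
function domainOverlap(urlsA, urlsB) {
  const a = new Set(urlsA.map(domainOf));
  const b = new Set(urlsB.map(domainOf));
  const shared = [...a].filter((d) => b.has(d)).length;
  const union = new Set([...a, ...b]).size;
  return union === 0 ? 0 : shared / union;
}

// Hypothetical citation lists for the same prompt on each tier.
const instantUrls = [
  "https://www.reddit.com/r/example/",
  "https://forbes.com/advisor/example",
  "https://example.com/post",
];
const thinkingUrls = [
  "https://example.com/pricing",
  "https://vendor.example/features",
];

console.log(domainOverlap(instantUrls, thinkingUrls));
```

Averaging this per-prompt score across a prompt set gives a single overlap number per pair of models.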
How Instant compares to Google rankings
We cross-referenced 30 prompts against Google US and Bing US top-10 organic results via SerpAPI:
| Model | Cited domain × prompt pairs | In Google top 10 | In Bing top 10 | Absent from both |
|---|---|---|---|---|
| 5.5 Instant | 89 | 27% | 4% | 72% |
| 5.5 Thinking | 140 | 16% | 3% | 84% |
| 5.3 Instant | 179 | 30% | 7% | 69% |
| 5.4 Thinking | 143 | 13% | 2% | 87% |
GPT-5.5 Instant tracks closer to Google’s index than GPT-5.5 Thinking does, by 11 percentage points. On both Instant tiers (5.3 and 5.5), 69-72% of cited domains are absent from both Google’s and Bing’s top 10. On both Thinking tiers, that figure is 84-87%.
For brands that historically optimized for Google rank, this is the version of ChatGPT where SEO investment most clearly carries over. A meaningful share of cited domains on Instant do appear in Google’s top 10 for the same query. The Thinking tier is much further removed from search-engine consensus.
If you’re already investing in Google SEO, that work is partially paying off on the free-tier ChatGPT experience. It’s not paying off on the Plus tier.
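The cross-reference behind the Google and Bing table can be sketched as a simple tally over cited domain × prompt pairs. This is an illustration only: the input shapes (a `pairs` list and a `serp` lookup of top-10 domains per prompt) are our assumptions, not the SerpAPI response format, and the sample data is hypothetical.

```javascript
// pairs: one entry per (prompt, cited domain).
// serp: prompt -> { google: [top-10 domains], bing: [top-10 domains] }.
function crossReference(pairs, serp) {
  let inGoogle = 0;
  let inBing = 0;
  let absent = 0;
  for (const { prompt, domain } of pairs) {
    const g = (serp[prompt]?.google ?? []).includes(domain);
    const b = (serp[prompt]?.bing ?? []).includes(domain);
    if (g) inGoogle++;
    if (b) inBing++;
    if (!g && !b) absent++;
  }
  const pct = (n) => (pairs.length ? Math.round((n / pairs.length) * 100) : 0);
  return { pairs: pairs.length, google: pct(inGoogle), bing: pct(inBing), absent: pct(absent) };
}

// Hypothetical sample: four cited domains across two prompts.
const samplePairs = [
  { prompt: "best crm", domain: "reddit.com" },
  { prompt: "best crm", domain: "hubspot.com" },
  { prompt: "best crm", domain: "example.com" },
  { prompt: "best vpn", domain: "techradar.com" },
];
const sampleSerp = {
  "best crm": { google: ["hubspot.com", "forbes.com"], bing: ["hubspot.com"] },
  "best vpn": { google: ["techradar.com"], bing: [] },
};

const crossRefResult = crossReference(samplePairs, sampleSerp);
console.log(crossRefResult);
```

A domain that ranks on both engines counts in both columns, so under this counting a row can add up to more than 100%.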
What this means for brands
1. The free-tier ChatGPT experience is now Reddit-heavy. If your category has active Reddit threads, positive, negative, or mixed, they’re being surfaced to free-tier users disproportionately. Audit your Reddit footprint this week. Consider Reddit-native presence if your category lives there. We wrote a full Reddit playbook for AI visibility, and the strategies in there matter even more now.
2. Pricing pages get zero free-tier visibility, regardless of version. Both Instant tiers cite zero pricing pages across all conversations. If pricing-page traffic from ChatGPT matters to you, it’s coming from Plus and Pro users on Thinking only.
3. Brand visibility on Instant has halved between versions. Pages of yours that used to land in 5.3 Instant answers (already rare) are now even rarer. The Instant tier has moved toward third-party content sources in almost every category.
4. SEO investment still helps on Instant, partially. 27% of GPT-5.5 Instant’s cited domains appear in Google’s top 10 for the same query, vs 16% for GPT-5.5 Thinking. Traditional ranking still pulls some weight on the free tier in a way it doesn’t on Plus.
5. Watch for silent escalation. 16% of our prompts on Instant (8 of 50), mostly broad-recommendation and trend prompts, got rerouted to Thinking. If a client’s measured ChatGPT visibility is bouncy on certain prompt categories, content-based escalation may be the cause. The conversation’s model_slug field tells you which model actually answered.
6. Search runs nearly 100% of the time on Instant. GPT-5.5 Instant ran web search on every single one of its 42 conversations. The only model in the study where the no-search behavior is fully extinct. Assume any prompt in your category will trigger search on the free tier.
We built free-tier vs paid-tier ChatGPT visibility tracking into our AI visibility platform. Track citation share by model tier, monitor the Reddit threads cited for your category, and see which queries on Instant get silently rerouted to Thinking. See it in action →
How to verify which model your conversation actually used
You can audit any single ChatGPT conversation directly from the browser. After sending a prompt:
Step 1: Open the console
Cmd + Option + J on Mac. Ctrl + Shift + J on Windows. Switch to the Console tab.
Step 2: Paste this script
// Conversation ID: the part of the current URL after /c/.
const cid = location.pathname.split("/c/")[1];
// Session token for the authenticated request.
const session = await (await fetch("/api/auth/session")).json();
const r = await fetch(`/backend-api/conversation/${cid}`, {
  headers: { Authorization: `Bearer ${session.accessToken}` },
});
const data = await r.json();
// Collect every distinct model_slug across the message tree.
const slugs = new Set();
for (const node of Object.values(data.mapping || {})) {
  const s = node?.message?.metadata?.model_slug;
  if (s) slugs.add(s);
}
console.log("Model slug(s):", [...slugs]);
What you’ll see
If you see only gpt-5-5, you got pure Instant. If you see both gpt-5-5 and gpt-5-5-thinking in the same conversation, your conversation auto-escalated to Thinking partway through. The user-facing UI doesn’t tell you when this happens. The payload does.
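If you are auditing many saved payloads rather than one open conversation, the same check can be wrapped in a small function. A minimal sketch, assuming payloads shaped like the one the script above reads (a `mapping` of nodes with `message.metadata.model_slug`). The label names and the mock payload are ours.

```javascript
// Collect every distinct model_slug found in a conversation payload's message tree.
function modelSlugs(payload) {
  const slugs = new Set();
  for (const node of Object.values(payload?.mapping ?? {})) {
    const slug = node?.message?.metadata?.model_slug;
    if (slug) slugs.add(slug);
  }
  return [...slugs];
}

// Label a conversation using the two slugs named in this article.
function routingLabel(payload) {
  const slugs = modelSlugs(payload);
  const instant = slugs.includes("gpt-5-5");
  const thinking = slugs.includes("gpt-5-5-thinking");
  if (instant && thinking) return "escalated";
  if (thinking) return "thinking";
  if (instant) return "instant";
  return "unknown";
}

// Mock payload (hypothetical): one Instant message, one Thinking message, one empty node.
const mockPayload = {
  mapping: {
    a: { message: { metadata: { model_slug: "gpt-5-5" } } },
    b: { message: { metadata: { model_slug: "gpt-5-5-thinking" } } },
    c: { message: null },
  },
};

console.log(routingLabel(mockPayload));
```

Tagging each conversation this way lets you exclude escalated runs from an Instant-only analysis, which is how we arrived at the 42 clean conversations.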
Questions we’re still investigating
Run-to-run variability. ChatGPT is non-deterministic. Single-run measurements like this give directional reads, not statistical certainty. The 8 escalated prompts had a dedicated re-run; the 42 Instant conversations did not. We’re planning a multi-run pass on the Instant set.
Reddit thread selection logic. Of the 38 Reddit citations, which threads got picked? Is it engagement-weighted? Recency-weighted? Topic-relevance-weighted? Reading those 38 cited threads side-by-side to characterize the selection logic would clarify what brands should optimize for.
The escalation gradient. 8 of 50 prompts (16%) escalated on the first run, and 7 of those 8 escalated again on re-run. Is there a fuzzy middle category (prompts that escalate sometimes and not others)? A multi-run pass against the same 50 prompts would tell us.
International tier behavior. This study ran from a US ChatGPT account. Whether GPT-5.5 Instant behaves the same in non-US markets, or whether OpenAI A/B tests different routing logic by region, is open.
Methodology snapshot
50 prompts spanning 16 categories, derived from a representative cross-section of consumer and B2B research queries. Run on May 6, 2026 from a single ChatGPT free-tier account in the United States. 42 conversations ran cleanly on Instant. 8 auto-escalated to Thinking and were re-run in fresh sessions with the Auto-switch toggle disabled. 7 of 8 escalated again.
Conversation payloads pulled directly from ChatGPT’s /backend-api/conversation/<id> endpoint with browser-session authentication. Every payload includes the full message tree, all search_model_queries (fan-outs), all search_result_groups (web results), all content_references (citations), and model_slug per message.
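As a rough illustration of how those payload fields turn into the per-conversation counts used in this study, here is a sketch that tallies fan-out queries, web results, and citations. The nested shapes are assumptions: `content_references` follows the shape the fan-out extraction script in this document reads (`items` arrays), while the shapes assumed for `search_model_queries` and `search_result_groups` are our guess and may differ in real payloads.

```javascript
// Tally fan-out queries, web results, and citations from one conversation payload.
function summarizePayload(payload) {
  const summary = { queries: 0, webResults: 0, citations: 0 };
  for (const node of Object.values(payload?.mapping ?? {})) {
    const meta = node?.message?.metadata;
    if (!meta) continue;
    // Assumed shape: { queries: [...] }
    summary.queries += meta.search_model_queries?.queries?.length ?? 0;
    // Assumed shape: [{ entries: [...] }, ...]
    for (const group of meta.search_result_groups ?? []) {
      summary.webResults += group.entries?.length ?? 0;
    }
    // Shape read by the extraction script: [{ items: [...] }, ...]
    for (const ref of meta.content_references ?? []) {
      summary.citations += ref.items?.length ?? 0;
    }
  }
  return summary;
}

// Mock payload (hypothetical) with one query, three web results, one citation.
const samplePayload = {
  mapping: {
    n1: { message: { metadata: { search_model_queries: { queries: ["best crm 2026"] } } } },
    n2: { message: { metadata: { search_result_groups: [{ entries: [{}, {}] }, { entries: [{}] }] } } },
    n3: { message: { metadata: { content_references: [{ items: [{ url: "https://example.com" }] }, { items: [] }] } } },
    n4: { message: null },
  },
};

const payloadSummary = summarizePayload(samplePayload);
console.log(payloadSummary);
```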
Citation classification: Claude Haiku 4.5 with detailed instructions and 50+ in-prompt examples. Each cited URL classified independently as FIRST, THIRD, or UNCLEAR. Calibration check against our broader 5.5 vs 5.4 vs 5.3 study, where 5.4 Thinking’s first-party rate measured 56.8%, in line with previously observed benchmarks.
Search-engine cross-reference: 30 prompts × Google US and Bing US top-10 organic results via SerpAPI.
Limitations: single user account, single run per prompt (8 escalated prompts had a dedicated re-run), single point in time. Repeat runs would produce slightly different results due to ChatGPT’s non-determinism. The 5.5 Instant sample is 42 conversations rather than 50 because of the auto-escalation pattern.
TLDR
GPT-5.5 Instant cites brand websites 6% of the time, about half as often as GPT-5.3 Instant did (13%). In 7 of the 15 categories we measured, brand citations are 0%.
Reddit went from peripheral to dominant. Reddit citations grew 6x from 5.3 Instant to 5.5 Instant. It’s now the single most-cited domain by a 3x margin.
Pricing pages: zero, both Instant versions. The Instant search infrastructure does not reach into brand pricing structures, regardless of model version.
8 of 50 prompts, mostly broad-recommendation and trend prompts, auto-escalate to GPT-5.5 Thinking even with the Auto-switch toggle off. Routing is content-based and persisted on retry for 7 of the 8.
GPT-5.5 Instant tracks Google’s index more closely than GPT-5.5 Thinking does. 27% of cited domains in Google top 10 vs 16%. SEO investment still carries some weight on the free tier.
Citation overlap between 5.5 Instant and 5.5 Thinking is only 8.7%. Same generation, same prompt, almost entirely different cited sources.
Search runs 100% of the time on 5.5 Instant. The only model in the study where no-search responses are extinct.
For brands: audit your Reddit footprint, accept zero pricing-page visibility on Instant, segment ChatGPT measurement by tier, and check model_slug to know which model actually answered.
If your AEO measurement isn’t segmented by tier, you’re averaging two fundamentally different surfaces. The free-tier experience your customers see most often is now Reddit-heavy, brand-light, and not necessarily on the model the user thinks they’re using. Start with a free Writesonic account or book a demo to track free-tier and paid-tier ChatGPT visibility separately in your dashboard.
ChatGPT just rolled out two new models. GPT-5.3 Instant is the new default. GPT-5.4 Thinking is the new premium.
I wanted to know: do they search the web differently? Do they cite different sources? And what does that mean for brands trying to show up in AI search?
To find out, I tested 50 prompts across both models, extracted every fan-out query they sent, and classified every citation they returned.
Here’s the short version: GPT-5.3 sends users to blog posts about your brand. GPT-5.4 sends them to your actual website. Same question. Completely different outcomes.
Here’s the long version.
How we did this
We ran 50 prompts on ChatGPT across GPT-5.3 Instant (the new default), GPT-5.4 Thinking (the new premium), and GPT-5.2 Instant and GPT-5.2 Thinking as baselines. That gave us 119 total conversations.
After each response, we extracted the full conversation JSON using ChatGPT’s internal API. This exposed every fan-out query the model sent, every web search result it received, and every citation URL it included in its answer.
We also ran 30 of these queries through both Bing and Google via SerpAPI to compare ChatGPT’s results against traditional search engines.
For each product or service prompt, we classified citations as “first-party” (the actual brand’s website, like hubspot.com for HubSpot) or “third-party” (review sites, blogs, Reddit, media outlets).
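To make the first-party definition concrete, here is a minimal sketch of a domain-match check: a citation counts as first-party when its host is the brand's domain or a subdomain of it. This is a simplified stand-in, not the exact classification procedure used in the study, and the sample URLs are hypothetical.

```javascript
// True when the cited URL's host is the brand's domain or one of its subdomains.
function isFirstParty(citedUrl, brandDomain) {
  const host = new URL(citedUrl).hostname.toLowerCase().replace(/^www\./, "");
  const brand = brandDomain.toLowerCase().replace(/^www\./, "");
  return host === brand || host.endsWith("." + brand);
}

// Share of citations that point at any of the brands named in the prompt.
function firstPartyRate(citedUrls, brandDomains) {
  if (citedUrls.length === 0) return 0;
  const hits = citedUrls.filter((u) => brandDomains.some((b) => isFirstParty(u, b))).length;
  return hits / citedUrls.length;
}

// Hypothetical citations for a "HubSpot vs Salesforce" style prompt.
const sampleCitations = [
  "https://www.hubspot.com/pricing/crm",
  "https://www.g2.com/products/hubspot-crm",
  "https://www.reddit.com/r/sales/",
  "https://www.salesforce.com/editions-pricing/",
];

console.log(firstPartyRate(sampleCitations, ["hubspot.com", "salesforce.com"]));
```

The suffix check includes the leading dot on purpose, so a lookalike host such as nothubspot.com does not count as HubSpot.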
Now, here’s what we found.
GPT-5.3 and GPT-5.4 cite completely different sources
This is the headline finding.
56% of GPT-5.4’s citations go to brand websites. Only 8% of GPT-5.3’s do.
And here’s the part nobody expected: GPT-5.3 is worse for brands than GPT-5.2 was. The previous default cited brand websites 22% of the time. The new default dropped to 8%.
Put another way: the model most ChatGPT users interact with now sends 92% of citation traffic to third-party sites.
The pattern holds across almost every prompt
This isn’t a statistical edge case. Look at what happens when you ask both models the same question:
On comparison prompts (“X vs Y vs Z”), GPT-5.3 never cited a single brand. GPT-5.4 cited brands 83-100% of the time.
The first-party gap varies by category
Head-to-head comparisons show the biggest gap: 0% on GPT-5.3 vs 83% on GPT-5.4. SaaS sees a 7x improvement (12% to 82%). Even shopping, where GPT-5.4 is least brand-forward, still doubles the first-party rate.
And the models cite almost none of the same sources
For the same prompt, GPT-5.3 and GPT-5.4 cite completely different websites.
Average citation overlap across all 50 prompts: 7%.
On 22 of 50 prompts, the overlap was exactly 0%. Being visible on GPT-5.3 gives you no advantage on GPT-5.4.
This has massive implications for GEO and AEO strategy. A brand that dominates on GPT-5.3 might be invisible on GPT-5.4, and vice versa. Any AI visibility audit that only tests one model misses the picture entirely.
The “kingmaker” sites on GPT-5.3
Because GPT-5.3 cites third-party sites almost exclusively, a small number of review and media domains become gatekeepers: Forbes (15 citations), TechRadar (10), Tom’s Guide (10), and Reddit (7).
GPT-5.4 sends 8.5x more fan-out queries than GPT-5.3
The search architecture between these models is fundamentally different.
GPT-5.3 sends one query: the raw user prompt. GPT-5.4 decomposes it into 8.5 sub-queries on average, with domain restrictions and site: operators.
Here’s the full funnel:
| Model | Avg queries | Avg web results | Avg citations | Avg response length |
|---|---|---|---|---|
| GPT-5.2 Instant | 0.9 | 36.6 | 4.5 | 388 words |
| GPT-5.3 Instant | 1.0 | 27.3 | 5.8 | 548 words |
| GPT-5.4 Thinking | 8.5 | 109.4 | 14.8 | 769 words |
GPT-5.4 also uses two features no other model uses: domain-restricted queries (148 total) and site: operators (156 total). Combined, that’s 304 targeted queries across 50 prompts.
One example of a domain-restricted query from the data: “The Verge review iPhone Samsung Pixel” → [theverge.com]
This is why GPT-5.4’s first-party citation rate is 56%. It goes to brand sites first, validates second.
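To count these query types in your own conversations, you can bucket each fan-out query by whether it carries a site: operator or a domain restriction. This sketch assumes the `{ q, domains }` shape that the extraction script later in this document builds. The bucket names are ours, and the second and third sample queries are hypothetical.

```javascript
// Bucket one fan-out query: site: operator in the text, domain restriction, or neither.
function queryType(query) {
  if (/\bsite:\S+/i.test(query.q)) return "site_operator";
  if ((query.domains ?? []).length > 0) return "domain_restricted";
  return "plain";
}

// Count how many queries fall into each bucket.
function tallyQueryTypes(queries) {
  const counts = { site_operator: 0, domain_restricted: 0, plain: 0 };
  for (const q of queries) counts[queryType(q)]++;
  return counts;
}

const sampleQueries = [
  // Example from this article.
  { q: "The Verge review iPhone Samsung Pixel", domains: ["theverge.com"] },
  // Hypothetical examples.
  { q: "site:hubspot.com pricing", domains: [] },
  { q: "best crm 2026" },
];

console.log(tallyQueryTypes(sampleQueries));
```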
How much research does each category get?
Some categories trigger far more queries and citations than others on GPT-5.4:
| Category | GPT-5.3 queries | GPT-5.4 queries | GPT-5.3 cited | GPT-5.4 cited | GPT-5.4 web results |
|---|---|---|---|---|---|
| Productivity | 1.0 | 14.7 | 8.3 | 20.3 | 156 |
| Marketing | 1.0 | 11.7 | 6.3 | 25.0 | 144 |
| Legal | 1.0 | 12.5 | 8.0 | 15.0 | 165 |
| Services | 1.0 | 14.0 | 3.5 | 15.0 | 184 |
| Travel | 1.0 | 11.7 | 8.7 | 12.7 | 148 |
| Education | 1.0 | 10.0 | 6.0 | 17.7 | 130 |
| Finance | 1.0 | 8.3 | 6.0 | 17.7 | 130 |
| SaaS | 1.0 | 6.3 | 3.7 | 17.3 | 76 |
| Comparison | 1.0 | 9.3 | 6.3 | 14.3 | 99 |
| Shopping | 1.0 | 4.6 | 3.8 | 8.6 | 56 |
| Fitness | 1.0 | 4.0 | 4.7 | 10.7 | 64 |
B2B software categories (Productivity, Marketing, Legal) trigger the most queries on GPT-5.4. Consumer product categories (Fitness, Shopping) trigger fewer. This likely reflects the complexity of B2B purchasing decisions.
Same search index, different query strategy
Are GPT-5.3 and GPT-5.4 searching different web indexes? Or the same one?
The data points to the same index.
| Metric | GPT-5.3 Instant | GPT-5.4 Thinking |
|---|---|---|
| Avg queries per prompt | 1.0 | 8.5 |
| Avg web results per prompt | 27.3 | 109.4 |
| Web results per query | 27.3 | 12.9 |
GPT-5.3 sends one broad query and gets ~27 results. GPT-5.4 sends 8.5 specific queries and gets ~13 results per query.
The per-query result count is lower for GPT-5.4 because its queries are more targeted. But the total result pool is 4x larger because it sends 8.5x more queries.
Bottom line? Same index, different decomposition. The fan-out strategy IS the difference.
GPT-5.4’s site: operator changes the game for AEO
GPT-5.4 sent 156 queries with site: operators across 50 prompts. No other model used site: at all.
Here’s how all 423 queries break down:
| Query type | Count | % of total | Purpose |
|---|---|---|---|
| Domain-restricted (brand sites) | 142 | 34% | “Get pricing and features from this brand’s website” |
1. GPT-5.4 pre-selects which brands to investigate. Before sending any query, GPT-5.4 decides which brands are relevant based on its training data. If your brand isn’t in the consideration set, no amount of SEO will help.
2. Your G2 and Capterra presence feeds GPT-5.4 directly. G2 (8 queries) and Capterra (6 queries) are top validation targets. Strong profiles translate directly to AEO visibility.
3. site: queries create a verification loop. GPT-5.4’s process: identify brands from training data, check brand websites directly, validate on review platforms. Brands need coverage across all three layers.
GPT-5.4 cites pricing pages 35x more than GPT-5.3
Different models don’t just cite different sources. They cite different page types.
| Page type | GPT-5.3 Instant | GPT-5.4 Thinking |
|---|---|---|
| Pricing pages | 4 (1%) | 138 (19%) |
| Blog/article pages | 92 (32%) | 61 (8%) |
| Homepage/root pages | 42 (15%) | 161 (22%) |
| Product/feature pages | 13 (5%) | 73 (10%) |
GPT-5.3 is a “blog reader.” 92 of its 284 citations (32%) point to blog posts and articles.
GPT-5.4 is a “pricing page checker.” 138 of its 739 citations (19%) point to pricing pages, 161 (22%) to homepages, 73 (10%) to product pages. Combined, 51% of GPT-5.4’s citations land on commercial pages.
4 pricing page citations on GPT-5.3 across 49 conversations. 138 on GPT-5.4 across 50. That’s 35x.
If your pricing page shows “contact sales” instead of actual numbers, GPT-5.4 will find the problem.
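Page types in this study were classified by URL path. A minimal sketch of what such a classifier could look like follows. The specific path patterns are our assumptions for illustration, not the exact rules used, and real sites will need more patterns than these.

```javascript
// Classify a cited URL into a page type using only its path.
// The patterns below are illustrative guesses.
function pageType(url) {
  const path = new URL(url).pathname.toLowerCase();
  if (path === "/" || path === "") return "homepage";
  if (/\/(pricing|plans)(\/|$)/.test(path)) return "pricing";
  if (/\/(blog|articles?|news)(\/|$)/.test(path)) return "blog";
  if (/\/(products?|features?)(\/|$)/.test(path)) return "product";
  return "other";
}

// Hypothetical URLs covering each bucket.
const sampleUrls = [
  "https://www.hubspot.com/pricing/crm",
  "https://example.com/",
  "https://example.com/blog/best-crm",
  "https://example.com/products/standing-desk",
  "https://example.com/about",
];

for (const u of sampleUrls) console.log(pageType(u), u);
```

Running your own cited URLs through a classifier like this is a quick way to see whether a model is landing on your commercial pages or on your blog.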
Google rankings predict GPT-5.3 citations. GPT-5.4 bypasses rankings entirely.
Does ranking on Google or Bing help you get cited by ChatGPT?
Depends on the model.
We took 94 domains that GPT-5.3 cited across 9 prompts and checked whether each one also appeared in Bing or Google results for the same query (via SerpAPI).
47% of GPT-5.3’s citations come from domains that also rank on Google. Only 27% from domains on Bing.
But 44% don’t appear on either search engine for the same query. ChatGPT has its own retrieval layer.
GPT-5.4 is a completely different story
We did the same analysis for GPT-5.4. The results were striking.
75% of GPT-5.4’s cited domains don’t appear in Bing OR Google results for the same user prompt.
Why? Because GPT-5.4 doesn’t find brands through traditional search. It knows them from training data, then sends domain-restricted queries directly to their websites.
When you ask about running shoes, GPT-5.4 doesn’t search “best marathon running shoes” and hope nike.com ranks. It searches “[Nike Pegasus vs ASICS Gel Nimbus vs Brooks Ghost 2026]” restricted to nike.com, asics.com, etc.
| Prompt | GPT-5.4 cited domains | On Bing/Google | NOT on Bing/Google |
|---|---|---|---|
| A2: Shopify vs WooCommerce | 5 | 0 (0%) | 5 (100%) |
| B2: Running shoes | 8 | 2 (25%) | 6 (75%) |
| C1: Marketing agencies | 6 | 0 (0%) | 6 (100%) |
Bottom line? For GPT-5.3, invest in SEO (especially Google). For GPT-5.4, invest in brand recognition and first-party content quality. Search rankings don’t get you into GPT-5.4’s citation set.
GPT-5.4 makes AI search attribution trackable for brands
ChatGPT appends ?utm_source=chatgpt.com to most cited URLs, though coverage varies by model. Combine that with the first-party citation rate and you get something interesting:
| Model | First-party rate | UTM coverage | Trackable brand traffic |
|---|---|---|---|
| GPT-5.2 Instant | 22% | 60% | ~13% of citations |
| GPT-5.3 Instant | 8% | 96% | ~8% of citations |
| GPT-5.4 Thinking | 56% | 87% | ~49% of citations |
On GPT-5.3, the brand gets mentioned in the answer, but 92% of clicks go to Forbes, TechRadar, and Reddit. The brand gets the recommendation. Someone else gets the traffic.
On GPT-5.4, nearly half of all citation traffic goes to the brand’s own website with UTM tracking. The brand gets the recommendation AND the trackable visit.
This is the biggest attribution shift in GEO/AEO. For the first time, a thinking model makes AI search attribution comparable to paid search: the user clicks to your site, you track it in GA4.
Set up a segment for utm_source=chatgpt.com now. As GPT-5.4 adoption grows, you’ll see this traffic appear.
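The “trackable brand traffic” column in the table above matches the product of the other two columns: first-party rate times UTM coverage. Here is a short sketch of that arithmetic, plus a check for whether a given URL carries the ChatGPT UTM tag. The function names are ours, and the product treats the two rates as independent, which is a simplification.

```javascript
// True when a URL carries utm_source=chatgpt.com as a query parameter.
function hasChatgptUtm(url) {
  return new URL(url).searchParams.get("utm_source") === "chatgpt.com";
}

// Share of all citations that are both first-party and UTM-tagged.
function trackableShare(firstPartyRate, utmCoverage) {
  return firstPartyRate * utmCoverage;
}

// First-party rate and UTM coverage per model, from the table above.
const attributionRows = {
  "GPT-5.2 Instant": [0.22, 0.60],
  "GPT-5.3 Instant": [0.08, 0.96],
  "GPT-5.4 Thinking": [0.56, 0.87],
};

for (const [model, [fp, utm]] of Object.entries(attributionRows)) {
  console.log(model, Math.round(trackableShare(fp, utm) * 100) + "% of citations");
}
```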
Some prompts don’t trigger web search at all
Before worrying about citations, worry about whether the model even searches.
| Model | Prompts that didn’t search |
|---|---|
| GPT-5.2 Instant | 1/10 (AI recruiting) |
| GPT-5.3 Instant | 1/49 (AI recruiting) |
| GPT-5.4 Thinking | 4/50 (AI recruiting, robot vacuums, standing desk deals, gift ideas) |
Paradoxically, the model that searches deepest when it does search also skips more prompts entirely. GPT-5.4 skipped two shopping prompts (“Best deals on standing desks this week” and “I need to buy a gift for my wife under $100”).
But here’s what’s interesting: GPT-5.4 still cited sources when it didn’t search. The robot vacuum prompt had 17 citations from training data alone. GPT-5.3 produced zero citations when it didn’t search.
Prompts with a specific year (“in 2026”), price constraints (“under $500”), or comparison structure (“X vs Y”) triggered search 100% of the time on both models.
Shopping intent behaves differently on GPT-5.4
We tested 5 explicit shopping prompts (“I want to buy…”, “Where can I buy…”, “Best deals on…”). The results surprised us.
| Prompt | GPT-5.3 searched? | GPT-5.4 searched? | GPT-5.3 citations | GPT-5.4 citations |
|---|---|---|---|---|
| Buy earbuds under $150 for running | Yes | Yes | 2 | 11 |
| Cheapest MacBook Air M4 | Yes | Yes | 5 | 9 |
| Best deals on standing desks | Yes | No | 4 | 15 (from memory) |
| Gift for wife under $100 | Yes | No | 4 | 2 (from memory) |
| Best rated espresso machine under $500 | Yes | Yes | 4 | 6 |
GPT-5.3 searched for all 5 shopping prompts. GPT-5.4 skipped 2 of them.
GPT-5.4 treated “deals” and “gift” prompts as knowledge tasks, not search tasks. It answered from training data. This means time-sensitive shopping queries may not trigger web search on the thinking model.
For ecommerce brands: your deal pages and gift guides may get more visibility on GPT-5.3 than GPT-5.4. But when GPT-5.4 does search for products, it cites your product pages directly (soundcore.com, breville.com) while GPT-5.3 cites review sites (reddit.com, steamritual.com).
GPT-5.3 surfaces older content than the previous default
| Model | % under 30 days old | % under 90 days old |
|---|---|---|
| GPT-5.2 Instant | 33% | 52% |
| GPT-5.3 Instant | 6% | 27% |
| GPT-5.4 Thinking | 18% | 37% |
GPT-5.3 retrieves dramatically less fresh content. Only 6% of its web search results are under 30 days old, compared to 33% on the previous GPT-5.2.
“Just publish more content” isn’t a winning AEO strategy for the new models. Comprehensiveness and quality matter more than recency.
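The freshness shares above come down to one calculation: of the web results that expose a publication date, what fraction is younger than N days? A minimal sketch follows. Skipping undated results is our assumption about how to handle missing metadata, and the dates are hypothetical.

```javascript
const DAY_MS = 24 * 60 * 60 * 1000;

// Fraction of dated results published fewer than maxAgeDays before `now`.
// Entries that do not parse as dates are skipped (an assumption, see above).
function shareYoungerThan(publishedDates, maxAgeDays, now = new Date()) {
  const dated = publishedDates
    .map((d) => new Date(d))
    .filter((d) => !Number.isNaN(d.getTime()));
  if (dated.length === 0) return 0;
  const fresh = dated.filter((d) => (now - d) / DAY_MS < maxAgeDays).length;
  return fresh / dated.length;
}

// Hypothetical publication dates, measured against a fixed reference date.
const referenceDate = new Date("2026-03-08T00:00:00Z");
const sampleDates = ["2026-03-01", "2026-01-15", "2025-06-01", "not a date"];

console.log(shareYoungerThan(sampleDates, 30, referenceDate));
console.log(shareYoungerThan(sampleDates, 90, referenceDate));
```

Pinning the reference date to the day the conversations ran keeps the numbers reproducible when you re-run the analysis later.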
How to extract fan-out queries from any ChatGPT conversation
You can see exactly what queries ChatGPT sends and which domains it cites. Here’s how.
Step 1: Have a ChatGPT conversation
Ask any question that triggers web search. Product comparisons, “best X” queries, anything with a year.
Step 2: Open the console
Mac: Cmd + Option + J
Windows: Ctrl + Shift + J
Step 3: Paste this script
(async () => {
  // Fetch the session token, then the payload for the conversation open in this tab.
  const session = await (await fetch('/api/auth/session', { credentials: 'include' })).json();
  const cid = window.location.pathname.split('/c/')[1];
  const res = await fetch('/backend-api/conversation/' + cid, {
    credentials: 'include',
    headers: { 'Authorization': 'Bearer ' + session.accessToken }
  });
  const data = await res.json();

  const queries = [];
  const domains = [];
  let cited = 0, utmCount = 0;

  for (const node of Object.values(data.mapping || {})) {
    const m = node.message;
    if (!m) continue;

    // Fan-out queries sit in "code" messages, either as JSON or as a search("...") call.
    if (m.content?.content_type === 'code' && m.content?.text) {
      try {
        const p = JSON.parse(m.content.text);
        if (p.search_query) p.search_query.forEach(sq =>
          queries.push({ q: sq.q, domains: sq.domains || [] })
        );
      } catch (err) {
        const match = m.content.text.match(/search\("([^"]+)"\)/);
        if (match) queries.push({ q: match[1], domains: [] });
      }
    }

    // Citations: count each cited item, note UTM tagging, and collect its domain.
    for (const ref of m.metadata?.content_references || []) {
      for (const item of ref.items || []) {
        cited++;
        if (item.url?.includes('utm_source=chatgpt')) utmCount++;
        try {
          domains.push(new URL(item.url).hostname.replace(/^www\./, ''));
        } catch (urlErr) { /* skip items without a valid URL */ }
      }
    }
  }

  console.log('Model:', data.default_model_slug);
  console.log('Fan-out queries:', queries.length);
  queries.forEach((q, i) =>
    console.log(`  ${i + 1}. ${q.q}${q.domains.length ? ' [' + q.domains.join(', ') + ']' : ''}`)
  );
  console.log('Cited sources:', cited);
  console.log('Cited domains:', [...new Set(domains)].join(', '));
  console.log('UTM coverage:', utmCount + '/' + cited);
})();
What to look for
On GPT-5.3: You'll see 1 query (the raw prompt) and 3-8 cited domains, mostly third-party review sites.
On GPT-5.4: You'll see 4-20 queries with domain restrictions in brackets and site: operators. Cited domains will be a mix of brand sites and review platforms.
The GPT-5.2 to GPT-5.3 shift looks incremental on the surface. Same query count. Similar citations. But GPT-5.3 is worse for brands (8% vs 22% first-party), worse for freshness (6% vs 33% under 30 days), and more blog-dependent (32% of citations are blog posts).
The GPT-5.2 to GPT-5.4 shift is structural. Domain-targeted queries. First-party-dominant citations. Pricing page reading. Multi-phase research. Everything about how the model searches changed.
What this means for brands
1. Audit your pricing page first. GPT-5.4 cited 138 pricing pages across 50 prompts. It checks for actual numbers. "Contact sales" pages get skipped.
2. Build third-party coverage for GPT-5.3. The kingmaker sites: Forbes (15 citations), TechRadar (10), Tom's Guide (10), Reddit (7). If these sites don't mention you, GPT-5.3 won't either.
3. Your G2 and Capterra profiles matter for AEO. GPT-5.4 validates brands against these platforms. Weak profiles mean weaker citations.
4. Set up GA4 attribution now. Create a segment for utm_source=chatgpt.com. Coverage is 87-96% across new models.
5. Test both models. GPT-5.3 visibility and GPT-5.4 visibility are different things with 7% overlap. You need both.
6. Google rankings predict GPT-5.3 citations better than Bing. 47% of GPT-5.3's citations come from Google-ranked domains, 27% from Bing. For GPT-5.4, rankings don't matter much: 75% of cited domains aren't on either engine.
What this means for agencies
1. Build model-level reporting. "Your client is cited 40% of the time" is incomplete. Report GPT-5.3 visibility (third-party-mediated) and GPT-5.4 visibility (first-party-driven) separately.
2. Run a two-track GEO/AEO service. Track 1: third-party distribution for GPT-5.3 users. Track 2: first-party content optimization for GPT-5.4 users.
3. Search rankings alone don't predict AI visibility. 44% of GPT-5.3's citations come from domains not on Google or Bing. For GPT-5.4, that number is 75%.
Questions we're still investigating
What determines GPT-5.4's brand list? It pre-selects which brands to search before sending any query. Training data? Market share? We don't know yet.
What's the 44% that appears on neither Google nor Bing? Nearly half of GPT-5.3's citations don't rank on either search engine for the same query. OpenAI has a retrieval mechanism beyond traditional search.
Do multi-turn conversations change the pattern? All our prompts were single-turn. Follow-up questions might shift citation behavior.
Methodology
Scope: 119 conversations on ChatGPT, March 7-8, 2026.
Prompts: 50 unique prompts. All 50 tested on GPT-5.3 Instant and GPT-5.4 Thinking. 10 prompts also on GPT-5.2 Instant and GPT-5.2 Thinking.
SerpAPI: 30 queries through Bing US and Google US. For GPT-5.3, we mapped ChatGPT's cited domains against both engines. For GPT-5.4, we compared cited domains against Bing/Google results for the raw user prompt.
Classification: Citations to brand-related domains classified as "first-party." All others as "third-party." Page types classified by URL path. Freshness measured from publication date metadata.
Limitations: Single user account. ChatGPT is non-deterministic. Repeat runs may vary. The India-based account may have affected some results (amazon.in appearing in citations). GPT-5.2 data is from 10 prompts only.
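The first-party / third-party split described under Classification can be approximated with a simple host match. This is a sketch under assumptions, not the exact pipeline used for the study: the helper names are hypothetical, the brand domain list has to be supplied by you, and subdomains of a brand domain are treated as first-party.

```python
from urllib.parse import urlparse


def normalized_host(url: str) -> str:
    """Lowercase hostname with a leading 'www.' removed."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def classify_citation(url: str, brand_domains: set[str]) -> str:
    """Label a cited URL 'first-party' if it sits on a brand domain."""
    host = normalized_host(url)
    for domain in brand_domains:
        # Match the domain itself or any subdomain (blog.brand.com).
        if host == domain or host.endswith("." + domain):
            return "first-party"
    return "third-party"


def first_party_share(urls: list[str], brand_domains: set[str]) -> float:
    """Fraction of cited URLs that are first-party."""
    if not urls:
        return 0.0
    hits = sum(classify_citation(u, brand_domains) == "first-party" for u in urls)
    return hits / len(urls)
```

A share computed this way is only as good as the brand domain list behind it, so it is worth reviewing the list by hand for each prompt.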
TLDR
GPT-5.4 cites brand websites 7x more than GPT-5.3 (56% vs 8%). It does this by decomposing prompts into 8.5 fan-out queries with domain restrictions and site: operators. The two models cite completely different sources (7% overlap).
For brands: fix your pricing page, build G2/Capterra profiles, and get third-party coverage on Forbes/TechRadar for GPT-5.3 users. For agencies: report visibility per model. They measure different things.
We built the same analysis pipeline we used for this study into Writesonic. Track your citation share, monitor fan-out queries, and see which models cite your brand, all in one dashboard.
Appendix: all 50 prompts
ID | Category | Prompt
A1 | SaaS | What's the best CRM for a 50-person B2B SaaS company?
A2 | SaaS | Compare Shopify vs WooCommerce vs BigCommerce for a DTC brand doing $5M in revenue
A3 | SaaS | Best project management tools for remote engineering teams in 2026
B1 | Ecommerce | Best noise cancelling headphones under $300 for working from home
B2 | Ecommerce | What running shoes do marathon runners recommend in 2026?
B3 | Ecommerce | Best organic skincare brands for sensitive skin
C1 | Services | Best digital marketing agencies for ecommerce brands in the US
C2 | Services | Top accounting software for small businesses with under 20 employees
D1 | Trends | What are the biggest trends in ecommerce for 2026?
D2 | Trends | How is AI changing the recruiting and hiring process?
E1 | Healthcare | Best telehealth platforms for small medical practices in 2026
E2 | Healthcare | What supplements do doctors recommend for sleep in 2026?
E3 | Healthcare | Best EHR software for independent physicians in 2026
F1 | Finance | Best business credit cards for startups with no revenue history
F2 | Finance | Compare QuickBooks vs Xero vs FreshBooks for freelancers
F3 | Finance | Best payroll software for small businesses with under 50 employees in 2026
G1 | Travel | Best travel insurance companies for international trips in 2026
G2 | Travel | Top hotel booking sites with the best price guarantees
G3 | Travel | Best carry-on luggage brands for frequent business travelers
H1 | Education | Best online learning platforms for professional development in 2026
H2 | Education | Compare Coursera vs Udemy vs LinkedIn Learning for tech skills
H3 | Education | Best coding bootcamps for career changers in 2026
I1 | Home | Best smart home security systems under $500 in 2026
I2 | Home | Top robot vacuums for pet owners in 2026
I3 | Home | Best air purifiers for allergies recommended by doctors
J1 | Food | Best meal delivery services for families in 2026
J2 | Food | Top rated coffee subscription services
J3 | Food | Best protein powder brands for muscle building in 2026
K1 | Legal | Best contract management software for small businesses
K2 | Legal | Top legal document automation tools in 2026
L1 | Marketing | Best email marketing platforms for ecommerce brands in 2026
L2 | Marketing | Compare HubSpot vs Salesforce vs Pipedrive for sales teams under 20 people
L3 | Marketing | Best SEO tools for small business websites in 2026
M1 | Productivity | Best AI writing tools for content marketers in 2026
M2 | Productivity | Top password managers for small business teams
M3 | Productivity | Best video conferencing software for remote teams in 2026
N1 | Fitness | Best fitness trackers for marathon training in 2026
N2 | Fitness | Top rated yoga mats for home practice
N3 | Fitness | Best home gym equipment under $1000 in 2026
S1 | Shopping | I want to buy wireless earbuds under $150 for running, what should I get?
S2 | Shopping | Where can I buy the cheapest MacBook Air M4 right now?
S3 | Shopping | Best deals on standing desks this week
S4 | Shopping | I need to buy a gift for my wife under $100, what are good options?
S5 | Shopping | Buy the best rated espresso machine under $500
V1 | Comparison | Notion vs Obsidian vs Roam Research for personal knowledge management
V2 | Comparison | iPhone 17 Pro vs Samsung Galaxy S26 Ultra vs Google Pixel 10 Pro
V3 | Comparison | Tesla Model 3 vs BMW i4 vs Polestar 2 for daily commuting in 2026
T1 | Trends | What are the top cybersecurity threats businesses should prepare for in 2026?
T2 | Trends | How is AI changing the legal industry in 2026?
T3 | Trends | What are the biggest challenges for DTC brands in 2026?
Key Takeaways:
GEO is an extension of work your SEO, content, and PR teams are already doing. The skills exist, but the workflow probably doesn’t.
Your content team already writes for featured snippets, your technical SEO foundation already makes your site parseable and your PR team already manages third-party presence. Point these at AI platforms and you’re 80% there.
What’s missing for a lot of teams is the tooling, a lightweight experiment tracking process, ownership, and permission to deprioritize something else so GEO gets real focus.
Hire when you’re at enterprise scale with thousands of prompts to manage, when GEO is already a major channel in your category, or when your team genuinely has no bandwidth. Otherwise, look to structure.
Most marketing teams follow a similar pattern when a new channel becomes popular. First, they see industry folks talk about it on places like LinkedIn. Maybe they even notice the channel driving a few initial sales.
Then, a discussion happens, and someone almost always asks the question:
“Do we need to hire someone to manage this?”
There’s a similar pattern taking root inside a lot of marketing teams thanks to the rather bombastic arrival of GEO (Generative Engine Optimization) on the scene.
But what if we told you your existing team has almost everything you need to succeed with GEO?
Those who are in charge of SEO, content, and PR already have the skills they need to handle GEO. It’s just a matter of setting up a proper workflow, a sturdy tech stack and a measurement process.
So, before you go ahead and post a job description for a GEO specialist, ask yourself this:
“Do we lack the capability to do GEO? Or do we just lack structure?”
We’re willing to bet it’ll be the latter.
GEO Is Additive SEO, Not a Reinvention of It
There’s a popular narrative (often spread by self-proclaimed AI search gurus) that generative engine optimization is a completely new discipline that requires a specialized team. It’s not.
GEO is an extension of SEO. Your team is already improving content structure, strengthening authority signals, updating content regularly. Those same practices drive AI citations. The content just gets surfaced in AI answers instead of search results.
A few things are different, though. AI platforms have a different way of citing information; they rely more on structured data and third-party sources, and they require content to be highly specific.
But the instances where these differences justify a new role are few and far between.
What You Already Have
If you care to look closely, most of the GEO skill set already lives inside your team.
Creating Effective Content
Your content team or specialist already knows how to make information clear, factual, and structured.
They’re writing authoritative and scannable content, optimizing for featured snippets, and using optimized headings and structure.
These are the same qualities large language models (LLMs) look for when deciding which pages to surface. They reward clear, factual answers and rely heavily on authority signals and content structure. (If you’ve been paying attention to featured snippet optimization over the past few years, congratulations—you’ve been doing proto-GEO without knowing it.)
So if your team knows how to write a good “What is X” section for an SEO article, they already know how to write citable definitions. The adjustment is minor: a bit more specificity, a bit more concision, a bit less fluff. Less “our industry-leading solution transforms workflows” and more “this tool does X, Y, and Z.”
Technical SEO
Your technical foundation is pulling double duty as you read this.
Site architecture, internal linking and schema markup (aka the essential elements of technical SEO) influence how LLMs process and attribute information. They use schema to extract structured data and they rely on logical site hierarchy and clean HTML to discover and parse content.
The good news is that most of this work compounds. If you’ve invested in technical SEO over the past few years, you’re not starting from zero with GEO. You’re starting from a foundation that AI platforms already know how to read. The teams that neglected technical SEO in favor of content volume are the ones going in circles now.
Competitive Intelligence and Visibility Tracking
You know that thing where you’re staring at a SERP, trying to figure out why that one competitor keeps outranking you despite having content that’s objectively worse? The “how is this page ranking” spiral that every SEO person has fallen into at least once?
GEO has its own version of that. Except instead of rankings, you’re looking at citations. Instead of “why do they rank,” you’re asking “why did ChatGPT cite them and not us.” The existential frustration is the same, it’s just the surface that’s different.
Your team is already tracking keyword rankings, analyzing SERPs, identifying content gaps. GEO requires all of that, but pointed at AI platforms instead of search engines. Which prompts are you showing up for? Which ones are you invisible on? Why does your competitor keep getting cited in answers about your category when your product page is objectively more helpful?
The analytical muscle is identical. The tooling is still catching up, honestly. We’ve had keyword tracking infrastructure for decades and AI visibility tracking is maybe two years old. But if your in-house SEO ever obsessed over why a competitor ranks, they already know how to obsess over why they get cited.
Keyword Research
Speaking of your SEO specialist, they have a process for keyword research. They know how to pull data, prioritize by intent, group related queries, build a tracking system. That entire workflow transfers to GEO with one adjustment: the inputs.
In SEO, keyword research usually starts with tools. In GEO, prompt research often starts with your support tickets, sales calls, customer interviews. The questions people ask AI platforms aren’t always the same ones they type into Google. They’re often longer, more conversational, more specific. “What’s the best CRM for a 20-person sales team that already uses HubSpot for marketing” isn’t a keyword anyone’s bidding on, but it’s absolutely a prompt someone’s typing into ChatGPT.
The strategic logic is similar: prioritize commercial intent, group related prompts, track visibility. Your team already knows how to do this. They just need access to different source material.
Third-Party Visibility and Brand Management
This is the one that catches people off guard.
Brand management has always mattered in SEO. Backlinks, mentions, reviews…none of this is new per se. But in SEO, you had a fallback. Even if your third-party presence was weak, you could still rank. You control your site, optimize your pages, fix your technical issues, and sometimes manage to brute-force your way up the SERP through sheer on-page effort.
GEO doesn’t give you that fallback.
AI platforms pull from everywhere, and we do mean everywhere. Your site, yes, but also review sites, Reddit threads, comparison articles, news coverage, Quora answers, analyst reports. They’re triangulating across sources to build an answer. And you can’t control most of those sources. Your owned channels can be perfect—structured data in place, content optimized, everything technically sound—and you’ll still get passed over if the third-party ecosystem doesn’t back you up.
It’s not that brand management suddenly became important, it’s that you can no longer compensate for weak third-party presence by being really good at the stuff you control.
Your PR team already knows how to work this ecosystem. They’re tracking mentions, building journalist relationships, monitoring review sites. The work is the same, but the margin for error just got smaller.
What’s Missing
So your team has the skills. What they probably don’t have is the infrastructure.
Let’s break it down.
The Tech Stack
Your SEO team lives in Google Search Console and GA4, tools that have been refined over the course of fifteen-plus years. They know precisely where to look to understand what’s working and what isn’t.
GEO is maybe two years old, tops. The tooling is still being built, and frankly, a lot of what’s out there is half-baked dashboards slapped together to ride the hype cycle.
Yes, you need visibility tracking across AI platforms, i.e., where you’re showing up, for which prompts, how often, and who’s being surfaced instead of you. Citation patterns, competitive benchmarking. Everyone building in this space offers some version of that.
What most tools don’t give you is the “now what.” A dashboard tells you you’re invisible, but it doesn’t tell you why, what to fix first and how. You end up with a PDF full of data and no actionable next step.
This is what we built Writesonic to solve. Visibility tracking, yes, but also an Action Center that tells you what’s blocking citations and prioritizes fixes by impact. Dashboard plus execution layer. We’re biased, but do your own comparison.
An Effective Workflow
GEO is new enough that best practices are still being written. Which means your team is going to be experimenting. A lot. And experiments without documentation are more vibes than science. You need:
A way to track what you’re testing, what happened, and what you learned. A spreadsheet works fine, as long as there’s something more substantial than “I think we tried that and it didn’t work.” That isn’t institutional knowledge as much as a hunch that leaves the company when someone quits.
Ownership. Who’s responsible for prompt selection? Who handles content optimization when a gap is identified? Who’s monitoring visibility and flagging changes? GEO touches SEO, content, PR, sometimes product. If everyone assumes someone else is handling it, no one is.
None of this is complicated. You probably have similar processes for other channels already. The work is applying them to something new before the lack of structure turns into six months of unrepeatable, unscalable effort.
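If a spreadsheet is the tracking layer, the columns matter more than the tool. Here is one possible shape for that log, written out as CSV. It is a sketch: the field names and the sample row are hypothetical, so adjust them to whatever your team actually tests.

```python
import csv
import io
from dataclasses import dataclass, asdict, fields


@dataclass
class GeoExperiment:
    started: str      # ISO date the test began
    owner: str        # who is responsible for follow-up
    prompt: str       # the AI prompt being tracked
    change_made: str  # what you tested
    result: str       # what happened
    lesson: str       # what you learned


def write_log(experiments: list[GeoExperiment], handle) -> None:
    """Write experiments as CSV rows with one header line."""
    writer = csv.DictWriter(handle, fieldnames=[f.name for f in fields(GeoExperiment)])
    writer.writeheader()
    for experiment in experiments:
        writer.writerow(asdict(experiment))


# Hypothetical entry, written to an in-memory buffer instead of a file.
buffer = io.StringIO()
write_log(
    [GeoExperiment("2026-01-15", "content lead", "best CRM for small teams",
                   "added pricing table", "cited in 2 of 5 runs", "specific numbers help")],
    buffer,
)
log_csv = buffer.getvalue()
```

The owner column is there on purpose: it ties each experiment to the ownership question above, so a test never sits in the log with nobody responsible for it.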
Time and Permission
Here we hit the uncomfortable point.
Your team can probably handle GEO. The question is whether they have the bandwidth to do it well, or whether it becomes another thing they squeeze in between everything else and half-heartedly do for six months before someone asks why it’s not working.
Marketing teams are already stretched as is. Adding GEO to the pile without removing something else means it’ll get the scraps—an hour here, a task there, no sustained focus. And GEO in its current state rewards sustained focus, teams who’ve carved out dedicated time to experiment, track, and iterate.
Permission to reprioritize is the actual blocker. Not capability. So you need to sit down and decide what you’re willing to deprioritize that isn’t driving business value.
When Hiring Actually Makes Sense
There are a few edge cases where hiring for a GEO-focused role might make sense:
You’re Enterprise-Scale
Enterprise-scale businesses have multiple product lines, overlapping audiences, complex product documentation, and thousands of relevant AI prompts that need to be managed.
At this scale, doing GEO properly is a volume problem due to the sheer number of prompts and content pieces that need to be managed.
Even if you build a strong workflow, managing GEO at this scale will likely be too much for your existing SEO or content team.
In this case, hiring a dedicated person to own GEO can help you maintain consistency, prioritize work, and ensure you’re not missing out on any opportunities for improved AI visibility.
Your Team is Already at Capacity
Some teams don’t have the bandwidth to dedicate time to GEO without impacting existing channels that do drive value.
If your team is already at full capacity and there’s nothing you’re willing to deprioritize, then you’ll struggle to gain traction with GEO.
In this case, you’ll need to make a dedicated hire simply to increase your team’s capacity and be able to execute an effective GEO strategy consistently.
You Have the Budget to Experiment Aggressively
Some companies are in the fortunate position of having money to throw at emerging channels. If that’s you, the advantage isn’t necessarily a dedicated GEO hire, but more so speed.
Budget means you can run more experiments simultaneously and invest in better tooling earlier. It means your existing team can spend more hours on GEO without sacrificing other priorities, whether that’s through backfilling their current workload or bringing in freelance support for the grunt work.
That said, money doesn’t guarantee you’ll figure it out first. Plenty of scrappy teams with tighter budgets are running smarter experiments and learning faster than enterprise teams drowning in process.
Budget is an accelerant but it won’t replace good thinking.
You Probably Don’t Need a GEO Team
The instinct to hire when something new shows up is understandable. It feels like action, like taking the channel seriously.
But GEO isn’t a completely new capability. Your team already has a lot of the necessary skills, poised to be honed into something that works on this surface.
What’s actually missing is the infrastructure. The visibility layer that shows you where you’re showing up and where you’re not, the prioritization that tells your team what to fix first. The connective tissue that turns “we’re not getting cited” into specific tasks for specific people.
Your team can do this. They just need the setup to make it happen.
If you want help figuring out what that looks like, we’ve built a lot of this into Writesonic, i.e., visibility tracking, gap analysis, prioritized actions for content, SEO, and PR. We’re happy to show you around. Get in touch if you’d like to learn more.
Key Takeaways
UGC platforms own AI citations. Reddit, Wikipedia, YouTube, LinkedIn, Medium—seven of the top 10 most-cited domains are platforms where users create content, not publishers. You can’t build the next Reddit, but you can optimize your presence on it.
Two-thirds of domains only appear on one platform. 67.4% of the 2.4M domains we tracked got cited by exactly one AI platform. Just 6.5% achieved universal presence (5-8 platforms).
URL diversification is structural, not strategic. Reddit has 678,255 unique URLs in our dataset. Wikipedia has 111,823. That diversity comes from millions of users creating content daily. You’re not publishing 678,000 pages, but you can place high-value content where LLMs are already looking.
Query frames have monopoly winners. Reddit owns alternatives queries, Wikipedia owns “what is,” YouTube and Stack Overflow split how-to by technical depth. G2 and Capterra dominate B2B comparisons. The platforms people trust for specific question types are the platforms LLMs trust too.
Third-party presence isn’t optional. Owned content establishes your POV, but third-party presence is where citation volume lives. Treating it as a nice-to-have means you’re ignoring where the game is being played.
I wanted even more specificity: which domains dominate these citations? Who’s publishing the listicles that LLMs so love to surface? Who owns the reviews that these platforms trust?
I ranked 2.4 million domains by how often they get cited across eight AI platforms. Here’s what the top of the list looks like:
Reddit
Wikipedia
YouTube
LinkedIn
Medium
You can quickly spot the commonalities. These are user-generated content platforms, aggregators. Community spaces where millions of people create millions of pages.
That pattern tells us something important about how LLMs source information—and where the leverage points are for AI visibility.
How I categorized platform strategies
I classified the 2.4 million domains based on how many platforms cited them during the study period (May 2025 to October 2025):
Universal domains (5-8 platforms): Domains cited by at least five of the eight platforms we tracked. These are the generalists showing up regardless of which AI tool someone uses.
Multi-platform domains (2-4 platforms): Domains appearing on two to four platforms. They have cross-platform presence but aren’t ubiquitous.
Single-platform domains (1 platform): Domains cited by a single platform during the study period.
The distribution:
Universal (5-8 platforms): 155,734 domains (6.5%)
Multi-platform (2-4): 622,323 domains (26.1%)
Single-platform (1): 1,606,864 domains (67.4%)
Two-thirds of all cited domains appear on exactly one platform. Just 6.5% achieve universal presence.
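The tier rule and the percentages above can be reproduced in a few lines. The domain counts are the ones reported in this section; the function name and tier labels are just illustrative.

```python
def platform_tier(platform_count: int) -> str:
    """Map the number of citing platforms (1-8) to a tier."""
    if platform_count >= 5:
        return "universal"
    if platform_count >= 2:
        return "multi-platform"
    if platform_count == 1:
        return "single-platform"
    raise ValueError("a cited domain appears on at least one platform")


# Domain counts per tier, as reported above.
tier_counts = {
    "universal": 155_734,
    "multi-platform": 622_323,
    "single-platform": 1_606_864,
}
total = sum(tier_counts.values())
shares = {tier: round(100 * n / total, 1) for tier, n in tier_counts.items()}
```

The three counts sum to 2,384,921 domains, and the rounded shares come out to 6.5%, 26.1% and 67.4%.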
Finding #1: Universal domains are UGC aggregators (and you can’t compete with that directly)
Here’s the top 10 list of universal domains by total citations:
reddit.com – 7,328,267 citations across 7 platforms
wikipedia.org – 4,289,547 citations across 8 platforms
youtube.com – 2,661,056 citations across 7 platforms
google.com – 1,652,610 citations across 8 platforms
linkedin.com – 1,424,134 citations across 8 platforms
g2.com – 1,219,726 citations across 8 platforms
medium.com – 1,157,881 citations across 8 platforms
forbes.com – 1,155,981 citations across 7 platforms
nih.gov – 974,124 citations across 8 platforms
zapier.com – 956,337 citations across 8 platforms
There’s no sidestepping the pattern. Seven of the top 10 are platforms where users create content, not publishers creating their own.
Reddit aggregates community discussions, Wikipedia aggregates crowd-sourced knowledge, YouTube aggregates user videos, G2 aggregates reviews and so on.
Even the exceptions lean on aggregation. Forbes has contributor networks while Zapier publishes integration guides and user-submitted workflows. The NIH hosts research papers from several authors.
The domains achieving universal AI presence are structured to aggregate millions of contributions from millions of users across millions of topics.
You can’t build the next Reddit. Neither can I. That ship sailed 15 years ago (and required venture funding and a tolerance for chaos that most businesses don’t have).
But—and this is the important part—you can optimize your presence on Reddit. And Wikipedia. And LinkedIn.
Finding #2: The citation gap is massive (and it tells us what LLMs trust)
Universal domains get cited 26 times more than multi-platform domains and 182 times more than single-platform domains.
This is yet another data point showing us that LLMs heavily favor user-generated content and community wisdom when answering queries, especially decision-oriented ones. These sites are structured to provide the exact format of information LLMs trust: community-vetted, multi-perspective, experiential content.
This aligns with what we already knew from the school of SEO: third-party signals are important. In the AI era, “off-page” just takes on renewed importance. You need to have a consistent, ironclad presence on the third-party platforms AI systems already perceive as aggregators of truth.
Finding #3: URL diversification is a structural outcome of UGC
One of the clearest patterns separating universal domains from everyone else is that they have tens of thousands—sometimes hundreds of thousands—of unique URLs getting cited.
Reddit: 678,255 unique URLs cited in our dataset
Wikipedia: 111,823 unique URLs
YouTube: 366,197 unique URLs
LinkedIn: 205,055 unique URLs
Compare that to domains with over-concentrated citations (where 70%+ of citations go to a single URL):
Average unique URLs: 1
Average total citations: 1
Platform presence: 1 platform
Not surprising.
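If you want to run the same check on your own domain, the concentration test is straightforward. This is a minimal sketch that assumes you have a flat list of cited URLs for one domain. The 70% threshold comes from the definition above; the function names are hypothetical, and any extra filtering the study applied is not reproduced here.

```python
from collections import Counter


def top_url_share(cited_urls: list[str]) -> float:
    """Share of a domain's citations that go to its single most-cited URL."""
    if not cited_urls:
        return 0.0
    counts = Counter(cited_urls)
    top_count = counts.most_common(1)[0][1]
    return top_count / len(cited_urls)


def is_over_concentrated(cited_urls: list[str], threshold: float = 0.70) -> bool:
    """True when one URL takes at least `threshold` of all citations."""
    return top_url_share(cited_urls) >= threshold
```

A domain with a single cited URL scores 1.0 by definition, which is why the over-concentrated group averages one URL and one citation.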
Reddit has 678,255 URLs because it has millions of users creating posts and comments every day across tens of thousands of subreddits. That diversity emerges from the inherent structure of the platform.
Wikipedia has 111,823 URLs because it documents everything and relies on global contributors. YouTube has 366,197 because millions of creators upload videos.
These platforms win on diversification because they’re designed to aggregate. Every new user, post and video is a new potential citation target.
You’re not going to publish 678,000 pages (and if you tried, most of them would be low-quality filler). But you can create strategic content on these platforms:
A well-optimized Reddit comment thread in a relevant subreddit
A detailed Wikipedia page
A YouTube video addressing common questions in your space
A LinkedIn article establishing thought leadership
Your focus isn’t to match UGC platforms on volume, but rather to place high-value content where LLMs are already looking.
Finding #4: Query frames have distinct winners (and some are more monopolized than others)
Along with platform presence and URL diversity, there’s another dimension worth examining: which domains take over specific query types.
We tracked seven primary query frames based on how people ask questions:
Alternatives: “[Brand/product] alternatives”
Comparison: “[X] vs [Y]”
How-to: “How to [action]”
List/Best: “Best [category]”
Pricing: “How much does [X] cost”
Troubleshooting: “[Problem] not working”
What is: “What is [term]”
For each frame, we looked at which domains get cited most often and whether those citations cluster around specific players or distribute more evenly.
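A rough way to bucket your own prompts into these seven frames is simple pattern matching. The patterns below are assumptions derived from the frame templates listed above, not the classifier used for the study, and real prompts will need a fallback bucket.

```python
import re

# Checked in order; the first matching frame wins.
FRAME_PATTERNS = [
    ("alternatives", re.compile(r"\balternatives?\b")),
    ("comparison", re.compile(r"\bvs\b|\bversus\b")),
    ("how-to", re.compile(r"^how to\b")),
    ("pricing", re.compile(r"^how much (does|do)\b.*\bcost\b")),
    ("troubleshooting", re.compile(r"\bnot working\b")),
    ("what-is", re.compile(r"^what is\b")),
    ("list-best", re.compile(r"^best\b")),
]


def classify_frame(prompt: str) -> str:
    """Return the first matching query frame, or 'other'."""
    text = prompt.strip().lower()
    for frame, pattern in FRAME_PATTERNS:
        if pattern.search(text):
            return frame
    return "other"
```

Order matters here: a prompt like "best Slack alternatives" lands in the alternatives frame because that pattern is checked first. Whether that is the right call for your category is a judgment you will want to make yourself.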
The pattern is stark. Some query frames are monopolized, while others are wide open:
Alternatives queries: Reddit
“What is” queries: Wikipedia
How-to queries: YouTube + Stack Overflow (split by technical depth)
Comparison queries: G2 and Capterra (B2B) / CNET and TechRadar (consumer)
Pricing queries: Less concentrated – G2/Capterra lead B2B but distribution is wider
List/Best queries: More distributed, still UGC-heavy
Reddit’s dominance in alternatives queries is particularly interesting, though not surprising. People asking for alternatives want real user experiences and Reddit delivers—as far as LLMs are concerned, better even than G2.
This pattern repeats across frames:
“What is” queries want encyclopedic definitions → Wikipedia wins
It’s not news that you need to care about more than just your owned channels to succeed in AI search. But the sheer magnitude probably is.
Universal domains get cited 182 times more than single-platform domains. And those universal domains are almost exclusively UGC aggregators: Reddit, Wikipedia, YouTube, LinkedIn, Medium.
This isn’t a sign to abandon owned content. It’s telling you that third-party presence is where the bulk of citation volume lives and treating it as a nice-to-have instead of a strategic imperative means you’re ignoring where a lot of the game is being played.
You need both a really good foundation of owned content and really good third-party hygiene.
Your owned content establishes what you do and how you do it from your POV. Product pages, documentation, blog posts and case studies are the structure. But LLMs don’t just pull from your site when someone asks about your category. They pull from Reddit threads comparing tools in your space, G2 ratings, third-party listicles and on and on it goes.
Even when those third-party mentions don’t get directly cited, models are pulling information from them to form their understanding of your brand. When they recommend you in certain contexts or position you against competitors, they’re drawing on everything that’s out there about you, not just what you publish.
You can’t control every mention, but you can influence the narrative through judicious presence on UGC platforms, engagement with review sites, partnerships with publishers and monitoring what’s being said in spaces where your audience is active.
That’s where the citations are. That’s where the broader information ecosystem that forms brand understanding lives.
Methodology
Data collection period: May 2025 to October 2025
Platforms tracked: ChatGPT Plus, ChatGPT Pro, Claude 4 with search, Perplexity (free tier), Gemini, Gemini with search, Google AI Mode, Grok, and Copilot.
Citation logging: We tracked every domain cited in AI responses, including URL-level granularity, query metadata, and timestamp data. Any domain appearing at least once during the study period was included in the dataset (n=2,384,921).
Due to dataset size, we analyzed summary statistics across all 2.4M domains and pulled detailed examples from the top 1,000 performers in each category. For consistency and URL diversity analyses, we filtered to domains with at least 100 citations to focus on meaningful patterns rather than outliers. For frame analysis, we examined the top 50 domains per frame type.
Key Takeaways:
LLMs pull 60% of their citations from outside the top 10 SERP results. That means your traditional SEO playbook only covers part of the story now.
User-generated platforms dominate AI citations. Reddit had 7.3 million citations in our analysis. Wikipedia had 4.3 million. Seven of the top 10 most-cited domains are UGC platforms.
Third-party placement is relationship work, not commodity content. You need to know who maintains the content, craft personalized outreach that adds value first, and build trust over time. This higher barrier to entry means better margins.
The work breaks into two phases: audit your client’s current visibility across citation sources, then execute strategic outreach to listicles, review sites, Reddit threads, and syndication partners.
Manual monitoring doesn’t scale. Checking hundreds of listicles monthly is manageable. Tracking Reddit, Quora, and Medium discussions daily for 10+ clients isn’t. Writesonic’s Action Center automates discovery and provides prioritized outreach lists so you can focus on the high-touch relationship work.
Our concept of what a website is hasn’t fundamentally changed from the first, static Dreamweaver-built monstrosities of yesteryear.
It’s a business card. You want yours to be classy, like a pale nimbus background with raised lettering. Maybe a Silian Rail font.
And every CMO feels understandably protective of that business card. Content agencies have built an entire industry on supporting this reflex. For the last couple of decades, the core toolkit has barely changed: you go in and clean up the technical SEO, deliver a content strategy, fire up the blog machine.
There have certainly been upheavals along the way. The gated gardens of social media corralled many eyeballs into just a few isolated pastures within the internet’s great expanse. You could no longer completely control the message, but it was close enough you could still say you “owned” your marketing assets and keep a straight face.
Your website, your blog, and your social channels. That is the holy trinity of traditional agency optimization.
For a long time, the rules were stable. Google rewarded the signals you could actually shape. Things like your site quality and your content depth and your link authority. So, agencies built entire workflows around those signals. You knew what mattered, and you knew how to move the needle.
But LLMs … don’t really care about any of that. They scour the rest of the internet. And that’s scary because neither you, nor your clients, control all those other parcels of land.
LLMs might key in on some Reddit thread from 2019, or a years-old Capterra review, or a Substack article that’s part product comparison and part anecdote.
These are the things that end up feeding the model’s understanding of your client. And if these third-party sources talk about your client loudly, or incorrectly, or (worse!) not at all, that’s what the AI parrots. It’s scraping consensus signals, and those signals no longer live on the landing page.
All the stuff you’ve been doing still works, of course. This is not an “SEO is dead” moment (just like the last time wasn’t. Nor the time before that). After all, 40% of AI citations are pulled from the top 10 SERP results.
But, it does mean 60% of citations aren’t covered in your traditional playbook. AI is like an expansion pack to the game content marketers are used to playing.
If you’ve been watching how LLMs answer questions, you’ve probably noticed they love certain corners of the internet. They keep circling back to places like Reddit, Quora, Medium, Hacker News.
In fact, nearly 22% of AI citations come from user-generated content like this. Our most recent analysis of 2.4 million domains across eight AI platforms found that seven of the top 10 most-cited domains are UGC platforms. Reddit alone had 7.3 million citations. Wikipedia had 4.3 million. YouTube, LinkedIn and Medium all follow the same pattern.
LLMs love these places not because they are polished or authoritative, as you would expect of traditional E-E-A-T content, but because they’re crowded with people talking in detail about problems and the solutions to those problems.
Sometimes, those conversations are inaccurate or outdated. But the model has no reason to correct them when it feeds you those conversations as answers. That, as we’ll explore below, is your job.
Once you understand where the model likes to pull from, though, you can put your marketing strategist hat back on, open up the playbook, and start updating the oldies but goodies based on this new generative engine optimization (GEO) battleground.
It requires relationship capital
This is where agencies earn their keep, because AI doesn’t do diplomacy (yet).
Publishers aren’t always going to want to hear your pitch. If you want them to update a listicle, it needs to be because it’s helpful first and foremost for them and their readers.
It’s the difference between:
“Hey, I noticed this line is outdated. Here’s the correct information, with a source to verify it. Hope this was helpful!”
and
“Here’s why my tool, EIE.io, is better for enterprise ag producers than the Old Mac app.”
This is time consuming, and it doesn’t scale. Fortunately, just gaining a few extra brand mentions can drive 1,000 new citations from LLMs, so we’re not talking about days and days of work here. You can afford to take your time and personalize your messages.
User-generated content platforms work differently, but the principle is the same: be authentic or bounce.
Hacker News readers have a sixth sense for brand plants. Reddit mods will gleefully vaporize anything that smells like a PR initiative.
Your presence in these communities has to be slow, steady, and, above all, helpful. That means adding value by answering questions and hardly, if ever, mentioning your client unless it is really, truly the best answer.
This is high-touch work that requires deep expertise. It isn’t commodity content work clients can get from Fiverr. Fortunately, that higher barrier to entry equates to better margins for your agency.
There are always going to be new listicles popping up in mentions while others fall out of date. Subreddits are always rehashing arguments, and those new threads get picked up by the LLMs, so you need to make sure your voice is in the mix there, too.
You can’t fix things once and expect your client’s narrative to stay put. Maintaining their reputation is a constant battle, which means it fits best within a retainer model.
In addition to monitoring your client monthly, there will be intermittent bursts of outreach work when something important changes. That makes it a stable, profitable service line.
It’s a hedge against SEO commoditization
Before SEO became a profession, nobody was thinking in terms of keywords or topic clusters. People wrote recipes, built fan pages, and posted angsty LiveJournal entries, but it all just sat out in the ether in a disorganized mess.
Half the magic of the old internet was how chaotic it all was. You’d search for something and stumble onto a blog and it felt like you’d discovered a secret world in the back of your wardrobe.
The downside, of course, was that finding anything specific was difficult. That’s what Google fixed. Backlinks as a proxy for authority was a brilliant idea that made the internet far more usable. But, it also kicked off the longest-running cat-and-mouse game in marketing: Google trying to surface genuine expertise, and everyone else trying to look like genuine experts.
Over time, the “right” way to write so you appeared atop the SERPs became fairly codified. You got your hub-and-spokes and ultimate guides. FAQs and key takeaways appeared in every article because they became part of a checklist.
Along the way, a lot of content ended up sounding exactly like its neighbors. Then AI showed up, trained on all that sameness, and turned the dial to max. If SEOs were already drifting toward a shared voice, AI took that voice and blended it into an even smoother puree, then made it cheap enough to crank that SEO smoothie out for pennies a pound.
So, now you’ve got two forces flattening content at once. Writers are adapting to Google’s preferences, and AI is learning from those writers. This templated stuff is now a commodity that can only compete on price. And competing on price is a race with only one destination: the bottom.
Third-party placement, on the other hand, is stubbornly human work.
You have to understand which sites matter in your client’s space, because they’re not always the ones with the highest domain rating. Then, you have to figure out who actually maintains that content and write outreach that captures their attention. That requires a level of category fluency that lets you position a client as the right answer for the page without overselling it.
This is how the best agencies will start to move up the food chain from mass production to strategic visibility work.
The operational framework
None of the work that goes into third-party visibility is mystical or hand-wavy. What surprises most people when they first dig in is how familiar the tasks actually end up feeling.
The terrain outside your familiar website SEO work may be a bit different. You’re not auditing title tags or rewriting H2s, but the instincts you honed doing that work will serve you well here.
Phase 1: The third-party visibility audit
In phase 1 for each of your clients, you will map:
Citation source categories
To understand your client’s citation footprint, you need to know which types of sources tend to shape the LLM’s understanding of their space.
This usually falls into recognizable buckets. There will be review aggregators like G2 and Capterra, listicles that rank the “best X tool for Y users”, trade pubs, Reddit communities, Wikipedia pages, and experts writing on Medium and Substack.
Different industries will have different gravitational centers. Some will have unusually active Reddit communities, while others have a well-respected industry newsletter.
You want to find where the client is mentioned, and then figure out whether that source talks about your client accurately, whether it’s a recent citation, and whether the mention is positive or negative. Also look for citations where your competitors are mentioned, but your client isn’t.
Now, research those sources a little more deeply. Figure out what their domain authority is and how often they are cited by LLMs.
This snapshot becomes the raw data set from which you’ll work going forward. It’s the baseline from which you’ll show your client how you’ve improved their visibility.
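One way to keep that snapshot consistent from client to client is to record every source as a structured row. Here is a minimal sketch in Python. The field names (`mentions_client`, `llm_citations`, and so on) and the sample rows are hypothetical, chosen to mirror the questions above. They are not a Writesonic schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SourceRecord:
    """One third-party source in a client's visibility audit (hypothetical schema)."""
    url: str
    category: str                 # e.g. "review aggregator", "listicle", "reddit"
    mentions_client: bool
    mentions_competitor: bool
    accurate: Optional[bool]      # None when the client is not mentioned
    sentiment: Optional[str]      # "positive", "negative", "neutral", or None
    last_updated: Optional[date]  # None when the page carries no date
    domain_authority: int         # from whichever SEO tool you already use
    llm_citations: int            # how often LLMs cited this source in your sample


def competitor_only_gaps(records):
    """Sources that mention a competitor but not the client."""
    return [r for r in records if r.mentions_competitor and not r.mentions_client]


def needs_correction(records):
    """Sources that mention the client but get something wrong."""
    return [r for r in records if r.mentions_client and r.accurate is False]


# Illustrative rows only; the URLs and numbers are made up.
audit = [
    SourceRecord("https://example.com/best-tools", "listicle", False, True,
                 None, None, date(2025, 3, 1), 72, 40),
    SourceRecord("https://example.org/review", "review aggregator", True, True,
                 False, "positive", date(2023, 6, 15), 85, 120),
    SourceRecord("https://example.net/thread", "reddit", True, False,
                 True, "neutral", None, 91, 15),
]

print([r.url for r in competitor_only_gaps(audit)])
print([r.url for r in needs_correction(audit)])
```

Keeping the baseline in this shape means the same two filters can be re-run later to show the client what has changed.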
Gap prioritization
Once you’ve gathered information about the current state of play, your next step is to sort out what deserves your attention first. Not all gaps are created equal.
This step is like prioritizing keywords. Only, instead of search volume, difficulty, and intent you’ll use authority and influence to sort your priorities.
Every category is different, but in general you can sort into the following tiers:
High-priority pages like listicles in your client’s exact category, review sites where buying decisions happen, and Wikipedia pages for their category.
Medium-priority pages like adjacent category listicles that may have a broader reach but are less directly related. You can also throw mentions from industry publications and strategic Reddit communities into this bucket.
Lower priority pages like generic business listicles, low domain-authority directories, and thin content aggregators.
At the end of this phase, you’ll have a comprehensive third-party visibility audit alongside a prioritized opportunity list to deliver to your client.
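If it helps to see the tiering as a sorting rule, here is a rough sketch in Python. The tier mapping, the dictionary keys, and the example gaps are all assumptions made for illustration. Every category is different, so treat the mapping as a starting point to edit.

```python
# Hypothetical mapping from source type to the three tiers described above.
TIER_BY_SOURCE_TYPE = {
    "exact-category listicle": 1,
    "review site": 1,
    "wikipedia category page": 1,
    "adjacent-category listicle": 2,
    "industry publication": 2,
    "reddit community": 2,
    "generic business listicle": 3,
    "low-authority directory": 3,
    "thin aggregator": 3,
}


def prioritize(gaps):
    """Order gaps by tier, then by influence (LLM citations), then by authority."""
    def sort_key(gap):
        tier = TIER_BY_SOURCE_TYPE.get(gap["source_type"], 3)  # unknown types go last
        return (tier, -gap["llm_citations"], -gap["domain_authority"])
    return sorted(gaps, key=sort_key)


# Made-up example gaps.
gaps = [
    {"url": "https://example.com/directory", "source_type": "low-authority directory",
     "llm_citations": 300, "domain_authority": 20},
    {"url": "https://example.com/trade-pub", "source_type": "industry publication",
     "llm_citations": 50, "domain_authority": 70},
    {"url": "https://example.com/best-x", "source_type": "exact-category listicle",
     "llm_citations": 10, "domain_authority": 60},
    {"url": "https://example.com/reviews", "source_type": "review site",
     "llm_citations": 90, "domain_authority": 88},
]

for gap in prioritize(gaps):
    print(gap["url"])
```

Sorting on the tier first is a deliberate choice: a heavily cited but generic directory still lands below an exact-category listicle, which matches the tiers above.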
Phase 2: Strategic outreach
With a visibility audit and an opportunity list in hand, it’s time to execute.
This is the part that doesn’t scale neatly. That’s OK. It’s why your clients are willing to pay top dollar for your expertise, skills, and influence in the industry.
For listicle and review site inclusion
There are likely to be dozens and dozens of listicles and review sites that you could chase on behalf of your clients. We encourage you not to get sidetracked pitching every client to every list.
Instead, work to build relationships with high-authority sources within your agency’s verticals. As you prove yourself again and again to be a useful contact, you’ll build trust as a resource they can turn to when it comes time to update their content.
Here’s what the process looks like:
Identify the decision-maker. This is likely a publication editor, a list curator, or the site’s owner. If there isn’t a byline, look for a contact in the masthead or in an “about” section. You can also try to find someone on LinkedIn associated with the site who has “editor” or something similar in their title. As a last resort, a quick WHOIS lookup may reveal the domain’s registrant, though many registrations are privacy-protected.
Research update frequency. If there isn’t a date attached to the blog, you can sometimes get clues from the screenshots they use within the blog, the features they highlight about each tool, or follow their outbound links and see how old those pages are.
Understand inclusion criteria. Look at who’s already on the list and how they’re described. Patterns in the entries usually reveal what the curator values, whether it’s pricing transparency, UI/UX, the availability of a free tier, or integrations. Whatever shows up again and again through the list is probably what they’re optimizing for.
Craft a non-spammy pitch. The key here is to be specific and to add value. Point to the exact line or section you’re hoping to update. Then, explain what’s changed and why this information helps their readers. Give them a clean, ready-to-paste version along with a verified source for the information.
Provide easy-to-use assets. Along with the copy and a source link, you might include screenshots, comparison data, and any other information that will help inform their readers.
Follow up strategically. Give your contact time to respond and make the changes. They’re busy, too, after all. After a week or two, it’s OK to give them a polite nudge in the form of an email that references your original note. If they don’t respond after this nudge, though, let it go. You don’t want to get a reputation as pushy or spammy.
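The follow-up rule in the last step is simple enough to encode in whatever tracker you use. A minimal sketch in Python, assuming a 10-day wait (my midpoint for "a week or two") and exactly one nudge. The function name and status strings are hypothetical.

```python
from datetime import date, timedelta

NUDGE_AFTER = timedelta(days=10)  # assumed midpoint of "a week or two"
MAX_NUDGES = 1                    # one polite follow-up, then let it go


def next_action(pitched_on, nudges_sent, replied, today):
    """Return what to do with one outreach contact today."""
    if replied:
        return "continue conversation"
    if nudges_sent >= MAX_NUDGES:
        return "no further follow-up"
    if today - pitched_on >= NUDGE_AFTER:
        return "send nudge"
    return "wait"


print(next_action(date(2026, 1, 1), 0, False, date(2026, 1, 5)))
print(next_action(date(2026, 1, 1), 0, False, date(2026, 1, 12)))
print(next_action(date(2026, 1, 1), 1, False, date(2026, 2, 1)))
```

Capping nudges in code, not in someone's memory, is what keeps a busy team from drifting into pushy territory.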
For Reddit/Quora presence
It is so, so, so tempting to go full marketer mode here. Resist the urge at all costs. If you’ve chosen well, these communities are full of your client’s exact audience, so missteps will have outsized consequences for your client’s reputation.
We recommend identifying 5-10 high-value threads in a client’s category. Don’t go reviving dead threads. These should be live, active conversations. It’s better to be patient and wait for the right thread to emerge than to rampage through the subreddit like a bull in a china shop.
When you do engage, make sure it is to provide genuinely helpful answers, not pitches. Only mention your client if doing so actually answers the question in a useful way. Placement should be the cherry on top of your reputational sundae.
For Wikipedia
Wikipedia wants to know if your client has been covered in independent, reputable sources like mainstream publications, industry press, scientific research, or books. If all you’ve got are blog posts and press releases, Wikipedia doesn’t consider you notable, and that’s OK.
If your client does meet the bar for inclusion, then be sure to follow Wikipedia’s editing protocols strictly. What you write must be backed by a reliable, third-party source. You should summarize your client neutrally, without spin. Wikipedia will absolutely remove any promotional content.
For content syndication
First, let’s clarify what we mean by syndication. There’s “syndication” in the PR-network sense, where you pay a few hundred bucks and end up splattered across 50 sites. That’s not what we’re talking about here.
The kind of syndication you want is the editorial kind. You want to find a partner where your client’s perspective will make the partner’s publication better for their readers. That’s because you’re going to be repurposing your client’s best-performing content (with their permission, naturally) for syndication.
You’ll know it’s the right kind of publication because it will include disclaimers that say “originally published in …”, usually at the top of the article. What these publishers are looking for is usually primary research and analysis or perspectives that test the establishment view on a topic.
Look for editors that want a regular cadence of articles from your client. Done well, this kind of work raises the publisher’s prestige while also framing your client as an authoritative thought leader in the industry.
Make sure to give the editor the canonical link, an author bio, and the exact copy they should use to credit your client so everything is attributed appropriately.
Writesonic does this for you
If you’ve made it this far, then you probably agree the opportunity for agencies here is pretty huge.
You may also have identified a problem.
Most any agency could offer third-party placement. Agencies are pros at running audits, building relationships, pitching editors. The work isn’t the issue.
But systematically identifying where your dozens of clients should be mentioned across hundreds of potential sources is a bottleneck that sounds like it would probably kill this service before it ever launched.
What you’d need to do manually
Let’s take a look again at the work involved in building a client’s third-party visibility.
First, you’d have to check hundreds of high-authority listicles in each client’s category to see if they’re mentioned. You’ll need to cross-reference the information in those listicles against your client’s latest feature updates to identify errors that need to be corrected or updated. That’s laborious, but you could get away with doing this just monthly. So, this alone is probably manageable.
Unfortunately, monitoring communities like Reddit, Quora, Medium, and Substack has to be done almost daily or you risk missing relevant discussions as they happen in real-time.
Now, remember you have to do all this work not just for client mentions, but for all their competitors as well.
For one client, this is 10-15 hours of research. For a roster of 10+ clients, it quickly becomes untenable.
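To put a number on "untenable", here is the back-of-the-envelope math in Python. The 40-hour work week is my assumption for scale, not a figure from this article, and 10 clients is simply the low end of a "10+" roster.

```python
HOURS_PER_CLIENT = (10, 15)   # range quoted above
CLIENTS = 10                  # the low end of a "10+" roster
WORK_WEEK_HOURS = 40          # assumption, used only to give a sense of scale

low_total = HOURS_PER_CLIENT[0] * CLIENTS
high_total = HOURS_PER_CLIENT[1] * CLIENTS

print(f"{low_total}-{high_total} hours of research across the roster")
print(f"{low_total / WORK_WEEK_HOURS:.1f}-{high_total / WORK_WEEK_HOURS:.1f} full work weeks")
```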
What Writesonic’s Action Center does
Instead of spending a week shining a flashlight into every nook and cranny of the internet hoping to spotlight opportunities, Writesonic’s Action Center gives you a dashboard.
On this dashboard, you’ll see where the gaps are. It shows:
High-domain-authority listicles and review sites where your client isn’t mentioned (but should be)
UGC forums (Reddit, Quora, Medium) with relevant discussions where they’re not present
Wikipedia pages in their category where they’re missing
Places where they ARE mentioned but with incorrect or outdated information
Writesonic also gives you the data you need to sell the service:
Specific URLs of placement opportunities
Contact information (name and emails) of the editors/owners
Domain authority and citation frequency for each source
Competitive gap analysis
A prioritized list of sources to pursue based on citation patterns we’re seeing across AI platforms and what will have the most impact
So, instead of 15 hours of manual research per client, you’ll automatically receive a comprehensive, ready-to-present audit report and a prioritized outreach list.
In addition to those high-value client deliverables, you can use the Action Center to automatically monitor mentions of both your client and their competitors. That way, those deliverables remain living, useful documents instead of dusty PDFs rotting in some forgotten folder on your shared Google Drive.
You’ll be able to provide monthly reporting to clients, and have an ongoing list of outreach tasks to pursue on their behalf. All of this slots neatly into your recurring monthly revenue model.
Without this tool, you’re looking at manual tracking that produces inconsistent data because it depends on who on your team does the research. There’s no systematic way to prioritize your outreach because it comes down to the researcher’s gut feel on what matters. And you’ll struggle to prove whether placements are actually improving AI citation rates.
The Action Center offers infrastructure that automates discovery across citation sources, provides a consistent methodology you can use for all your clients, and delivers both high-value client data and ROI tracking that shows the value you bring to the table.
If you want to see what this looks like for your clients’ categories – what placement opportunities exist, where competitors are mentioned, what sources you should go after – we’ll give you a walkthrough.
Book a demo and we’ll pull a real audit for your space so you can see how big the opportunity really is.
Key Takeaways
Listicles win regardless of query type. 20-32% citation share across every frame tested. Troubleshooting queries show the lowest listicle performance (19.74%), and that’s still nearly 1 in 5 citations.
Intent matching works, but it’s not surprising. Pricing queries pull pricing pages 5.88x more than baseline. Comparison queries pull comparison content 3.34x more. Alternatives queries surface competitor pages 5.37x more. Platforms respond to explicit intent signals the way you’d expect.
Reviews explode in troubleshooting contexts. 8.9x lift—the most extreme multiplier in the dataset. Reviews jump from <1% to 7.87% of citations when users search for fixes and bugs. Likely explanation: reviews mention problems users encountered, and platforms match those snippets to troubleshooting queries even though reviews don’t actually solve anything.
Product pages beat pricing pages in pricing queries. Pricing pages get a 5.88x lift, but product pages still capture more total citations (8.64%). Platforms prefer comprehensive context over isolated pricing information.
Welcome back to our AI search lab. Last time, I analyzed LLM citation patterns in branded vs. non-branded prompts. This week, I wanted to find out whether divergence in query framing—how to do X, Product A vs Product B, what is Y, best tools for Z—produces meaningful changes in what gets cited.
The data landed somewhere between “mostly predictable” and “why is that happening?”
The assumption going in was that platforms would heavily adjust citation patterns based on intent. If someone’s asking how to do something, they’d prioritize tutorials. If someone’s comparing products, they’d surface comparison content. Basic intent matching.
The reality is more subtle than that. Grab some coffee while I break down the best insights and what they mean for your GEO strategy.
Finding #1: Listicles stay dominant everywhere
Listicles account for 20-32% of citations across all query types. That’s a 1.6x range, which is basically nothing compared to most content types.
What is queries: 31.79%
List/best queries: 32.05%
Comparison queries: 30.06%
Alternatives queries: 28.48%
How-to queries: 27.50%
Pricing queries: 25.54%
Troubleshooting queries: 19.74%
Even in troubleshooting contexts, where listicles perform worst, they still capture nearly 20% of citations. This matches what we saw in the industry analysis and the branded query study: listicles work everywhere. Query framing changes a lot of things, but it doesn’t dethrone listicles as the format platforms default to.
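If you want to check the 1.6x figure yourself, it is just the highest listicle share divided by the lowest. A quick sketch in Python using the percentages listed above (the variable names are mine):

```python
# Listicle citation share by query type, as listed above (percent).
listicle_share = {
    "what is": 31.79,
    "list/best": 32.05,
    "comparison": 30.06,
    "alternatives": 28.48,
    "how-to": 27.50,
    "pricing": 25.54,
    "troubleshooting": 19.74,
}

highest = max(listicle_share, key=listicle_share.get)
lowest = min(listicle_share, key=listicle_share.get)
ratio = listicle_share[highest] / listicle_share[lowest]

print(f"highest: {highest} ({listicle_share[highest]}%)")
print(f"lowest: {lowest} ({listicle_share[lowest]}%)")
print(f"range: {ratio:.1f}x")
```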
This is good news if you’re already publishing them. It’s also confirmation that you can’t ignore them just because your vertical feels “different.”
Finding #2: The obvious matches are mostly what you’d expect
Platforms are reasonably good at matching content to explicit intent.
Pricing queries pull pricing pages 5.88x more than baseline (0.61% → 3.57%)
Comparison queries pull comparison pages 3.34x more (1.45% → 4.85%)
Alternatives queries pull competitor pages 5.37x more (0.15% → 0.82%)
These aren’t shocking revelations, but they’re worth confirming. When users explicitly signal their intent, platforms respond accordingly. If someone searches “Writesonic pricing,” they’re getting pricing pages. If they search “Writesonic alternatives,” they’re getting competitor comparison content.
The lifts are consistent across platforms too.
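For anyone who wants to reproduce these multipliers, the methodology note at the end defines lift as the frame share divided by the baseline share. Here is a small sketch in Python using the rounded percentages quoted above. The function name is mine, and because the inputs are rounded, the results can differ from the published multipliers in the second decimal place or so.

```python
def lift(frame_pct, baseline_pct):
    """Lift = citation share within a query frame / baseline citation share."""
    if baseline_pct <= 0:
        raise ValueError("baseline share must be positive")
    return frame_pct / baseline_pct


# (share within the frame, baseline share), both in percent, as quoted above.
examples = {
    "pricing pages in pricing queries": (3.57, 0.61),
    "comparison pages in comparison queries": (4.85, 1.45),
    "competitor pages in alternatives queries": (0.82, 0.15),
}

for label, (frame_pct, baseline_pct) in examples.items():
    print(f"{label}: {lift(frame_pct, baseline_pct):.2f}x")
```

A lift above 1 means the content type is cited more often than usual in that frame. A lift below 1 means it is suppressed.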
Finding #3: Reviews explode 8.9x in troubleshooting queries
Reviews account for less than 1% of citations in most contexts (0.88% baseline). In troubleshooting queries, they jump to 7.87%.
That’s an 8.9x lift and the most extreme multiplier in the entire dataset.
When users search “why isn’t Slack loading my messages” or “Zoom freezing during calls,” platforms cite review content roughly nine times as often as they do at baseline.
This doesn’t make obvious sense. Reviews aren’t troubleshooting guides, they’re product evaluations. Why would they be relevant when someone’s trying to fix a problem?
A possible explanation (and my best guess) is that reviews often mention bugs, issues and problems users encountered. If someone leaves a review saying “great product but crashes on mobile” or “love it except for the sync issues,” that content might match troubleshooting queries. Platforms could be pulling review snippets where users describe similar problems, even if those reviews don’t provide solutions.
But that’s just a hypothesis.
Other troubleshooting lifts:
FAQ pages: 2.83x
Case studies: 2.07x
Press releases: 1.82x
The FAQ lift makes sense as they address common issues. Press releases might surface because companies announce patches and fixes. But as for why case studies lift 2x in troubleshooting contexts, that’s another interesting conundrum.
What’s undeniable is the review lift. Whether that’s good content matching or platforms struggling to find actual troubleshooting guides is an open question.
Finding #4: How-to queries favor educational content
How-to queries show the expected preferences for educational content.
FAQ pages: 1.76x lift (0.43% → 0.75%)
API documentation: 1.60x lift (0.15% → 0.24%)
How-to docs: 1.39x lift (6.66% → 9.22%)
Nothing wild here. Platforms distinguish between “teach me” and “help me decide” intent. How-to queries suppress comparison pages (0.20x), competitor pages (0.14x) and reviews (0.30x).
Finding #5: Pricing queries surface product pages over pricing pages
Pricing pages get a 5.88x lift in pricing queries (0.61% → 3.57%), which makes sense. But product pages get cited at 8.64% in pricing contexts, significantly outperforming dedicated pricing pages.
We’re seeing a pattern similar to the one in the branded vs. non-branded study. In contexts where you’d assume pricing pages to be the go-to choice (branded prompts and pricing queries), LLMs prefer comprehensive product pages with context, feature explanations and pricing together rather than pricing in isolation.
Meanwhile, competitor pages don’t move in comparison queries (0.17% baseline → 0.17%). You’d think “Slack vs Teams” would prioritize dedicated competitor comparison pages, but platforms prefer broader comparison pages (4.85%) that analyze multiple options rather than binary matchups.
Platform biases are there but they don’t dominate
Most platforms follow similar patterns, but a few show distinct preferences.
Claude over-indexes on competitor pages
Competitor pages get 4.08x over-representation in what-is queries and 3.87x in list queries on Claude. When users ask “what is Writesonic” or “best project management tools,” Claude disproportionately pulls competitor comparison content.
ChatGPT prefers case studies
Case studies get 1.66-1.80x over-representation across multiple frame types on ChatGPT. No other platform shows this preference. If you’re publishing case study content, ChatGPT is your best distribution channel within AI search.
Grok favors aggregator roundups
Grok cites aggregator roundups 1.57-2.00x more than average across nearly all query types.
Suppressions are bigger than lifts
Some content types get suppressed in specific contexts:
How-to queries crush comparison pages (0.20x) and competitor pages (0.14x). If users ask instructional questions, competitive content takes a hit.
Pricing queries suppress case studies (0.48x) and competitor pages (0.60x). Users looking at pricing don’t want narrative examples or competitive analysis.
These suppressions are often larger than the lifts. Comparison pages drop from a 1.45% baseline to 0.29% in how-to contexts. That’s a 0.20x multiplier, a fivefold drop that is steeper than most of the positive lifts reported above.
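Lifts and suppressions sit on different sides of 1.0x, which makes them awkward to eyeball against each other. One way to put them on the same footing is to convert each multiplier to a fold change, so that 0.20x and 5x both count as a fivefold move. The sketch below does that in Python for multipliers quoted in this post. The `fold_change` helper is my own framing, not part of the study's methodology.

```python
def fold_change(multiplier):
    """Size of a move in either direction: 0.20x and 5x are both 5-fold."""
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return multiplier if multiplier >= 1 else 1 / multiplier


# Multipliers quoted in this post.
multipliers = {
    "reviews in troubleshooting": 8.9,
    "pricing pages in pricing": 5.88,
    "comparison pages in comparison": 3.34,
    "FAQ pages in troubleshooting": 2.83,
    "comparison pages in how-to": 0.20,
    "competitor pages in how-to": 0.14,
    "reviews in how-to": 0.30,
    "case studies in pricing": 0.48,
    "competitor pages in pricing": 0.60,
}

ranked = sorted(multipliers.items(), key=lambda kv: fold_change(kv[1]), reverse=True)
for label, m in ranked:
    print(f"{label}: {m}x -> {fold_change(m):.1f}-fold")
```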
Intent-based optimization works in AI search the same way it works in SEO. Users signal what they want, platforms attempt to match that intent and specific content formats perform better in specific contexts.
Methodology: Analysis based on 282,828,738 citations across 7 frame types (what is, how-to, comparison, pricing, alternatives, troubleshooting, list/best) and 16 content types. Lift calculated as (frame % / baseline %) where baseline represents average citation rate across all frames. Platform biases calculated as (platform % / average %) for each frame-content combination.
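The platform bias figure works the same way as lift, just with a platform's share in the numerator. A minimal sketch in Python follows. The platform names and shares are placeholders, not numbers from this dataset, and taking the unweighted mean across platforms as the "average" is my assumption, since the note above doesn't spell out how the average is weighted.

```python
from statistics import mean


def platform_bias(shares_by_platform):
    """Bias = platform share / average share, for one frame-content combination."""
    average = mean(shares_by_platform.values())  # assumed: unweighted mean
    if average <= 0:
        raise ValueError("average share must be positive")
    return {name: share / average for name, share in shares_by_platform.items()}


# Placeholder shares (percent) for a single frame-content combination.
shares = {"platform_a": 3.0, "platform_b": 1.5, "platform_c": 1.5}

for name, bias in platform_bias(shares).items():
    print(f"{name}: {bias:.2f}x")
```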