Bing Chat vs ChatGPT: how their citation patterns differ
Bing Chat and ChatGPT cite differently. Here's what we've seen across our clients, the practical implications and where the patterns diverge most.
It is tempting to treat all conversational search surfaces as the same channel. They are not. Bing Chat, now mostly delivered as Microsoft Copilot, and ChatGPT cite sources in ways that are visibly different, and the differences shape what you should optimise for.
We run prompt audits across both surfaces every month for several clients. This post shares what we have actually seen, where the divergence shows up most clearly and how we adjust strategy as a result.
A quick framing
Both surfaces use retrieval-augmented generation. Both cite sources beneath their answers. Both pull from a web index, run a query, summarise and link out. From the user’s perspective they look similar. Underneath, the index, the ranking signal weights and the editorial overlays are different enough to produce different answers to the same prompt.
For a baseline understanding of how citation works, our piece on how LLMs choose what to cite covers the retrieval mechanics shared across surfaces. The differences below sit on top of that shared foundation.
Where the patterns diverge
Five places we see consistent divergence in our audit data.
Index source
Bing Chat and Microsoft Copilot ride on Bing’s index. ChatGPT, when it does live retrieval, uses its own search infrastructure plus partner data including Bing for some queries, but the dominant signal stack is OpenAI’s own. The practical effect is that pages indexed well in Bing Webmaster Tools tend to be cited more reliably in Copilot, while ChatGPT places more weight on what its own retrieval picks up plus its trained corpus.
If you have not paid attention to Bing in years, that is a gap. We routinely find clients who have a clean Search Console picture but a messy Bing Webmaster picture, with stale sitemaps and crawl errors. Fix those before doing anything else for Copilot visibility.
Source format preference
ChatGPT tends to cite a wider mix of sources, often pulling from Reddit, GitHub, Substack and other community platforms. Copilot tends to anchor more heavily on traditional editorial sources, including news sites, established trade press and aggregators like G2 or Capterra. We do not have a clean reason for the divergence, but the pattern is consistent across categories we audit.
The practical implication is that a B2B firm leaning into Reddit or community content will see citation share rise faster on ChatGPT. The same firm needs trade press and G2 strength to move on Copilot. Our piece on why Reddit is now critical to AI search citations covers the community side, and why G2 and Capterra matter more for AI than for SEO covers the aggregator side.
Citation count per answer
Copilot typically cites three to five sources per response. ChatGPT, for the same prompt, often cites seven to ten. The wider net on ChatGPT means more chance of being included for a given query, but also more competition for the top citation slot, which gets the user’s eye and the click.
We track citation rank, not just presence. Being source seven of ten on ChatGPT is functionally invisible. Being source two of four on Copilot drives traffic.
Recency weighting
Copilot, being tied to Bing, has stronger recency weighting on news and time-sensitive content. ChatGPT shows recency weighting on its live search but its trained corpus pulls toward older, more established content. For a prompt asking about the current state of a fast-moving topic, the two surfaces will often cite noticeably different source sets for that reason.
This is one of the quirks where strategy diverges sharply. If your category moves fast and you are publishing frequently, Copilot is the easier surface to win. If your content is established and well-cited historically, ChatGPT’s training data favours you.
Branded vs unbranded behaviour
For unbranded prompts, the two surfaces behave reasonably similarly, with the source mix differences described above. For branded prompts, Copilot leans much more heavily on the brand’s own site, including specific landing pages, while ChatGPT often pulls from independent reviews and discussions even when the prospect named the brand directly.
This matters because the two surfaces are giving different answers to the same buyer at different stages. We cover the wider implications in our piece on how AI search shifts the branded/unbranded query split.
What we change in strategy as a result
The differences mean we run two parallel workstreams for clients with budget for both.
For ChatGPT visibility:
- Reddit and community presence get high priority
- Comparison and “alternatives” content is a key lever
- The page-level optimisation focuses on quotable claims and clean structure
- We monitor for training-data inclusion, which is slow but compounds
For Copilot visibility:
- Bing Webmaster Tools health gets baseline attention
- Trade press relationships and editorial inclusion matter more
- Schema markup is weighted more heavily, in our experience
- News-style or freshly dated content performs better
Both share the foundations covered in our AI search optimisation primer. The differences are about emphasis and where incremental budget goes.
A practical audit method
If you want to see this for yourself before committing to a programme, the audit takes a couple of hours. The shape we use:
- Pull a prompt list of 30 to 40 queries that match your prospects’ actual language
- Run each prompt through ChatGPT and Microsoft Copilot in clean browsing sessions
- For each response, log the cited sources and the order they appear in
- Tag each source by type: your own site, competitor site, G2, Reddit, trade press, vendor partner or other
- Score each prompt for citation presence and rank across both surfaces
- Look at where the patterns diverge
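The logging and scoring steps above can be sketched as a small script. This is a minimal illustration, not our production tooling: the field names, example rows and the "own site" tag are assumptions for the sake of the sketch.

```python
from collections import defaultdict

# Each row is one cited source from one prompt/surface run.
# The rows here are illustrative placeholders, not real audit output.
citations = [
    {"prompt": "best backup tool for MSPs", "surface": "chatgpt", "rank": 7, "type": "own_site"},
    {"prompt": "best backup tool for MSPs", "surface": "chatgpt", "rank": 1, "type": "reddit"},
    {"prompt": "best backup tool for MSPs", "surface": "copilot", "rank": 2, "type": "own_site"},
]

def score(rows, our_type="own_site"):
    """Presence and best citation rank for our own site, per prompt and surface."""
    out = defaultdict(lambda: {"present": False, "best_rank": None})
    for r in rows:
        if r["type"] != our_type:
            continue
        cell = out[(r["prompt"], r["surface"])]
        cell["present"] = True
        if cell["best_rank"] is None or r["rank"] < cell["best_rank"]:
            cell["best_rank"] = r["rank"]
    return dict(out)

results = score(citations)
```

Running this over a 30-to-40-prompt audit makes the divergence visible at a glance: the same page sitting at rank two on Copilot and rank seven on ChatGPT, as in the placeholder rows above, is exactly the pattern the rest of this post describes.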
This is the foundation for everything else. Without it, you are guessing. Our walkthrough on auditing visibility across Copilot and ChatGPT covers the mechanics and the pitfalls in detail.
What we cannot tell you yet
Three honest caveats.
Microsoft Copilot inside the Microsoft 365 stack behaves differently from Copilot on the open web. Inside Outlook, Word or Teams, it draws on tenant content alongside Bing search results, and we do not have a clean longitudinal view of how that affects external citation behaviour.
ChatGPT’s retrieval has changed multiple times in 2025, and the citation patterns we observed in March do not match what we see now. We treat all of this as a live picture, not a stable one.
Attribution from AI surfaces is still partial. Both surfaces drive traffic, but the analytics signal is fuzzy and we cannot always prove which citation drove which visit. We track what we can, including referrer patterns in GA4 and Cloudflare logs.
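The referrer tagging we mention can be sketched as follows. The hostname list is an assumption about what these surfaces currently send as referrers; it changes over time and should be verified against your own GA4 and server-log data before you rely on it.

```python
from urllib.parse import urlparse

# Assumed referrer hostnames for each surface; verify against your own logs,
# since both products have changed domains and referrer behaviour before.
AI_REFERRERS = {
    "chatgpt.com": "chatgpt",
    "chat.openai.com": "chatgpt",
    "copilot.microsoft.com": "copilot",
    "bing.com": "copilot",
}

def classify(referrer_url):
    """Map a raw referrer URL to an AI surface label, or 'other'."""
    host = urlparse(referrer_url).netloc.lower()
    host = host.removeprefix("www.")
    return AI_REFERRERS.get(host, "other")
```

Even with clean tagging like this, the attribution picture stays partial: many AI-surface visits arrive with no referrer at all, which is why we treat these labels as a floor, not a complete count.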
When to invest more in one than the other
If your buyers are technical and skew toward independent research, weight toward ChatGPT. If your buyers are inside Microsoft-heavy enterprises, weight toward Copilot. Most of our enterprise clients end up doing both, but the budget split usually skews 60/40 in one direction based on buyer profile.
For tracking how all of this lands in your analytics, our piece on tracking AI search traffic covers the GA4 and referrer setup we use.
If you’d like a second opinion on your AI search strategy across both surfaces, drop us a line. You can also see how we approach this work on our AI SEO services page or our SEO services for the foundational layer.
Frequently asked questions
Should we still pay attention to Bing Webmaster Tools in 2026?
Yes. Copilot rides on Bing's index, so stale sitemaps and crawl errors there directly limit your Copilot citation potential, even when your Search Console picture is clean.

Why does ChatGPT cite more sources per answer than Copilot?
We do not have a clean reason, but the pattern is consistent in our audits: seven to ten sources per ChatGPT response against three to five for Copilot. The wider net raises your chance of inclusion but dilutes the visibility of each individual citation.

How should our budget split between ChatGPT and Copilot visibility?
It depends on buyer profile. Technical buyers who research independently push the split toward ChatGPT; buyers inside Microsoft-heavy enterprises push it toward Copilot. Most of our enterprise clients run both, with a 60/40 skew in one direction.
More on AI SEO
- Google AI Overviews: how MSPs can win the cited-source spot. What we've learned about getting MSPs cited in Google AI Overviews in 2026, including page structure, schema and the local search dimension. By Paul Clapp.
- AI search optimisation for IT services firms. How MSPs and IT services firms can show up in AI search answers, with a practical playbook covering pages, citations and the bits we still don't know.
- AI search optimisation: a 2026 primer for tech marketers. A grounded primer on AI search optimisation for B2B technology marketers in 2026, covering what's known, what's emerging and where to focus first.