Why Agencies Building GEO Products Are Confusing Brand Visibility With Brand Relevance
The fastest-growing new category in agency technology is tools that track how your brand shows up in AI-generated answers. Havas built one using Claude Code and Replit, deployed it across nearly 100 countries in 60+ languages, and is now licensing it as a SaaS product to clients. Broadhead's VP of Product Innovation built a competitive intelligence platform - one that simulates consumer queries and compares how different AI providers rank your brand - in a single evening. Supergood's founder told Adweek that agencies will be "delivering more software than actual documents" within two years.
The category has a name: Generative Engine Optimization, or GEO. The tools are real, client demand is real, and the technical execution is genuinely impressive. Vibe coding has compressed a six-figure build into an overnight sprint. What agencies are delivering in days used to require quarters.
But there is something uncomfortable underneath the speed. Brand visibility in AI outputs and brand relevance to AI systems are not the same problem. And monitoring the first does not fix the second.
What GEO Monitoring Actually Tracks
The mechanics of GEO products are fairly consistent across agencies. You define a set of consumer queries relevant to your category. You run them through ChatGPT, Claude, Gemini, Perplexity. You count mentions, track ranking positions, measure sentiment, flag when a competitor appears where your brand should. Broadhead's version adds "audience persona layering" - running the same queries through different demographic filters to see if brand visibility varies by segment. Havas's Brand Insights AI does this across markets in 60+ languages simultaneously.
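The monitoring loop described above is simple enough to sketch in a few lines. This is a hypothetical illustration, not any agency's actual implementation: the brand names, provider list, and `fetch_answer` helper are all invented, and the responses are stubbed so the counting logic stands on its own (a real tool would call each model's API here).

```python
from collections import Counter

# Hypothetical: in a real tool, each call here would hit an LLM provider's API.
# Responses are stubbed so the counting logic is self-contained.
def fetch_answer(provider: str, query: str) -> str:
    stubbed = {
        ("chatgpt", "best crm for small teams"):
            "Popular options include AcmeCRM and BetaSales.",
        ("claude", "best crm for small teams"):
            "AcmeCRM is frequently recommended; GammaDesk also fits small teams.",
    }
    return stubbed.get((provider, query), "")

def count_mentions(brands, providers, queries):
    """Count how often each brand appears across provider answers."""
    counts = Counter({brand: 0 for brand in brands})
    for provider in providers:
        for query in queries:
            answer = fetch_answer(provider, query).lower()
            for brand in brands:
                if brand.lower() in answer:
                    counts[brand] += 1
    return counts

counts = count_mentions(
    brands=["AcmeCRM", "BetaSales", "GammaDesk"],
    providers=["chatgpt", "claude"],
    queries=["best crm for small teams"],
)
# AcmeCRM appears in both stubbed answers; the other two in one each.
```

The persona layering and sentiment scoring that agencies add on top are elaborations of this same loop: more query variants in, more annotations per answer out. The core remains counting appearances.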
This is useful data. If your brand never appears in AI-generated answers for queries where it should logically appear, that's worth knowing.
The problem is what that absence actually points to. The GEO interpretation is that you have a visibility problem: your brand needs better positioning in training data, more authoritative citations, content structured for AI consumption. The fix is optimization - create more indexable content, cultivate citations from authoritative sources, structure data to align with how AI retrieves and synthesizes information.
But there is another interpretation: your brand isn't appearing because AI systems, which are very good at pattern-matching on internet-wide signals of trust and authority, don't have strong enough signals to recommend you. Not because you haven't been optimized. Because you haven't earned the recommendation.
The Memory Brand Parallel
Branding Strategy Insider recently published an analysis of what they call memory brands - brands with high awareness but low "active relevance." The pattern is distinct: consumers remember these brands; they just don't choose them. High unaided awareness. Low conversion. High distribution, declining incrementality.
The failure mode is what the author calls "just enough-ism" - minimal investment that proves continued brand existence without genuinely rebuilding meaning or momentum. It looks like activity. It involves real spending. But it addresses the symptom rather than the underlying problem: the brand stopped earning relevance.
The GEO trap rhymes with this pattern. Monitoring how often AI mentions your brand, and optimizing for more mentions, is a form of just enough-ism. It is activity that looks strategic without addressing whether the brand is worth recommending in the first place.
Memory brands typically have strong distribution and name recognition long after they have stopped earning new relevance. In AI-generated search, the equivalent is brands that appear in training data as historical facts - AI systems know them, have processed millions of references to them - but they don't appear in AI recommendations because the signals associated with active consumer trust are thin. You can monitor that gap. You cannot optimization-engineer your way out of it.
Why Agencies Built the Wrong Tool
The behavioral economics publication BehavioralEconomics.com recently ran a piece coining the term "Homobiasos" to describe a pattern identified by Guy Hochman of Reichman University: humans don't reason to find truth; they reason to defend what they already believe. Our cognitive machinery serves a deeper purpose than truth-finding - it protects our existing worldview.
This applies to organizations as readily as individuals. Agencies have built business models around measurement and optimization. Awareness metrics, share of voice, sentiment tracking, reach and frequency - the entire apparatus of traditional brand management is a measurement and optimization practice. When a new channel appears, the natural move is to extend the existing frame: measure it, optimize it, report on it.
GEO is that move, applied to AI search. It fits the existing mental model. It produces dashboards. It generates reports clients can review in quarterly business reviews. It creates a service line with a technology moat. Of course agencies built it.
The rationalization is structural: agencies are protecting a worldview in which measurement equals strategy, because that worldview is also their revenue model. The bias isn't carelessness - it's the inevitable result of seeing new problems through existing frames.
This is the same cognitive dynamic McKinsey describes when they note that organizations launching agentic AI tend to focus on individual use cases rather than redesigning entire domains. Use cases are easy to justify, easy to fund, easy to demonstrate. Domain redesign requires admitting that the underlying structure needs to change - and that is a harder conversation to sell.
What AI Systems Actually Reward
Here is the uncomfortable question GEO monitoring cannot answer: why does AI recommend anything?
AI language models generate recommendations by pattern-matching on training corpora that reflect internet-wide consensus about trust, authority, and relevance. When a user asks "what's the best CRM for a 50-person sales team," the AI isn't querying a brand visibility database. It is synthesizing patterns from millions of sources - reviews, comparisons, expert recommendations, user discussions, industry publications - to surface what the internet collectively treats as authoritative.
That synthesis rewards brand substance. Companies with genuine customer advocacy, authentic expert endorsement, consistent product quality, and real earned media appear more reliably in AI recommendations than companies that have optimized their content structure for AI consumption. The signal is harder to fake than traditional SEO ever was, because it is distributed across human behavior rather than centralized in indexable content.
This is why the personalization paradox in AI brand strategy runs deeper than most brands recognize. AI systems don't just read what brands say about themselves - they read what everyone else says, weighted by credibility. Brands that have built authentic relationships and genuine market authority get recommended. Brands with awareness but weak earned authority get observed but not advocated for.
GEO monitoring tells you the output: how often you appear. It doesn't tell you the input: whether you've earned the right to appear.
This is the kind of pattern STI's research tracks systematically - in every category where AI has become a significant purchase-decision touchpoint, the brands that win are not the ones that optimized visibility. They're the ones that built genuine authority, then showed up legibly in the formats AI systems can retrieve.
The Right Uses for GEO Data
To be clear: GEO monitoring tools have legitimate applications. They are genuinely useful for:
- Competitive intelligence: Understanding which competitors AI treats as authoritative in your category is valuable signal. If a newer competitor consistently outranks you in AI recommendations for your core use cases, that is important information about where brand authority has shifted and why.
- Content gap identification: If AI never mentions you in contexts where you have documented expertise, that might indicate a structure problem - your best work isn't reaching the formats AI systems retrieve cleanly. That is fixable.
- Anomaly detection: Sudden changes in AI brand mentions can surface shifts in market perception before they appear in traditional survey data. This is early-warning value.
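The anomaly-detection use case is the most mechanical of the three, and a minimal version needs nothing exotic. The sketch below is illustrative, not a production detector: it flags weeks where a brand's mention count deviates sharply from a trailing baseline using a simple z-score, with a window size and threshold chosen purely for demonstration.

```python
from statistics import mean, stdev

def flag_anomalies(weekly_mentions, window=4, threshold=2.0):
    """Flag weeks whose mention count deviates sharply from the trailing
    baseline. Window and threshold are illustrative, not tuned values."""
    flags = []
    for i in range(window, len(weekly_mentions)):
        baseline = weekly_mentions[i - window:i]
        mu, sigma = mean(baseline), stdev(baseline)
        if sigma == 0:
            continue  # flat baseline: z-score undefined, skip
        z = (weekly_mentions[i] - mu) / sigma
        if abs(z) >= threshold:
            flags.append((i, round(z, 1)))
    return flags

# A stable baseline, then a sudden collapse in week 6 - the kind of shift
# worth investigating before it shows up in survey data.
flags = flag_anomalies([40, 42, 39, 41, 40, 42, 12])
```

Note what this does and does not tell you: it surfaces that something changed, not why. The "why" is still a brand question, not a dashboard question.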
What GEO data cannot do is substitute for the harder work of building something worth recommending. A brand with deep customer trust, authentic earned media, and genuine market expertise will show up in AI recommendations without optimization - because the underlying signals are there. A brand without those foundations will not get there through content structure changes alone.
The Domain-Level Question Agencies Are Avoiding
McKinsey's framing about agentic AI is worth taking seriously in this context: organizations that extract real value from AI won't be the ones who launch optimization use cases. They'll be the ones who redesign entire domains around what AI actually enables.
For agencies, that means a conversation that is harder to sell than a GEO dashboard: whether the brands they manage have earned the substance that AI systems will eventually reward. That is a brand strategy audit, not a visibility audit. It asks different questions - about product quality signals, about the depth of customer relationships, about whether expert communities treat the brand as a reference point or as an also-ran.
The agencies building GEO products are right that AI search requires new capabilities. They are right that clients need support navigating the transition. Havas's global deployment and Broadhead's rapid iteration represent genuine technical competence.
But the tool that will matter most is not a monitor. It is an honest evaluation of whether a brand has earned the right to be recommended, independent of how well it has optimized for being seen. That audit is harder to build, harder to sell, and harder to run. It doesn't generate a weekly dashboard. It surfaces conclusions that require brands to change something real rather than add another line item.
It is also the only work that addresses what AI is actually measuring.
Before you build the monitor, ask whether there is something worth monitoring for. If you're evaluating where your brand actually stands against these criteria, our analysis tools can help surface what the optimization dashboards won't.