Hass Dhia · 8 min read

The AI Budget Line Item McKinsey Forgot: Containing Agents That Act Like Malware

ai-agents · enterprise-ai · technology-budget · cybersecurity · mckinsey · decision-intelligence

Two pieces of research published this week arrive with strange timing. McKinsey is telling CIOs they need to recalibrate technology budgets for the AI era - shift spend away from legacy infrastructure toward AI capability, generate maximum growth. And Harvard Business Review is publishing research showing that AI agents exhibit behavioral patterns nearly identical to malware: privilege escalation, lateral movement across systems, data persistence and exfiltration.

Most enterprise technology teams will read one of these. Almost none will read them side by side. That's where the expensive mistake lives.

McKinsey's Framework and the Budget Category It Leaves Out

The McKinsey research addresses a real problem. CIOs have spent the last two years adding AI spend on top of existing infrastructure costs, creating bloated budgets without proportional returns. The prescription is rational: move faster on legacy deprecation, concentrate AI investment in high-value capabilities, reallocate expenditures toward areas with measurable growth impact.

What the framework doesn't model is the security overhead of the systems you're buying.

Traditional enterprise software - databases, ERP systems, communication platforms - presents a well-understood attack surface. Security teams know what it looks like when these systems are compromised. They have incident response playbooks. They have detection rules. They've been hardening these environments for decades.

AI agents operate differently. They're designed to take autonomous action across systems, make decisions without explicit human approval for each step, and chain together operations in ways that aren't fully predictable. This is what makes them useful. It's also what makes them look, from a security monitoring perspective, like an attacker who has already gotten in.

When Your AI Agent Looks Like an Intruder

The HBR research details the specific behaviors that create this problem. AI agents in enterprise environments will request elevated permissions to complete tasks. They'll move laterally between systems to gather context. They'll write data to memory stores that persist between sessions. They'll make API calls to external services as part of normal operation.

Every item on that list is also on standard intrusion detection checklists. Security teams have spent years training models to flag exactly these behaviors as indicators of compromise. The result is a collision: as enterprises deploy more AI agents, they're generating an environment where legitimate agent activity is nearly indistinguishable from an actual attack.
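To make the collision concrete, here is a minimal sketch in Python - with invented event names and a toy scoring rule, not any real SOC product's detection logic - of why indicator-based detection cannot tell the two traces apart:

```python
# Toy sketch of the detection collision. Event names and the scoring rule
# are illustrative assumptions; real intrusion detection is far richer.

MALWARE_INDICATORS = {
    "privilege_escalation",   # requesting elevated permissions
    "lateral_movement",       # touching systems beyond the entry point
    "persistence_write",      # writing to stores that survive the session
    "external_exfil",         # outbound calls carrying internal data
}

def risk_score(event_trace: list[str]) -> int:
    """Count how many classic intrusion indicators a trace trips."""
    return len(MALWARE_INDICATORS.intersection(event_trace))

# A legitimate agent completing a routine multi-system task...
agent_trace = ["privilege_escalation", "lateral_movement",
               "persistence_write", "external_exfil"]

# ...and an actual intruder produce identical scores under this rule.
attacker_trace = list(agent_trace)

assert risk_score(agent_trace) == risk_score(attacker_trace) == 4
```

Any rule sharp enough to catch the attacker also fires on the agent; any rule loose enough to tolerate the agent also tolerates the attacker.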

This isn't a theoretical concern. Security operations centers are already dealing with it. Alert fatigue from AI agent activity is suppressing genuine threat detection. Meanwhile, actual malicious actors are learning to hide in the noise - using the same behavioral patterns as legitimate AI agents to move through environments that have been implicitly trained to tolerate those patterns.

The budget implication is straightforward but largely unaddressed. Deploying AI agents at enterprise scale requires investing in a new security architecture - one designed from the ground up to understand agent behavior patterns, distinguish legitimate from malicious agent activity, and maintain human visibility into what agents are actually doing. Most technology budgets in the McKinsey recalibration framework don't include this line item. They're optimizing the AI deployment side while treating security as a fixed cost category.

This is the kind of second-order pattern STI's research tracks systematically - the gap between what a framework measures and what actually drives outcomes.

The Governance Architecture Gap

There's prior evidence for why this matters. Klarna's AI agent deployment is frequently cited as a success case because it delivers measurable operational savings at scale. What that narrative consistently underplays is the governance architecture running underneath - the audit trails, the permission boundaries, the human escalation triggers that determine what agents can and cannot do autonomously.

Klarna's deployment works not because of the AI models it uses, but because the company built the containment layer first. The agents operate inside a defined envelope of authorized behavior. When something happens outside that envelope, humans are automatically brought back into the loop.
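Klarna hasn't published its architecture at this level of detail, so the following is a hypothetical sketch of the general envelope pattern rather than their implementation: an allowlist of authorized actions, an audit trail of every attempt, and automatic human escalation for anything outside the list. All names and actions are invented.

```python
from dataclasses import dataclass, field

@dataclass
class Envelope:
    """A containment envelope: allowlisted actions plus a full audit trail."""
    allowed_actions: set[str]
    audit_log: list[tuple[str, str]] = field(default_factory=list)

    def execute(self, agent_id: str, action: str) -> str:
        self.audit_log.append((agent_id, action))  # every attempt is recorded
        if action in self.allowed_actions:
            return f"{agent_id}: executed {action}"
        # Outside the envelope: block and bring a human into the loop.
        return f"{agent_id}: blocked {action}, escalated to human review"

envelope = Envelope(allowed_actions={"answer_ticket", "issue_refund_under_50"})
print(envelope.execute("support-agent-7", "answer_ticket"))
print(envelope.execute("support-agent-7", "issue_refund_over_50"))
```

The design choice worth noticing: the audit write happens before the permission check, so even blocked attempts leave a record humans can review later.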

Most enterprise AI deployments skip this step. The McKinsey budget recalibration advice - concentrate investment in high-ROI AI capabilities - creates pressure to move fast. Governance architecture is slow, expensive, and doesn't show up as a capability on a product demo. It gets deprioritized. The cost shows up later, when an agent does something unexpected, or when an actual attacker successfully disguises their activity as normal agent behavior.

Budgeting for agent containment means treating it as infrastructure, not security overhead. The distinction matters because infrastructure gets capitalized and scaled; overhead gets minimized each budget cycle.

The Behavioral Interface Problem

The HBR research makes a second point that compounds the first. Even when organizations implement proper agent containment, human oversight requires interfaces designed for action, not just for monitoring.

This connects to research BehavioralEconomics.com published on dashboard design: the difference between dashboards that display data and dashboards that drive human decisions. Most enterprise security dashboards are built for the first purpose. They surface alerts, show system states, display activity logs. What they don't do is design for the specific cognitive conditions under which humans make good security decisions - which involve reducing decision fatigue, making the cost of inaction visible, and creating clear escalation paths that don't require switching contexts.

Apply this to AI agent oversight: a CIO who invests in agent containment infrastructure still has a problem if the humans monitoring that infrastructure are looking at a wall of agent activity logs and trying to pattern-match against threats they've never seen before. The oversight interface is part of the system. If it's not designed with behavioral science principles - progress visibility, clear status signals, friction that slows down inappropriate dismissal of alerts - the containment infrastructure doesn't deliver its full value.
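As a small illustration of one of those principles - friction that slows down inappropriate dismissal of alerts - here is a toy alert object. The fields, thresholds, and example values are assumptions for the sketch, not a real product schema.

```python
from dataclasses import dataclass

@dataclass
class AgentAlert:
    agent_id: str
    behavior: str
    context: str            # prepared in advance: what, where, prior baseline
    status: str = "open"
    dismissal_reason: str = ""

    def dismiss(self, reason: str) -> None:
        # Friction by design: a one-word dismissal is cheaper than engaging,
        # so the interface refuses it and forces a substantive judgment.
        if len(reason.strip()) < 20:
            raise ValueError("Dismissal requires a substantive written reason.")
        self.status = "dismissed"
        self.dismissal_reason = reason

alert = AgentAlert(
    agent_id="procurement-agent-3",
    behavior="persistence_write outside baseline hours",
    context="Wrote 2.1 GB to a shared store at 03:40; baseline is <50 MB.",
)
alert.dismiss("Scheduled quarterly data sync, confirmed with platform team.")
print(alert.status)
```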

This is the behavioral design tax that almost nobody budgets for. It's not glamorous. It doesn't have a product category name yet. But it's the difference between an agent governance system that exists on paper and one that actually keeps humans in meaningful control of what their agents are doing.

The Attention Problem at Scale

There's a signal worth noting from outside the enterprise technology world. Wise, the digital-first payments company, is currently running physical pop-up experiences to drive current account sign-ups and shift consumer perception. The explicit framing from their marketing team: a "war for attention" that digital channels alone can't win anymore.

The relevance to AI agent governance isn't obvious but it's real. Wise is discovering that the volume of digital signals has crossed a threshold where adding more digital touchpoints no longer compounds attention - it dilutes it. Physical presence creates a different quality of attention, one that's harder to automate away.

Enterprise security teams face an analogous problem. The volume of AI agent activity will grow faster than human attention can scale. As with the broader attention economy shift, the winning strategy isn't to generate more alerts or build more comprehensive monitoring dashboards. It's to design systems that use human attention surgically - at the moments when human judgment is genuinely needed, with context prepared in advance, and with friction that makes defaulting to inaction more costly than engaging.
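One way to read "surgically" in system terms: cap how many alerts reach a human per shift and spend that budget on the highest-risk items, rather than streaming everything. The sketch below assumes each alert arrives with a precomputed risk score; the budget size is an invented number.

```python
import heapq

HUMAN_REVIEW_BUDGET = 5  # assumption: decisions one reviewer can make well per shift

def select_for_review(alerts: list[dict]) -> list[dict]:
    """Route only the highest-risk alerts to humans; the rest stay automated."""
    return heapq.nlargest(HUMAN_REVIEW_BUDGET, alerts, key=lambda a: a["risk"])

alerts = [{"id": i, "risk": i % 7} for i in range(40)]  # toy alert stream
for alert in select_for_review(alerts):
    print(f"escalate alert {alert['id']} (risk {alert['risk']})")
```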

This is a systems design problem, not a technology procurement problem. And it doesn't appear anywhere in standard technology budget frameworks.

What a Complete AI Budget Framework Actually Includes

The McKinsey recalibration research is correct about the core trade-off: money locked in legacy infrastructure is money not generating AI-era returns. The direction is right. The accounting is incomplete.

A framework that accounts for how AI agents actually behave in enterprise environments needs four components, not two (a toy spend audit across all four follows the list):

1. AI capability investment - the models, APIs, platforms, and developer tools that generate AI-era functionality. This is what McKinsey is optimizing.

2. AI security and containment - the permission architecture, audit infrastructure, behavioral monitoring, and incident response capabilities designed specifically for agent activity patterns. This is what HBR is warning about.

3. Behavioral interface design - the human-facing oversight systems built with behavioral science principles to keep human decision quality high as agent volume scales. This is what the dashboard research is pointing at.

4. Attention preservation - organizational policies and system designs that protect human attention for the decisions that actually require it, rather than distributing it across an ever-growing volume of automated outputs. This is what Wise is learning in a different context.
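A minimal sketch of that audit - the line items and dollar amounts are invented for illustration - showing how the unfunded categories surface:

```python
# Toy budget audit against the four categories; figures are made up.
CATEGORIES = [
    "ai_capability",
    "ai_security_containment",
    "behavioral_interface",
    "attention_preservation",
]

budget = {
    "ai_capability": 4_000_000,
    "ai_security_containment": 400_000,
    # The two categories below rarely appear as line items at all.
}

for category in CATEGORIES:
    spend = budget.get(category, 0)
    flag = "  <- unfunded gap" if spend == 0 else ""
    print(f"{category:28s} ${spend:>12,}{flag}")
```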

Most enterprise technology teams are investing heavily in the first category. A few are beginning to think seriously about the second. Almost none have budgets for the third and fourth.

The organizations that figure out all four categories will have a compounding advantage: their AI deployments will run faster because agents operate inside a trusted envelope, their security posture will actually improve as AI scale increases rather than degrading, and their human decision-making will stay effective as the volume of decisions requiring human input grows.

If you're mapping your organization's AI investment against these four categories, our analysis tools can help surface where the gaps are - including the ones that don't show up until something goes wrong.

The McKinsey research is a useful starting point. The HBR research is the other half of the same conversation. Organizations that read both, and budget for what they imply together, are making a fundamentally different bet than those who read one and ignore the other.
