What AI Broke
Lab — early draft from Era Haus

AI's narratives broke from the inside

May 29, 2026

Four times this week, the parties most invested in an AI narrative acted against it. OpenAI's chief executive said his white-collar displacement forecast was "pretty wrong." An autonomous coding agent reached the regulated cores of Goldman Sachs and NASA. Google's growth fund and Nvidia's venture arm anchored the lab-neutral routing layer. A free tool stripped the safety guardrails off Meta's and Google's open-weight models in under ten minutes. Each break came from inside the narrative it broke.

OpenAI's chief walked back the white-collar story

The defensible read for 24 months was that AI would compress entry-level white-collar work first and visibly. The reading list was specific: Sam Altman's repeated public framing, the Goldman and McKinsey 2023-24 projections that put tens of millions of office jobs at risk, Klarna and Salesforce announcing AI-attributed cuts, and an OpenAI roadmap that paired model capability with agentic deployment. A senior operator planning 2026 hiring or capex who took the Altman timeline at face value would have thinned graduate analyst pipelines and pre-bought enterprise seats against a steep displacement curve.

At a Commonwealth Bank of Australia event in Sydney on May 26, Altman said he had been "pretty wrong" about the pace of white-collar elimination, told the audience he was "delighted to be wrong," and said he had expected more displacement of entry-level office work to have happened by now (per Euronews and the Straits Times, May 26). The walk-back came on the same trip in which OpenAI was selling capacity to Australian enterprises. The chief executive of the company whose products were meant to drive the displacement is now publicly conceding that it is slower than he said, in front of the customers buying against that forecast.

Operators whose 2026 plans built in an Altman-curve displacement should reset two things. Human-capital plans that thinned junior pipelines on the assumption of imminent agent substitution will leave the bench short when the substitution does not show up as forecast. Capex plans that justified AI-platform spend on a labor-cost arithmetic the same lab's chief executive has now disavowed should be re-underwritten on output or revenue gains rather than headcount reductions. As we argued on May 11 in AI displacement got specific, displacement was never uniform; this week the company selling the uniform version stopped backing it.

The coding agent reached the regulated core

Through 2025 the defensible read on agentic coding was that the agents were demo-grade. Cursor and Copilot were augmentation; Devin was a marketing exhibit. Compliance, audit trails, and review cycles in regulated industries (financial services, aerospace, automotive) kept autonomous agents out of the production loop. Procurement teams at the largest enterprises modelled AI coding as a developer-tooling line item, not as a vendor category booking direct enterprise contracts at scale. The view 12 months ago was that the gap from "useful assistant" to "autonomous engineer trusted by Goldman" would take two to three more years to close.

Cognition raised more than $1 billion on May 27 at a $25 billion pre-money valuation, co-led by Lux Capital, General Catalyst, and 8VC, with Founders Fund and Ribbit Capital in the syndicate (per TechCrunch and The Next Web, May 27). Annualised revenue moved from $37 million in May 2025 to $492 million in May 2026, and the company is targeting $1 billion annualised this year. Disclosed customers include Goldman Sachs, Mercedes-Benz, NASA, Santander, and several U.S. government departments. Cognition also said roughly 89 to 90% of its own code is now written by Devin. Eight months ago the same company priced at $10.2 billion post-money. The valuation has more than doubled since then, while annualised revenue grew roughly 13× over the twelve months to May 2026.

Developer-tooling vendors selling "AI in your IDE" as the enterprise category should re-read the customer list: Goldman, NASA, Mercedes, and Santander did not buy a faster autocomplete; they bought an agent. Consultancies pricing AI-coding adoption as a six-quarter integration program should look at how directly Cognition booked these contracts. Operators running their own engineering organisations should pressure-test the read from AI displacement got specific (total developer employment held while integration workload expanded) against the possibility that the integration tier is now the one compressing. Move: budget for direct agent procurement by your chief information officer; the dev-tools committee is no longer the buyer.

Google and Nvidia bought the lab-neutral layer

For two years the defensible enterprise-AI architecture was: pick a lab, integrate its API, build moats by going deep on one stack. The labs encouraged this with usage discounts, prosumer tiers, and exclusive features. Distribution belonged to the lab; routing was a customer-side problem an engineering team solved. The view six months ago was that aggregator and router layers would stay niche developer infrastructure, useful for hobbyists running side-by-side comparisons and immaterial to enterprise procurement. Google, OpenAI, and Anthropic were the platforms; aggregators were thin wrappers.

On May 26, OpenRouter closed a $113 million Series B at a $1.3 billion valuation, led by CapitalG (Alphabet's growth fund), with participation from NVentures (Nvidia's venture arm), Snowflake Ventures, Databricks Ventures, ServiceNow Ventures, and MongoDB Ventures, alongside existing investors Andreessen Horowitz and Menlo Ventures (per TechCrunch and BusinessWire, May 26). Weekly token volume reached 25 trillion, a 5× rise in six months, across more than 400 models and roughly 8 million users. The capital structure does the work: Google's own growth fund and Nvidia's venture arm now sit on the cap table of the layer that lets enterprises route around them, and Snowflake, Databricks, ServiceNow, and MongoDB are funding it next to them.

Frontier labs whose enterprise pricing assumed single-vendor lock-in should re-model against a routing layer their largest backers are now funding. Wrapper businesses whose moat was "we integrate one model deeply" should look at OpenRouter's 400-model surface and ask what they add that the router does not. Procurement teams at mid-market enterprises should treat multi-model routing as the default architecture rather than a hedge. This sharpens the read from our May 6 piece The model wasn't the moat: even the labs' own investors are now monetising the distance between buyer and lab.

Open-weight safety came off in under ten minutes

The defensible position 12 months ago on open-weight AI safety was that Meta's Llama and Google's Gemma shipped with alignment training meaningful enough to slow real misuse. Refusal behaviour, red-teaming, and constitutional constraints were treated as legitimate safety layers, friction sufficient for policy purposes and for the labs' own published safety cards. Regulators in the EU and U.S. wrote rules that leaned on this assumption. Enterprises building on open weights treated the published safety posture as part of inherited assurance. Both companies' public communications held that responsible open-weight release was an achievable bar.

On May 25, the Financial Times, with AI safety research group Alice, published an investigation showing that a free GitHub tool called Heretic strips the safety alignment off Meta's Llama 3.3 and Google's Gemma 3 in under ten minutes on a standard laptop, using a technique called abliteration that targets the isolated refusal-behaviour pathways alignment training creates. The tool's author told the paper he removed Gemma 4's guardrails 90 minutes after public release. The cited deployment scale: roughly 3,500 stripped-model variants distributed across model hubs, with 13 million cumulative downloads. The modified models produce output in categories the originals refused: biological-weapon recipes, malware generation, and child sexual abuse material.

Meta's and Google's policy teams now have a published, replicated number contradicting the safety claims they have been making to regulators in Brussels, Washington, and London. Enterprises that adopted Llama or Gemma on a "responsibly aligned" thesis should treat the published safety card as marketing rather than as a control. Cyber-insurance underwriters pricing premiums on open-weight deployments should re-rate. As we argued in Defensibility in the AI era, the durable moats are sequenced and earned, not picked from a menu. Move: treat the open-weight base model as raw material; the safety layer is now the deploying organisation's problem.

Read the four together

One pattern runs through the four. In each, the party most invested in keeping the prior narrative alive (the lab CEO, the developer-tooling category, the model lab, the open-weight publisher) acted against it. Sam Altman undercut his own labor forecast in front of paying customers. Cognition booked the regulated core that augmentation-only vendors said was two to three years out. Google and Nvidia put capital into the routing layer that disintermediates them. Meta's and Google's safety posture came off in under ten minutes, and the stripped variants have 13 million cumulative downloads. The operator question for June is narrower than usual: for each AI narrative your 2026 plan depends on, name the party still actively defending it. If the answer is "the same parties that broke this week's stories," the plan needs a second source.