The model became a cost line

Jun 05, 2026

What AI Broke

Tags: Legal, Pricing, Defensibility, Models, Strategy

The model layer is becoming a metered commodity, and this week four companies repriced their businesses around that fact. A coding tool that charged a flat monthly fee started billing per token. The largest backer of a frontier lab shipped its own models to stop paying for that lab's. A Chinese model matched the West on quality at a fraction of the price. And the labs themselves began climbing into the legal market they used to leave to others.

Flat-rate AI coding tools just started metering the bill

For three years the defensible read was that AI coding assistance was a flat, predictable cost. GitHub Copilot sold seats at $10 to $39 a month, and rivals matched that pricing. Engineering leaders budgeted AI tooling the way they budget any per-seat software: headcount times a fixed price, forecastable a year out. The pitch that AI makes every developer faster carried an implied promise that the productivity was bundled into a stable subscription line.

On June 1 GitHub retired Copilot's flat Premium Request model and moved every plan to usage-based billing, metering input, output, and cached tokens as "AI Credits" priced at one cent each, per GitHub's own announcement that day. Base seat prices held, but the heaviest users felt it at once. TechCrunch reported on May 30 that developers running long agentic sessions were seeing bills rise ten to fifty times, with documented cases moving from $50 to $3,000 a month. Code completion stayed free; the autonomous, multi-hour agent runs are what now carry a meter.

Anyone who budgeted AI development tooling as flat per-seat software should reopen the model. The cost of an agentic coding team now scales with how hard the agents work rather than with headcount, and the teams that took the "ten-times developer" pitch most literally are the ones the new meter punishes hardest. Resellers marking up a flat seat have nothing left to mark up. The move for an engineering leader this quarter: model your top decile of agentic users rather than your average seat, and treat inference as a variable input cost, the way you already treat cloud compute.
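To make the budgeting shift concrete, here is a minimal sketch of the arithmetic, not GitHub's billing logic. The one-cent credit price and the $39 seat come from the figures cited above; the team size, the per-user credit counts, the function names, and the assumption that no credits are bundled into the seat price are all hypothetical and chosen only for illustration.

```python
# Sketch only: the usage figures below are hypothetical, not reported data.
CREDIT_PRICE_CENTS = 1  # one cent per AI Credit, as cited above


def flat_budget_usd(seats: int, seat_price_usd: int) -> int:
    """The old forecast: headcount times a fixed seat price."""
    return seats * seat_price_usd


def metered_bill_usd(seat_price_usd: int, credits_per_user: list[int]) -> float:
    """Seat fees plus metered credits (assumes no credits are bundled into the seat)."""
    seats = len(credits_per_user)
    usage_usd = sum(credits_per_user) * CREDIT_PRICE_CENTS / 100
    return seats * seat_price_usd + usage_usd


def top_decile_share(credits_per_user: list[int]) -> float:
    """Fraction of all metered credits consumed by the heaviest 10% of users."""
    ranked = sorted(credits_per_user, reverse=True)
    top_n = max(1, len(ranked) // 10)
    total = sum(ranked)
    return sum(ranked[:top_n]) / total if total else 0.0


# Hypothetical team: 18 light users and 2 heavy agentic users on $39 seats.
team = [2_000] * 18 + [300_000] * 2

print(flat_budget_usd(len(team), 39))    # 780
print(metered_bill_usd(39, team))        # 7140.0
print(round(top_decile_share(team), 3))  # 0.943
```

In this made-up team, two users generate about 94 percent of the metered credits, and the monthly bill comes to roughly nine times the flat forecast. The average seat says almost nothing about the total, which is why the top decile is the number to model.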

Microsoft built its own way out of OpenAI

The most defensible assumption in enterprise AI was that Microsoft's AI is OpenAI's. Microsoft had put more than $13 billion into the lab over the prior three years, wired its models through Azure, and built Copilot on top of them. The read for any operator was that the partnership was the foundation: choose Microsoft and you were choosing OpenAI's models with enterprise paperwork around them, and Microsoft had no reason to build a competing frontier stack of its own.

At its Build conference on June 2, Microsoft shipped seven in-house models trained on commercially licensed data, with no distillation from OpenAI's systems, per CNBC that day. The flagship, a reasoning model called MAI-Thinking-1, runs on a sparse architecture with a context window large enough to read a 600-page document in one pass. Chief executive Satya Nadella framed the rationale as "optionality": the company can run its own models on its own cloud, stop paying a third party, and pass the saving to developers as the price of leading models climbs.

Two groups should sit up. Enterprises that standardized on Azure specifically to guarantee frontier-model access now hold a platform whose owner is actively building substitutes for the thing they came for. And OpenAI itself just watched its largest distributor and investor ship a homegrown alternative and tell developers it will be cheaper. The broader lesson is the one this series has been circling since "the model wasn't the moat": a model is swappable infrastructure. If the company that wrote the largest cheque to a frontier lab is building its own exit, treating your own model vendor as permanent is the riskier bet.

The cheapest capable model now comes from China

For two years the comfortable Western read was that the United States set both numbers that matter in AI: the price floor and the capability ceiling. Chinese labs were assumed to be capital-starved, dependent on US silicon they could no longer freely buy, and a generation behind on quality. An operator choosing an AI vendor could reasonably treat "best model" and "American model" as the same decision, and file Chinese open-weight releases as cheap copies fit for experiments rather than production.

Two things landed this week that break that read. DeepSeek neared its first outside funding round, about $7.4 billion, financed almost entirely inside China rather than by Western capital, per the South China Morning Post and Reuters on June 4. Its latest model now runs on domestic Huawei silicon rather than the Nvidia chips that US export rules increasingly put out of reach. On quality the gap has all but closed: Moonshot's open-weight Kimi K2.6 now ranks among the handful of strongest models in the world, closed or open, while the best American open model, released the same week, trails it by six points, 48 to 54, on the independent Artificial Analysis index. The United States keeps a clear lead only on raw inference speed.

The exposed operators are the ones whose pricing rests on American models being both the best and the only serious option. US labs charging a scarcity premium are now undercut by a capable open model at a fraction of the token price, and that premium narrowed again this week, a point we made when the AI premium got marked down. The cost advantage doubles as a sovereignty advantage. The European Union's AI Act becomes fully applicable on August 2, and the European Commission used June 3 to publish a technology-sovereignty package favouring open-source and on-premise deployment. A cheap, capable, openly licensed model that runs inside your own walls is exactly what a regulated German or French enterprise now wants. The move: assume a near-frontier model is available at near-zero marginal cost from outside the US, and re-underwrite any plan that priced on American model scarcity.

The frontier labs are taking the legal vertical themselves

The settled view of how AI would reach professional services was a layer cake. Frontier labs would sell horizontal models; a layer of specialized software would wrap those models for each profession, from contracts to litigation to compliance, and sell the result to firms; and the labs would stay out of the application business, because building vertical products meant hiring domain experts and owning messy workflows they preferred to leave to partners. Legal technology, a global market worth more than $5 billion in 2026, was the showcase for that division of labour.

It cracked on June 1, when OpenAI formally launched a legal vertical and hired Jason Boehmig to run it, per Artificial Lawyer and Law.com that day. Boehmig co-founded Ironclad, one of the most prominent contract-AI companies and a former OpenAI customer whose business was built on top of the lab's models; he now describes his job as "building AGI for law." The hire follows Anthropic's launch of a dedicated legal product with established legal-data partners, and a comparable move from Microsoft. The pattern is three frontier labs deciding, within weeks of each other, to build the legal application themselves rather than sell the parts to someone who will.

The exposed party is easy to name: the vertical-software layer that took a frontier model, wrapped it for a profession, and called the wrapper a moat. When the lab can hire your founder and rebuild your product with privileged access to the underlying model, "we integrate the best model for lawyers" stops being a defensible position. This is the same break that hit the banking-analyst tooling we wrote about in AI displacement got specific, now arriving in law, and it will reach every profession with enough revenue to interest a lab. The move: if your product is a thin vertical wrapper, treat the lab as a competitor, not a supplier, and move toward the proprietary data, regulated workflow, and client relationships it cannot easily copy.

Read the four together

The four breaks rhyme. In each, a previously strategic asset, access to a capable model, got repriced toward commodity: metered to the developer, built in-house to dodge the bill, undercut from China, or bypassed by the lab climbing into the vertical above it. Raw model access is becoming a utility, and the margin is draining from it. The durable advantage, as we argued in defensibility in the AI era, sits in the layer a cheaper model cannot commoditize: the workflow, the proprietary data, the cost structure, the regulated relationship. For an operator, the question for the second half of 2026 is blunt. In your own business, what are you charging for that a metered bill, a Chinese open model, or a lab entering your vertical could erase by year-end? If the honest answer is "access to a good model," you are selling the one thing the market just decided it will not keep paying a premium for.