The expectation was that AI would streamline operations, slim down teams, and cost less. Within some of the world’s biggest tech firms, the opposite is taking place. Used at scale, AI coding assistants can actually be more expensive than human workers, as Microsoft and Uber are finding out. The economics of token-based pricing are now driving a quiet but significant retreat from a technology that both companies had promoted to their own employees.
Microsoft has recently phased out most of its direct Claude Code licenses and moved engineers to GitHub Copilot CLI. The reversal comes just six months after the tech giant opened access to Claude Code to thousands of its developers, project managers, designers and others. The company had been pushing broad experimentation with AI coding, and uptake was fast and positive. Perhaps too fast: Microsoft is now pulling back from a technology that its own employees had come to rely on. The move doesn’t affect Microsoft’s broader business partnership with Anthropic, The Verge reports. The company’s Foundry deal, under which it will invest up to $5 billion in Anthropic, is unaffected, as is Anthropic’s $30 billion commitment to Azure compute capacity.

Microsoft is by no means alone. In April, Uber’s chief technology officer (CTO), Praveen Neppalli Naga, told The Information that the ride-hailing company had used up its entire 2026 budget for AI coding tools in the first four months of the year. The revelation comes at a time when Uber has been aggressively encouraging its employees to use AI tools and has set up internal team leaderboards to track how they are doing. The pattern at both companies points to an underlying conflict that has received little attention in the conversation about AI in the workplace: the harder firms push employees to use the technology, the faster the costs add up.
The crux of the issue is how AI computing is priced. Large language models are billed per token, the smallest unit of text that the model processes and produces. Under this model, more productive use and simply more use look the same on the bill: both increase overall spending. Yet a number of big tech firms have been actively trying to get their people to consume more tokens. Amazon has been urging employees to “tokenmaxx”, meaning to use as many AI tokens as possible. At Meta, one employee developed a system called Claudeonomics to track the level of AI usage among employees.
The arrival of a new generation of AI agents, which can execute a sequence of tasks in a single application instead of answering a single question, will only make things tougher. Goldman Sachs predicts that by 2030, token use will grow 20-fold to 120 quadrillion a month as enterprises scale up their deployment of AI agents. Those tokens are expected to get cheaper. According to Gartner, running inference on a one-trillion-parameter LLM will cost almost 90 percent less in 2030 than it did in 2025. However, Gartner warned that this price drop won’t necessarily result in lower enterprise bills. The cheaper tokens become, the more of them a company is likely to use, and the savings are often largely cancelled out.
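The interaction between those two forecasts comes down to simple multiplication, and a rough sketch in Python makes it concrete. It rests on loud assumptions that the forecasts themselves do not make: that a single company's usage grows at the market-wide 20-fold rate Goldman Sachs projects, and that the roughly 90 percent fall in inference cost that Gartner projects is passed through in full to per-token prices. The function names are illustrative only.

```python
def bill_multiplier(usage_growth: float, price_multiplier: float) -> float:
    """Return the future token bill as a multiple of today's bill.

    usage_growth     -- factor by which token volume grows (20.0 means 20x)
    price_multiplier -- share of today's per-token price still paid
                        (0.10 means tokens are 90 percent cheaper)
    """
    return usage_growth * price_multiplier


def break_even_price_drop(usage_growth: float) -> float:
    """Return the fractional price drop needed to keep the bill flat."""
    return 1.0 - 1.0 / usage_growth


# Assumption: company usage tracks the projected 20-fold market growth.
USAGE_GROWTH = 20.0
# Assumption: the ~90 percent cost drop is fully passed through to prices.
PRICE_MULTIPLIER = 0.10

multiplier = bill_multiplier(USAGE_GROWTH, PRICE_MULTIPLIER)
needed_drop = break_even_price_drop(USAGE_GROWTH)

print(f"Bill relative to today: {multiplier:.1f}x")
print(f"Price drop needed to break even: {needed_drop:.0%}")
```

Under these assumptions the bill roughly doubles even though each token costs a tenth of what it did, and prices would have to fall by about 95 percent just to hold spending flat. Real outcomes depend on how much of the cost reduction vendors pass on and on how fast any one company's usage actually grows.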
What we’re seeing at Uber, Microsoft, and other major companies is a subtle sign that the economics of AI are far from optimized for actual workplace use. Early stories about AI coding tools suggested that the machines would render human coding skills obsolete at a far lower cost. Reality has proved more complicated. Human employees are paid a fixed wage. When thousands of employees are encouraged to tokenmaxx, the bill explodes in a way that traditional software licensing never has.
There are arguments on both sides of this debate. The advantages of AI coding tools are genuine and quantifiable, as proponents note: productivity boosts, quicker debugging, instant documentation, and a lighter cognitive burden on engineers. A developer who might have spent hours on a routine function can now create it in seconds. From that point of view, paying per token is not inefficient. It is just a different pricing model, one that charges for actual use of the software rather than for a seat licence. But critics say the pricing scheme has unhelpful side effects. Businesses want employees to use AI more, workers do, and spending ramps up with no end in sight. Worse, since cost scales with usage and nothing caps it, there is no natural brake on overuse. Humans get tired and need rest, but an AI model will keep producing tokens as long as someone is typing.



