In early 2025, Andrej Karpathy, the influential former OpenAI and Tesla AI leader, coined the term “vibe coding.” The phrase described a fundamental shift in software development: coders handing routine work to LLM-based AI models so that they could focus on deeper problems. Vibe coding became one of the biggest AI trends of 2025, with AI firms like Anthropic and Cursor building fortunes on it. Software giants around the world capitalised on it heavily, with famous personalities and tech CEOs encouraging employees to spend more credits, or tokens, on AI to produce better code more quickly and efficiently.
Just over a year later, the mood in Silicon Valley has shifted, and the concern now centres on raw consumption. Enter ‘tokenmaxxing’, a new term for the practice of maximising AI token usage as a proxy for productivity, status, and AI-native prowess. What started as internal encouragement for heavy AI adoption has morphed into leaderboards and hefty budgets.
Now, some of those same tech giants, such as Uber and Microsoft, are backing off from tokenmaxxing, a retreat attributed to its rising cost to annual budgets.
What’s all the hype around tokenmaxxing?
Tokens are the fundamental currency of large language models (LLMs) – roughly three-quarters of a word or four characters of text. Every prompt, response, agent loop, and reasoning step burns them. AI providers like OpenAI bill based on input, output, cached, and reasoning tokens. In 2026, as companies poured resources into agentic AI workflows, some organisations began tracking and celebrating individual and team token consumption openly.
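To make the billing arithmetic concrete, here is a minimal Python sketch of how a token bill might be estimated. It uses the rule of thumb above (about four characters per token) and the four billing categories mentioned. The class names, function names, and every price are hypothetical placeholders for illustration, not any provider's actual rates or API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rates:
    """Hypothetical prices in US dollars per one million tokens."""
    input: float
    cached: float
    output: float
    reasoning: float


@dataclass(frozen=True)
class Usage:
    """Token counts for one billing period, split by category."""
    input_tokens: int
    cached_tokens: int
    output_tokens: int
    reasoning_tokens: int


def estimate_tokens(text: str) -> int:
    """Rough estimate using the ~4 characters per token rule of thumb.

    Real tokenisers vary by model and language, so treat this as a ballpark.
    """
    return max(1, round(len(text) / 4))


def cost_usd(usage: Usage, rates: Rates) -> float:
    """Sum each category's token count times its per-million-token price."""
    return (
        usage.input_tokens * rates.input
        + usage.cached_tokens * rates.cached
        + usage.output_tokens * rates.output
        + usage.reasoning_tokens * rates.reasoning
    ) / 1_000_000


# Made-up prices and usage, purely for illustration.
example_rates = Rates(input=2.0, cached=0.5, output=8.0, reasoning=8.0)
example_usage = Usage(
    input_tokens=1_000_000,
    cached_tokens=2_000_000,
    output_tokens=500_000,
    reasoning_tokens=500_000,
)
example_cost = cost_usd(example_usage, example_rates)
print(f"Estimated bill: ${example_cost:.2f}")
```

Under these assumed rates, the output and reasoning tokens account for most of the bill despite being a minority of the volume, which is one reason long-running agent loops can become expensive quickly.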
At Meta, an internal ‘Claudeonomics’ dashboard reportedly ranked engineers by tokens used, awarding titles like ‘Token Legend’ to top performers — one of whom reportedly processed hundreds of billions of tokens in a month. Similar dynamics emerged at OpenAI and other firms.
Based on several reports, generous token budgets became perks akin to free meals or gym memberships at big companies, with some engineers spending thousands monthly to run parallel agents and automate their own workflows.
Nvidia’s CEO, Jensen Huang, fuelled the fire, suggesting top engineers might consume $250,000 worth of tokens per month. For hardware vendors and AI providers, the trend juices demand. For companies footing the bill, it raises uncomfortable questions.
The pitfalls of tokenmaxxing
Some experts argue that tokenmaxxing is a necessary phase. In the early days of any transformative technology, forcing adoption builds muscle memory and uncovers new capabilities. Running autonomous agents 24/7, chaining complex tasks, and maintaining massive context windows can accelerate experimentation. Many engineers now believe that not using AI at all is the biggest risk.
Critics, however, point to the economics. Data from engineering intelligence platforms shows that while heavy token users may produce more pull requests, the gains don’t scale linearly with cost, sometimes delivering twice the output at ten times the expense. It echoes the old trap of measuring lines of code written instead of working software shipped. Companies like Uber have reportedly blown through budgets without proportional gains in customer-facing features.
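The "twice the output at ten times the expense" ratio is easier to judge as a cost per unit of output. The short Python sketch below works through it with made-up monthly figures; the dollar amounts, pull-request counts, and the `cost_per_unit` helper are all hypothetical, and only the 2x and 10x multipliers come from the claim above.

```python
def cost_per_unit(total_cost: float, units_of_output: float) -> float:
    """Return spend per unit of output, e.g. dollars per merged pull request."""
    if units_of_output <= 0:
        raise ValueError("units_of_output must be positive")
    return total_cost / units_of_output


# Hypothetical baseline engineer: $1,000 of tokens, 10 pull requests.
baseline = cost_per_unit(total_cost=1_000, units_of_output=10)

# Hypothetical heavy user: ten times the spend, twice the output.
heavy = cost_per_unit(total_cost=10_000, units_of_output=20)

# How much more each pull request costs for the heavy user.
ratio = heavy / baseline
print(f"Baseline: ${baseline:.0f} per PR, heavy user: ${heavy:.0f} per PR")
print(f"Each PR costs {ratio:.0f}x as much")
```

In this illustration each pull request from the heavy user costs five times as much as one from the baseline engineer, and that still assumes a pull request is a fair stand-in for value delivered.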
Salesforce, HubSpot, and others are pushing back with “outcome maxxing”, prioritising business value, decision quality, and ROI over raw volume. Appian’s CEO likened tokenmaxxing to judging chandeliers by weight under Soviet production quotas. Even Meta reportedly dialled back its public leaderboard amid backlash.
Environmental and financial sustainability add another layer. Massive token consumption drives up inference costs and energy demands at a time when AI’s ROI is under increasing scrutiny. Nature Machine Intelligence recently urged companies to “stop tokenmaxxing and deploy AI sensibly instead.”
What tokenmaxxing reflects is a broader maturation in AI adoption. After the initial euphoria of vibe coding and rapid prototyping, organisations are grappling with measurement. While tokens are easy to count, factors like true productivity, impact on revenue, and product quality remain stubbornly difficult to measure.
