The Unsustainable Economics of AI Pricing: Tokens vs. Outcomes

Mike Okner

The AI industry has a looming pricing problem that is starting to get more attention. Most AI products are priced per token or per API call, but the cost structure doesn’t align with value delivery, and it’s not sustainable.

The Hidden Subsidy Problem

Consider a typical AI-powered development tool like Cursor or Claude Code. A user might consume $150-200 worth of API costs per month through heavy model usage while only paying a subscription fee of $20 per month.

The math doesn’t work.
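A back-of-envelope sketch of that math, using the article's rough figures (the $175 inference cost below is just the midpoint of the $150-200 estimate above, not vendor data):

# Rough monthly unit economics for one heavy user of a $20/month AI coding tool.
# The inference figure is an illustrative midpoint of the $150-200 estimate above.
subscription_price = 20.00   # what the user pays per month
inference_cost = 175.00      # estimated API cost their usage generates

margin = subscription_price - inference_cost
print(f"Margin per heavy user: ${margin:,.2f}/month")  # -> Margin per heavy user: $-155.00/month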

This isn’t unique to developer tools. Perplexity AI offers unlimited Pro searches at $20/month, despite inference costs that can easily exceed this for power users. ChatGPT Plus charges $20/month for extended GPT-5 access, when API costs for equivalent usage would run over $100. GitHub Copilot is just $10/month for unlimited code completions, all backed by expensive model calls.

According to TechCrunch’s analysis of AI coding startups, margins on code generation products are either neutral or negative. For many startups burning through tokens, every power user represents a loss. This is fundamentally different from traditional SaaS, where serving additional users costs almost nothing and margins improve with scale.

The Uber Playbook

This should sound familiar. It’s the same playbook Uber and DoorDash used:

  1. Launch Phase: Subsidize rides/deliveries to acquire customers and market share
  2. Growth Phase: Continue subsidizing to maintain competitive advantage and undercut competitors
  3. Reckoning Phase: Raise prices to achieve unit economics that actually work

Uber lost $8.5 billion in 2019 alone. DoorDash burned through $667 million in 2019. Eventually, prices went up significantly. According to data from Rakuten Intelligence, Uber and Lyft prices increased by 92% between January 2018 and July 2021.

The AI industry is in phase 2 right now. VC funding is subsidizing inference costs to grab market share. And as with ride-sharing, there is only so much room to optimize operating costs: like driver pay, AI inference costs are largely fixed per unit and scale linearly as customer usage scales up. They move at the pace of GPU and model improvements, not operational optimization.

It’s worth noting that there has been some contention that unit economics will improve as models get smarter and GPU performance continues to increase. However, this assumption relies on the premise that usage patterns remain fixed, and so far that has not borne out. As GPU performance has increased, models have become more sophisticated; as models become more sophisticated, users leverage them for increasingly complex tasks. It’s not clear that progress will translate into better margins.

Why Token Pricing Breaks Down

Token-based pricing has fundamental issues that go beyond simple economics. First, there’s a basic value misalignment: a 10,000-token response isn’t necessarily 10x more valuable than a 1,000-token response. Sometimes it’s actually worse, like when you get verbose outputs that could have been concise. You end up paying for inefficiency.

The unpredictability is another major problem. Users can’t estimate their monthly bill. Will that refactoring task cost $5 or $50? The uncertainty creates budget anxiety and usage friction. According to Gartner’s 2024 survey, difficulty estimating and demonstrating the value of AI projects is the top barrier to AI adoption, cited by 49% of respondents.
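To make that "$5 or $50" spread concrete, here is a hypothetical sketch of how the same refactoring request can land almost anywhere on the cost curve. The per-token prices, token counts, and step counts are illustrative assumptions, not any provider's published rates:

# Hypothetical cost range for one agentic refactoring task under token pricing.
# Prices, token counts, and step counts are illustrative assumptions only.
PRICE_PER_1K_INPUT = 0.003    # assumed $ per 1K input tokens
PRICE_PER_1K_OUTPUT = 0.015   # assumed $ per 1K output tokens

def task_cost(input_tokens: int, output_tokens: int, agent_steps: int) -> float:
    """Cost of an agentic task that re-reads its context on every step."""
    per_step = (input_tokens / 1000) * PRICE_PER_1K_INPUT \
             + (output_tokens / 1000) * PRICE_PER_1K_OUTPUT
    return per_step * agent_steps

# A well-scoped refactor vs. one that balloons into a long agent loop.
print(f"Best case:  ${task_cost(50_000, 3_000, 25):.2f}")   # -> ~$4.88
print(f"Worst case: ${task_cost(150_000, 6_000, 90):.2f}")  # -> ~$48.60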

Then there’s the adverse selection problem. Power users who extract the most value also cost the most to serve. Under flat subscription pricing, you’re losing money on your most engaged customers, which is the exact opposite of a healthy SaaS model. This creates a perverse incentive where you either need to cap usage (frustrating your best customers) or bleed money on every power user.

There’s also an inherent optimization conflict. You want the AI to use multi-step reasoning, tool calling, and thorough analysis. But each step costs money. There’s a constant tension between quality and cost that forces uncomfortable tradeoffs.

In their analysis “AI’s $600B Question,” Sequoia Capital highlights a massive revenue gap in the AI industry. They calculate that AI companies would need to generate roughly $600 billion in annual revenue to justify current infrastructure spending, and they estimate the gap between that figure and actual revenue has grown to about $500 billion. The fundamental question remains: where is all the revenue?

The Coming Price Correction

Anthropic, Cursor (Anysphere), and Replit have all started telegraphing this shift with their coding tools. Each has recently announced stricter rate limits in response to extreme usage from a minority of users. This is the future we’re heading toward: prices that reflect actual computational cost rather than being artificially suppressed to gain market share.

When VC subsidies dry up (and history tells us they will), we’ll likely see subscription prices increase 5-10x, a move away from truly “unlimited” tiers, more aggressive tiering and throttling, and startups either shutting down or pivoting to different models. Outside of a few major players especially, the companies that survive will be the ones that find a sustainable pricing model before they’re forced to.

Outcome-Based Pricing: A Different Approach

What if instead of paying for tokens, you paid for outcomes?

This isn’t theoretical. We’re already seeing early examples in adjacent domains. In software development, some tools (like us at Zaurus) are moving toward charging per PR created or per bug fixed, rather than per API call for code generation. Decagon, a rapidly growing AI customer support solution, prices based on the number of conversations rather than the size or content of each conversation, leading to better predictability and aligned incentives.

The economics in this model are fundamentally different. Instead of:

Revenue = Subscriptions
Cost = API_calls × cost_per_token
Margin = Revenue - Cost (often negative)

You have:

Revenue = Outcomes_delivered × price_per_outcome
Cost = (API_calls_per_outcome × cost_per_token) + overhead
Margin = Revenue - Cost (positive if priced correctly)
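Here is a minimal sketch of the two formulas above, with made-up numbers purely to show how the sign of the margin can flip when revenue is attached to outcomes instead of a flat subscription:

# Token-subsidy model vs. outcome-based model, with illustrative numbers.
COST_PER_1K_TOKENS = 0.01      # assumed blended inference cost

def subscription_margin(price: float, tokens_used: int) -> float:
    cost = (tokens_used / 1000) * COST_PER_1K_TOKENS
    return price - cost

def outcome_margin(price_per_outcome: float, tokens_per_outcome: int,
                   overhead: float, outcomes: int) -> float:
    cost = outcomes * (tokens_per_outcome / 1000) * COST_PER_1K_TOKENS + overhead
    return outcomes * price_per_outcome - cost

# A power user burning 5M tokens/month on a $20 subscription: negative margin.
print(subscription_margin(price=20.0, tokens_used=5_000_000))      # -> -30.0

# Ten delivered outcomes at $40 each, ~500K tokens per outcome, $50 overhead: positive.
print(outcome_margin(price_per_outcome=40.0, tokens_per_outcome=500_000,
                     overhead=50.0, outcomes=10))                  # -> 300.0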

Why Outcomes Align Incentives

Outcome-based pricing solves several problems simultaneously. The most obvious is predictable value exchange. The customer knows exactly what they’re paying for. If an automated cloud optimization saves $5,000/year, paying a fixed amount for that outcome is a straightforward ROI decision. Paying for an unknown number of tokens to maybe achieve that outcome is not.

It also changes the incentive structure around quality. You’re incentivized to deliver results efficiently, not to maximize token generation. A concise, accurate solution is better than a verbose, meandering one. This aligns the provider’s interests with the customer’s in a way that token pricing simply doesn’t.

From a business perspective, outcome pricing enables sustainable unit economics. You can optimize your inference strategy over time by using smaller models when possible, caching aggressively, and implementing early stopping. The customer doesn’t care about any of that because they’re paying for the outcome. This gives you room to improve margins without changing what the customer sees.
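As a rough illustration of what that might look like, here is a sketch of a cost-aware router that checks a cache, tries a smaller model first, and only escalates when a confidence check fails. The structure, threshold, and call signatures are assumptions for illustration, not a description of any particular product's internals:

# Hypothetical cost-aware routing behind a fixed outcome price.
# Model wrappers, the confidence threshold, and the cache policy are illustrative assumptions.
from typing import Callable, Tuple

cache: dict = {}

def solve(task: str,
          small_model: Callable[[str], Tuple[str, float]],
          large_model: Callable[[str], Tuple[str, float]],
          confidence_floor: float = 0.8) -> str:
    # 1. Cached result: near-zero marginal cost, identical outcome for the customer.
    if task in cache:
        return cache[task]

    # 2. Try the cheaper model first; each wrapper returns (answer, confidence).
    answer, confidence = small_model(task)

    # 3. Escalate to the expensive model only when confidence is too low.
    if confidence < confidence_floor:
        answer, _ = large_model(task)

    cache[task] = answer
    return answer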

Perhaps most importantly, outcome pricing enables natural market pricing. Prices can reflect value delivered rather than arbitrary compute costs. A $10,000 infrastructure optimization is genuinely worth more than a $100 one. Token pricing treats them as equivalent if they consume the same compute, which makes no economic sense.

The Implementation Challenge

Outcome-based pricing isn’t trivial to implement. You need clear success criteria that define what constitutes a “delivered outcome” and how to measure it. How do you handle partial success? What happens when an optimization saves $3,000 instead of $5,000?
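One way to make those questions concrete is to write the outcome contract down explicitly, including how partial success bills. The structure below, and the 50% minimum threshold, are illustrative choices rather than any standard:

# Illustrative outcome contract with explicit partial-success handling.
from dataclasses import dataclass

@dataclass
class OutcomeContract:
    promised_savings: float     # e.g. $5,000/year of projected cloud savings
    price: float                # what the customer pays for a full delivery
    minimum_fraction: float     # below this, the outcome doesn't bill at all

    def amount_due(self, measured_savings: float) -> float:
        achieved = measured_savings / self.promised_savings
        if achieved < self.minimum_fraction:
            return 0.0                            # missed the bar: no charge
        return self.price * min(achieved, 1.0)    # pro-rated up to full price

contract = OutcomeContract(promised_savings=5_000, price=1_000, minimum_fraction=0.5)
print(contract.amount_due(3_000))   # saved $3,000 of the promised $5,000 -> bills $600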

You also need quality guarantees. If you’re charging per outcome, you need high confidence in delivery. This requires extensive evals/testing, ongoing validation, and potentially human-in-the-loop verification before you can reliably charge customers for results rather than effort.
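A skeleton of what such a gate might look like, assuming a set of automated eval checks plus an optional human reviewer; the 95% pass threshold and the function shapes are placeholders:

# Hypothetical billing gate: an outcome only becomes chargeable after it clears
# automated evals and, when required, a human review. Thresholds are placeholders.
from typing import Callable, Iterable

def is_billable(outcome,
                evals: Iterable[Callable[[object], bool]],
                needs_human_review: bool = False,
                human_approves: Callable[[object], bool] = lambda _: True,
                pass_threshold: float = 0.95) -> bool:
    results = [check(outcome) for check in evals]
    if not results:
        return False                           # no evals defined: never charge blindly
    if sum(results) / len(results) < pass_threshold:
        return False                           # failed automated evals: don't charge
    if needs_human_review and not human_approves(outcome):
        return False                           # human-in-the-loop veto
    return True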

Operational efficiency becomes critical. Your inference costs need to be well below your outcome price, which means investing in model optimization, caching, and smart orchestration. You’re betting that you can deliver outcomes more efficiently over time, even if inference costs don’t improve.

Finally, you need customer trust. Users need to believe they’re getting fair value, which requires transparent reporting and clear outcome definitions. This is particularly challenging in AI, where the non-deterministic nature of LLMs leads to variance in outcomes even with identical inputs.

What This Means for AI Products

Not every AI product can use outcome-based pricing. Conversational interfaces, creative tools, and open-ended exploration don’t have clear “outcomes.” For these products, token pricing or usage-based models may be the only viable option. That’s fine.

But for AI products that solve specific, measurable problems (code generation, data analysis, content optimization, infrastructure automation), outcome-based pricing could be the difference between building a sustainable business and running a VC-subsidized burn rate until the money runs out.

The question isn’t whether AI pricing will change. It will. The subsidies can’t last forever, and when they end, the market will correct sharply. The real question is which companies will figure out sustainable models before they’re forced to, and which will scramble to raise prices on users who’ve grown accustomed to artificially cheap AI.

The Path Forward

If you’re building an AI product, you need to answer some uncomfortable questions. What problem are you actually solving? Not “provide AI assistance,” but what specific outcome does the user want? Can you measure success in a way that’s clear and quantifiable? Can you deliver that outcome profitably, with enough margin between your inference costs and the price customers will pay to sustain a business? And crucially, does the user prefer paying per outcome, or would they rather pay a flat monthly fee despite the usage anxiety?

These aren’t easy questions, and the answers will vary by product and market. But they’re worth wrestling with now, before the market forces your hand.

The subsidy era won’t last forever. The AI solutions that will dominate in five years will be those that figured out how to align pricing with value delivery, long before the rest of the market was forced to. They’ll be the companies that built sustainable unit economics from day one, rather than relying on subsidies to paper over fundamental business model problems.
