Tracking a 200x price reduction in LLM APIs from 2021 to 2025 — exponential decay, step drops, and what it means for builders.
GPT-3 Davinci cost $20/1M tokens in 2021. GPT-4.1 nano costs $0.10 in 2025. That's a 200x reduction. In a traditional industry, that happens over decades. In LLM APIs, it happened in four years. This isn't incremental optimization. It's exponential decay.
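The 200x figure implies a steep annualized decline. A quick sketch of the arithmetic in TypeScript (the data file's language); the helper name is illustrative, not part of the experiment:

```typescript
// Annualized price-decline factor between two price points.
// Function name and framing are illustrative, not the experiment's code.
function annualDeclineFactor(startPrice: number, endPrice: number, years: number): number {
  return Math.pow(startPrice / endPrice, 1 / years);
}

// GPT-3 Davinci ($20/1M, 2021) to GPT-4.1 nano ($0.10/1M, 2025): 200x over 4 years.
const factor = annualDeclineFactor(20, 0.10, 4); // ≈ 3.76x cheaper each year
const annualDropPct = (1 - 1 / factor) * 100;    // ≈ 73% price drop per year
```

Roughly a 3.8x drop per year — about 73% shaved off the price annually, sustained for four years.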
The Two Patterns
Exponential decay: each generation is cheaper than the last, following compute cost curves — Moore's Law applied to inference.

Step drops: prices don't decline smoothly. They hold flat, then drop suddenly when a new model releases. GPT-3.5 Turbo went through three price cuts in 12 months ($2 → $1.50 → $1 → $0.50); each cut was a step function, not a gradient. The pattern is visible in the chart: horizontal lines (price holds) punctuated by vertical drops (new model releases).
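The step-drop shape falls out naturally if each series is stored as dated price points and the price is held flat until the next point. A minimal sketch under that assumption (dates are approximate and for illustration only; the real data file may differ):

```typescript
// One model's price history as dated points, held flat between drops.
interface PricePoint {
  date: string;  // ISO date the price took effect
  price: number; // USD per 1M input tokens
}

// GPT-3.5 Turbo's cuts from the text; exact dates are illustrative.
const gpt35Turbo: PricePoint[] = [
  { date: "2023-03-01", price: 2.0 },
  { date: "2023-06-13", price: 1.5 },
  { date: "2023-11-06", price: 1.0 },
  { date: "2024-02-15", price: 0.5 },
];

// Price in effect on a date: the latest point at or before it.
// Assumes the history is sorted ascending; ISO strings compare correctly.
function priceOn(history: PricePoint[], date: string): number | undefined {
  let current: number | undefined;
  for (const p of history) {
    if (p.date <= date) current = p.price;
  }
  return current;
}
```

Rendering each segment as a horizontal line up to the next point's date is what produces the staircase instead of a smooth curve.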
The Provider Race
Four providers on the chart tell four different stories. OpenAI pioneered the premium tier (GPT-4 at $30) then raced to the bottom (GPT-4o mini at $0.15). Anthropic entered expensive (Claude 1 at $11) and stayed premium while adding a cheap tier (Haiku at $0.25). Google was the price war catalyst — Gemini 1.5 Flash launched at $0.35, then Google cut it 80% to $0.075. Meta made the open-source argument: release weights for free, let hosted providers compete on price. The result: a market where the floor price dropped from $2 (GPT-3.5 Turbo, March 2023) to $0.075 (Gemini Flash, August 2024) in 17 months.
What We Built
We curated 38 price points from official sources: OpenAI pricing pages (Wayback Machine for historical), Anthropic CDN pricing PDFs, TechCrunch launch articles, Google developer blog posts, Together AI model pages. The chart uses pure SVG — no charting library. Same approach as the Instruction Elasticity experiment. Log-scale Y-axis because prices span three orders of magnitude ($0.075 to $75). Step-function lines show prices holding flat then dropping. Click any point for the model, date, price, and source note. The data lives in a TypeScript file. Adding a new price point is one line of code.
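The log-scale mapping is the only non-obvious math in a hand-rolled SVG chart. A sketch of how it might work, assuming the $0.075–$75 span cited above (constants and names are mine, not the experiment's actual code):

```typescript
// Log-scale Y mapping for a hand-rolled SVG chart.
// Constants are assumptions based on the price span in the text.
const Y_MIN = 0.075;      // cheapest price on the chart
const Y_MAX = 75;         // most expensive price on the chart
const CHART_HEIGHT = 400; // px, illustrative

// Map a price to a pixel y-coordinate (0 = top = most expensive).
function yFor(price: number): number {
  const t =
    (Math.log10(price) - Math.log10(Y_MIN)) /
    (Math.log10(Y_MAX) - Math.log10(Y_MIN));
  return CHART_HEIGHT * (1 - t);
}
```

With three orders of magnitude on one axis, a linear scale would flatten everything under $1 into a single pixel row; the log mapping gives each 10x band equal vertical space.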
What This Means for Builders
If you're making architecture decisions based on current API prices, you're planning for a world that won't exist in 6 months. The floor price of frontier-quality inference is approaching $0.10/1M tokens. At that price, most cost-optimization patterns (batching, caching, model routing) become engineering overhead that costs more in developer time than it saves in API costs. The question shifts from "how do we reduce AI costs?" to "what can we build now that AI costs nothing?" That's a fundamentally different product conversation. The irony: Project Nothing — literally a product that delivers nothing — now hosts an experiment tracking the price of everything becoming nothing. We didn't plan that metaphor.
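To make the overhead argument concrete, a back-of-envelope break-even under assumed numbers (the engineering cost is illustrative; only the floor price comes from the text):

```typescript
// Tokens an optimization must save to repay its build cost.
const FLOOR_PRICE_PER_M = 0.10; // USD per 1M tokens, floor price cited above
const ENGINEERING_COST = 1500;  // USD, e.g. ~10 developer-hours (assumption)

const tokensToBreakEven = (ENGINEERING_COST / FLOOR_PRICE_PER_M) * 1_000_000;
// ≈ 15 billion tokens saved before the work pays for itself
```

At that scale, a caching or routing layer only pays off for very high-volume products; below it, the developer time dominates the API bill.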
Weekly Updates
The chart updates weekly. New price points are added from official provider announcements and pricing-tracking APIs. Data sources: llm-prices.com historical API (by Simon Willison), official provider pricing pages, and launch journalism. The chart will accumulate a longer time series with each passing week. The exponential decay pattern will either continue — or something more interesting will happen.
Experiment Context
- Commit: 469bece
- Mutation rationale: feat(experiment): LLM Price Volatility page + gallery card
- Last reviewed: March 19, 2026