AI API pricing spans roughly three orders of magnitude, from under $0.10 per million tokens to over $75 per million tokens. DeepSeek V3 and Gemini 2.0 Flash Lite offer the best price-performance ratio; GPT-4o mini and Claude 3.5 Haiku are the best values from major US providers.
Understanding AI API Pricing
AI APIs charge separately for input tokens (the text you send) and output tokens (the text the model generates). Output tokens typically cost 3–5x more than input tokens because generating them requires more computation. As a result, tasks that produce long responses are relatively more expensive than tasks that consume long inputs.
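The billing arithmetic is simple enough to sketch. A minimal example, with rates copied from the table below (the model keys and dict layout are illustrative, not any provider's SDK):

```python
# Illustrative per-million-token rates (USD), taken from the pricing table below.
RATES = {
    "gpt-4o-mini": {"input": 0.15, "output": 0.60},
    "claude-3.5-haiku": {"input": 0.80, "output": 4.00},
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of a single request: each token class billed at its own rate."""
    r = RATES[model]
    return (input_tokens * r["input"] + output_tokens * r["output"]) / 1_000_000

# A 1,300-token prompt with a 300-token reply on GPT-4o mini:
cost = call_cost("gpt-4o-mini", 1300, 300)  # $0.000375
```

Note that even though the reply here is 4x shorter than the prompt, it accounts for nearly half the bill, which is the asymmetry described above.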
Full Pricing Matrix (April 2026)
| Model | Provider | Input ($/M tokens) | Output ($/M tokens) | Value Score* |
|---|---|---|---|---|
| Gemini 2.0 Flash Lite | Google | $0.075 | $0.30 | ⭐⭐⭐⭐⭐ |
| DeepSeek V3 | DeepSeek AI | $0.27 | $1.10 | ⭐⭐⭐⭐⭐ |
| Gemini 2.0 Flash | Google | $0.10 | $0.40 | ⭐⭐⭐⭐⭐ |
| GPT-4o mini | OpenAI | $0.15 | $0.60 | ⭐⭐⭐⭐ |
| Claude 3.5 Haiku | Anthropic | $0.80 | $4.00 | ⭐⭐⭐⭐ |
| Mistral Small | Mistral AI | $0.20 | $0.60 | ⭐⭐⭐⭐ |
| Llama 3.3 70B (Together.ai) | Meta / Together | $0.60 | $0.90 | ⭐⭐⭐⭐ |
| DeepSeek R1 | DeepSeek AI | $0.55 | $2.19 | ⭐⭐⭐⭐ (reasoning) |
| Mistral Large 2 | Mistral AI | $2.00 | $6.00 | ⭐⭐⭐ |
| GPT-4o | OpenAI | $2.50 | $10.00 | ⭐⭐⭐ |
| Claude 3.5 Sonnet | Anthropic | $3.00 | $15.00 | ⭐⭐⭐ |
| Llama 4 Maverick (Together.ai) | Meta / Together | $0.27 | $0.85 | ⭐⭐⭐⭐ |
| Gemini 2.0 Pro | Google | $10.00 | $30.00 | ⭐⭐ (specialty only) |
| Claude 4 Opus | Anthropic | $15.00 | $75.00 | ⭐⭐ (specialty only) |
| o3 | OpenAI | $10.00 | $40.00 | ⭐⭐ (reasoning specialty) |
| o4-mini | OpenAI | $1.10 | $4.40 | ⭐⭐⭐⭐ (reasoning) |
| GPT-5 | OpenAI | $7.50 | $30.00 | ⭐⭐⭐ (frontier) |
*Value Score = capability relative to cost. ⭐⭐⭐⭐⭐ = exceptional value; ⭐ = expensive relative to capability.
The Best Value Models by Use Case
Best for High-Volume Text Processing
Gemini 2.0 Flash Lite ($0.075/M input) is the cheapest capable model available. At this price, processing 1 billion tokens costs $75 — affordable for large-scale applications. Quality is below frontier models but sufficient for categorization, extraction, and simple generation tasks.
Best for Frontier Quality at Low Cost
DeepSeek V3 ($0.27/M input) achieves near-GPT-4o performance at one-ninth the price. For applications needing high quality without requiring the absolute frontier, DeepSeek V3 offers the best quality-to-cost ratio. Caveat: Chinese origin and data sovereignty considerations apply.
Best Value from a US Provider
GPT-4o mini ($0.15/M input) or Claude 3.5 Haiku ($0.80/M input). GPT-4o mini is cheaper but slightly lower quality; Claude 3.5 Haiku costs roughly 5x more on input tokens but maintains Anthropic's safety and reliability characteristics.
Best for Reasoning Tasks
o4-mini ($1.10/M input) offers near-o3 reasoning capability at roughly one-ninth the cost. For math, logic, and complex coding tasks where you need reasoning-model quality but can't justify o3 pricing, o4-mini is the best option.
Practical Cost Calculations
To ground these numbers: processing a typical 1,000-word document produces approximately 1,300 input tokens and 300 output tokens:
| Model | Cost per 1,000-word query | Cost per 1M queries |
|---|---|---|
| Gemini 2.0 Flash Lite | $0.000188 | $188 |
| DeepSeek V3 | $0.000681 | $681 |
| GPT-4o mini | $0.000375 | $375 |
| GPT-4o | $0.00625 | $6,250 |
| Claude 3.5 Sonnet | $0.00840 | $8,400 |
| Claude 4 Opus | $0.0420 | $42,000 |
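The per-query figures above follow directly from the rate table. A quick check, using the same 1,300-input / 300-output split and the rates listed earlier in this article:

```python
# Recompute the per-query costs from the listed rates
# (USD per million tokens; 1,300 input + 300 output tokens per query).
rates = {
    "Gemini 2.0 Flash Lite": (0.075, 0.30),
    "DeepSeek V3": (0.27, 1.10),
    "GPT-4o mini": (0.15, 0.60),
    "GPT-4o": (2.50, 10.00),
    "Claude 3.5 Sonnet": (3.00, 15.00),
    "Claude 4 Opus": (15.00, 75.00),
}

for model, (inp, out) in rates.items():
    per_query = (1300 * inp + 300 * out) / 1e6
    # Cost per 1M queries is just the per-query cost times one million.
    print(f"{model}: ${per_query:.6f} per query, ${per_query * 1e6:,.0f} per 1M queries")
```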
Pricing Gotchas to Know
- Caching discounts: OpenAI and Anthropic offer prompt caching — if you send the same system prompt repeatedly, cached tokens cost 50–90% less. At scale, this significantly reduces costs for applications with fixed system prompts.
- Batch API discounts: Some providers offer 50% discounts for batch (non-real-time) processing. If you can tolerate 24-hour turnaround, batch is substantially cheaper.
- Thinking tokens: Reasoning models generate thinking tokens that also cost money, even if not shown to users. o3 and DeepSeek R1 can generate many thinking tokens on hard problems — actual cost may be much higher than list price suggests.
- Different models for different tasks: Production applications often use cheap models for simple tasks and expensive models for complex ones. This "model tiering" can reduce overall costs by 60–80%.
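The tiering idea can be sketched as a router that sends each request to a cheap model and escalates only when a heuristic flags the task as complex. The keyword heuristic, model names, and length threshold below are hypothetical placeholders; real systems often use a small classifier model or the cheap model's own confidence signal instead.

```python
# Hypothetical model-tiering router: cheap model by default,
# premium model only when the request looks complex.
CHEAP_MODEL = "gpt-4o-mini"          # ~$0.15/M input tokens
PREMIUM_MODEL = "claude-3.5-sonnet"  # ~$3.00/M input tokens

# Placeholder complexity signals; a production router would be learned, not keyword-based.
COMPLEX_HINTS = ("prove", "debug", "multi-step", "analyze tradeoffs")

def pick_model(prompt: str) -> str:
    """Route long or complexity-flagged prompts to the premium tier."""
    text = prompt.lower()
    if len(text) > 4000 or any(hint in text for hint in COMPLEX_HINTS):
        return PREMIUM_MODEL
    return CHEAP_MODEL

pick_model("Summarize this paragraph")        # routes to the cheap tier
pick_model("Please debug this stack trace")   # escalates to the premium tier
```

If, say, 80% of traffic stays on the cheap tier, blended cost per query drops toward the cheap model's rate, which is where the 60–80% savings figure comes from.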
Frequently Asked Questions
What's the cheapest AI API that's still good?
Gemini 2.0 Flash and GPT-4o mini offer the best quality-to-cost ratio among US providers. DeepSeek V3 offers near-frontier quality at $0.27/M tokens if you're comfortable with its Chinese origin.
Is paying for a subscription better than API access?
Depends on volume and use case. ChatGPT Plus ($20/month) and Claude Pro ($20/month) are consumer subscriptions with generous quotas for interactive use, but they don't provide programmatic access. For programmatic workloads the API is the practical option, and it's often cheaper than the flat fee anyway: with a cheap model, tens of thousands of short queries per month can cost less than $20.
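To put numbers on the subscription comparison, using the per-query costs computed earlier in this article:

```python
# How many 1,300-input / 300-output token queries does $20/month buy at API rates?
SUBSCRIPTION = 20.00  # USD per month, matching ChatGPT Plus / Claude Pro

per_query = {
    "GPT-4o mini": 0.000375,  # from the cost table above
    "GPT-4o": 0.00625,
}

for model, cost in per_query.items():
    queries = SUBSCRIPTION / cost
    print(f"{model}: ~{queries:,.0f} queries per month for ${SUBSCRIPTION:.0f}")
```

At GPT-4o rates, $20 covers about 3,200 queries a month (roughly 100 a day); on GPT-4o mini, the same $20 stretches past 50,000 queries.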
Do prices change frequently?
AI API prices have generally decreased over time — sometimes dramatically. GPT-4-class capability that cost $60/M tokens in 2023 costs $2.50/M in 2025. Prices for established models tend to decrease; new frontier models launch at premium prices.
How can I estimate my monthly API costs?
Multiply your estimated monthly token volume (input + output) by the per-million-token cost. Add 20% buffer for variance. Most providers offer usage dashboards and alerts to help track costs in real time.
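The estimate described above, as a formula. The token volumes in the example are assumptions for illustration; plug in your own:

```python
def monthly_cost(input_tokens: int, output_tokens: int,
                 input_rate: float, output_rate: float,
                 buffer: float = 0.20) -> float:
    """Estimated monthly USD cost. Rates are USD per million tokens;
    a 20% buffer for variance is added by default."""
    base = (input_tokens * input_rate + output_tokens * output_rate) / 1e6
    return base * (1 + buffer)

# Example: 50M input + 10M output tokens/month on GPT-4o mini ($0.15 / $0.60):
estimate = monthly_cost(50_000_000, 10_000_000, 0.15, 0.60)  # $16.20
```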