Comparison
February 2, 2026 · 12 min read

I Calculated What 1M Tokens Costs Across 50+ LLM Models

A comprehensive cost comparison of 50+ LLM models from OpenAI, Anthropic, Google, Mistral, and more. Real pricing data for GPT-5, Claude 4.5, Gemini 3, and every major model.

I spent weeks compiling pricing for every major LLM model while building a cost tracking tool. Here's the complete breakdown—50+ models across OpenAI, Anthropic, Google, Mistral, and more.

Most developers guess their AI costs. They pick a model, ship to production, and hope for the best. But the price difference between models is staggering—o1-pro costs 1,500x more than GPT-5-nano for output tokens.

This guide covers every major model as of February 2026, with real pricing data and practical recommendations.

How I Got This Data
I compiled this while building Orbit, an LLM cost tracking tool. All prices are from official API documentation, updated February 2026.

The Complete Pricing Table

Prices are per 1 million tokens. Most models charge differently for input (what you send) and output (what the model generates).
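
To make the math concrete, here's a minimal TypeScript sketch of the per-call formula. The rates in the example are pulled from the tables that follow; treat it as a back-of-envelope helper, not a billing-accurate calculator:

// Cost of a single call in USD, given per-1M-token rates.
function callCost(
  inputTokens: number,
  outputTokens: number,
  inputPricePer1M: number,
  outputPricePer1M: number
): number {
  return (inputTokens / 1_000_000) * inputPricePer1M
       + (outputTokens / 1_000_000) * outputPricePer1M;
}

// Example: a 500-in / 300-out call on GPT-5 ($1.25 / $10.00 per 1M)
console.log(callCost(500, 300, 1.25, 10.0)); // ≈ $0.003625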

OpenAI Models

GPT-5 Series (Latest)

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| GPT-5.2 | $1.75 | $14.00 | Most capable, latest |
| GPT-5.2-pro | $21.00 | $168.00 | Maximum quality |
| GPT-5 | $1.25 | $10.00 | General purpose |
| GPT-5-pro | $15.00 | $120.00 | Complex reasoning |
| GPT-5-mini | $0.25 | $2.00 | Cost-effective |
| GPT-5-nano | $0.05 | $0.40 | High volume, simple tasks |

GPT-4.1 Series

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| GPT-4.1 | $2.00 | $8.00 | Balanced performance |
| GPT-4.1-mini | $0.40 | $1.60 | Budget option |
| GPT-4.1-nano | $0.10 | $0.40 | Ultra-low cost |

GPT-4o Series

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| GPT-4o | $2.50 | $10.00 | Multimodal, production |
| GPT-4o-mini | $0.15 | $0.60 | Fast, cheap |

O-Series (Reasoning Models)

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| o4-mini | $1.10 | $4.40 | Latest reasoning, efficient |
| o3 | $2.00 | $8.00 | Advanced reasoning |
| o3-pro | $20.00 | $80.00 | Maximum reasoning power |
| o3-mini | $1.10 | $4.40 | Cost-effective reasoning |
| o1 | $15.00 | $60.00 | Complex problem-solving |
| o1-pro | $150.00 | $600.00 | Research-grade reasoning |
| o1-mini | $1.10 | $4.40 | Reasoning on a budget |

Hidden Costs with Reasoning Models
O-series models generate internal "thinking" tokens you pay for but don't see. A simple query can use 10x more tokens than expected. Always monitor actual usage.
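
You can at least measure the overhead: OpenAI's chat completions response includes a usage breakdown with a reasoning-token count for o-series models. A minimal sketch (field names follow the current OpenAI Node SDK; verify against your SDK version):

import OpenAI from 'openai';

const openai = new OpenAI();

const response = await openai.chat.completions.create({
  model: 'o3-mini',
  messages: [{ role: 'user', content: 'Plan a 3-step database migration.' }],
});

// Reasoning tokens are billed as output tokens but never returned to you.
const usage = response.usage;
const reasoning = usage?.completion_tokens_details?.reasoning_tokens ?? 0;
const visible = (usage?.completion_tokens ?? 0) - reasoning;
console.log(`visible output: ${visible} tokens, hidden reasoning: ${reasoning} tokens`);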

Anthropic Models

Claude 4.5 (Latest)

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Claude 4.5 Opus | $5.00 | $25.00 | Most capable Claude |
| Claude 4.5 Sonnet | $3.00 | $15.00 | Balanced performance |
| Claude 4.5 Haiku | $1.00 | $5.00 | Fast, cost-effective |

Claude 4 Series

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Claude 4 Opus | $15.00 | $75.00 | Complex tasks |
| Claude 4 Sonnet | $3.00 | $15.00 | Production workloads |

Claude 3.x Series

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Claude 3.7 Sonnet | $3.00 | $15.00 | Coding, analysis |
| Claude 3.5 Sonnet | $3.00 | $15.00 | General purpose |
| Claude 3.5 Haiku | $1.00 | $5.00 | Fast responses |
| Claude 3 Haiku | $0.25 | $1.25 | Cheapest Claude |

Google Gemini Models

Gemini 3 (Latest)

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Gemini 3 Pro | $2.00 | $12.00 | Latest, most capable |
| Gemini 3 Flash | $0.50 | $3.00 | Fast, efficient |

Gemini 2.5

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | Complex tasks |
| Gemini 2.5 Flash | $0.30 | $2.50 | Balanced |
| Gemini 2.5 Flash Lite | $0.10 | $0.40 | Ultra-low cost |

Gemini 2.0 (Deprecating March 2026)

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Gemini 2.0 Flash | $0.10 | $0.40 | Cheapest option (for now) |
| Gemini 2.0 Flash Lite | $0.075 | $0.30 | Absolute minimum cost |

Mistral Models

| Model | Input/1M | Output/1M | Best For |
|---|---|---|---|
| Mistral Large | $2.00 | $6.00 | Complex reasoning |
| Mistral Small | $0.20 | $0.60 | Fast, efficient |
| Codestral | $0.20 | $0.60 | Code generation |
| Ministral 8B | $0.10 | $0.10 | Edge deployment |
| Ministral 3B | $0.04 | $0.04 | Ultra-light |

Surprising Findings

1. The Price Range is Insane

Output tokens range from $0.04/1M (Ministral 3B) to $600/1M (o1-pro). That's a 15,000x difference.


2. The "Mini" Model Wars

Every provider now has a mini model competing for the budget tier:

  • GPT-5-nano: $0.05 input / $0.40 output
  • Gemini 2.5 Flash Lite: $0.10 input / $0.40 output
  • Ministral 3B: $0.04 input / $0.04 output
  • GPT-4o-mini: $0.15 input / $0.60 output

For simple classification and extraction, these models are practically free at scale.
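
To put numbers on it: classifying 1M short messages (say 150 input / 10 output tokens each) on GPT-5-nano works out to 150 × $0.05 + 10 × $0.40 ≈ $11.50 for the entire month.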

3. Reasoning Models are 10-100x More Expensive

The o-series models (o1, o3, o4-mini) cost significantly more—and that's just the sticker price. They also generate hidden "thinking" tokens that inflate your actual costs.

4. Claude 4.5 Opus is Surprisingly Cheap

At $5/1M input and $25/1M output, Claude 4.5 Opus is cheaper than Claude 4 Opus ($15/$75). Anthropic is getting more aggressive on pricing.

5. Google Offers the Best Value for High-Volume

Gemini 2.0 Flash at $0.10/$0.40 is hard to beat for volume, though it's deprecating in March 2026. Its successor, Gemini 2.5 Flash Lite, matches that $0.10/$0.40 price while maintaining quality.

Real Cost Comparisons

Let's calculate actual costs for common tasks:

Customer Support Bot (1M conversations/month)

Average: 500 input tokens, 300 output tokens per conversation

| Model | Monthly Cost |
|---|---|
| GPT-5-nano | $145 |
| GPT-4o-mini | $255 |
| Claude 3 Haiku | $500 |
| GPT-5 | $3,625 |
| Claude 4.5 Opus | $10,000 |
| o1 | $25,500 |
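
To show where these numbers come from: 1M conversations at 500/300 tokens each is 500M input and 300M output tokens per month. For GPT-5-nano, that's 500 × $0.05 + 300 × $0.40 = $25 + $120 = $145. Every other row follows the same formula with that model's rates.
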
Cost Savings Tip
For customer support, GPT-5-nano or GPT-4o-mini handles 90% of queries at 1/100th the cost of premium models. Route complex queries to better models.

Document Summarization (100K docs/month)

Average: 4,000 input tokens, 500 output tokens per document

| Model | Monthly Cost |
|---|---|
| Gemini 2.0 Flash | $60 |
| GPT-5-mini | $200 |
| Claude 3.5 Haiku | $650 |
| GPT-5 | $1,000 |
| Claude 4.5 Opus | $3,250 |

AI Agent (50 steps per task, 10K tasks/month)

Average: 1,000 input tokens, 500 output tokens per step

| Model | Monthly Cost |
|---|---|
| GPT-4o-mini | $225 |
| o3-mini | $1,650 |
| o3 | $3,000 |
| Claude 4.5 Sonnet | $5,250 |
| o1 | $22,500 |

My Recommendations

For Startups (Cost-Sensitive)

  1. Default to GPT-5-nano or GPT-4o-mini — Handle 80% of tasks
  2. Use Gemini 2.0/2.5 Flash for volume — Best price/performance
  3. Route complex queries to GPT-5 or Claude 4.5 Sonnet (routing sketch after this list)
  4. Avoid reasoning models unless necessary — 10x+ cost increase
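
The routing in point 3 doesn't need to be fancy. Here's a minimal sketch of tiered model selection; the complexity heuristic is a placeholder assumption (in practice, use a cheap classifier call or explicit task metadata), and the model IDs simply mirror the names in the tables above:

import OpenAI from 'openai';

const openai = new OpenAI();

// Placeholder heuristic: long or code-heavy prompts go to the stronger model.
function isComplex(prompt: string): boolean {
  return prompt.length > 2_000 || /```|stack trace|refactor|prove/i.test(prompt);
}

async function routedCompletion(prompt: string) {
  const model = isComplex(prompt) ? 'gpt-5' : 'gpt-5-nano';
  const response = await openai.chat.completions.create({
    model,
    messages: [{ role: 'user', content: prompt }],
  });
  return { model, text: response.choices[0].message.content };
}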

For Enterprise (Quality-Focused)

  1. Claude 4.5 Opus or GPT-5.2 for critical tasks
  2. Claude 4.5 Sonnet or GPT-5 for general production
  3. o3-mini for reasoning tasks — Good balance of capability and cost
  4. Track everything — Know your cost-per-feature

For AI Agents

  1. Start with GPT-4o-mini — Test your agent logic cheaply
  2. Use o3-mini for reasoning steps — Not o1 or o3-pro
  3. Batch and cache aggressively — Agents make many similar calls
  4. Set per-task budgets — Runaway agents can cost hundreds per task (budget sketch below)
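
For point 4, a per-task budget can be a running total that kills the loop once spend crosses a hard cap. A sketch, hard-coding GPT-4o-mini's rates ($0.15 in / $0.60 out per 1M) for illustration:

// Aborts an agent loop once accumulated spend crosses a hard cap.
class TaskBudget {
  private spentUsd = 0;
  constructor(private readonly capUsd: number) {}

  record(inputTokens: number, outputTokens: number): void {
    this.spentUsd += (inputTokens / 1e6) * 0.15 + (outputTokens / 1e6) * 0.60;
    if (this.spentUsd > this.capUsd) {
      throw new Error(`Budget exceeded: $${this.spentUsd.toFixed(4)} > $${this.capUsd}`);
    }
  }
}

// Usage inside the agent loop, after each model call:
//   budget.record(response.usage.prompt_tokens, response.usage.completion_tokens);
const budget = new TaskBudget(0.50); // hard cap: 50 cents per task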

How to Track All This

Knowing prices is one thing. Tracking actual costs in production is another. Provider dashboards show totals, but not which features drive costs.

I built Orbit to solve this. One-line SDK integration, and you get per-feature cost breakdowns across all providers:

import { Orbit } from '@with-orbit/sdk';
import OpenAI from 'openai';
import Anthropic from '@anthropic-ai/sdk';

const orbit = new Orbit({ apiKey: process.env.ORBIT_API_KEY });

// Track OpenAI costs by feature
const openai = orbit.wrapOpenAI(new OpenAI(), {
  feature: 'chat-assistant'
});

// Track Anthropic costs by feature
const anthropic = orbit.wrapAnthropic(new Anthropic(), {
  feature: 'document-analysis'
});

// All calls automatically tracked with cost, tokens, latency

Track LLM Costs Across All Providers

Orbit gives you real-time visibility into costs across OpenAI, Anthropic, Google, and more. See spending by feature, model, and environment.

  • 50+ models supported
  • Per-feature cost tracking
  • Free tier: 10,000 events/month
Start tracking free