Updated March 2026

AI Model Pricing
Comparison 2026

Compare pricing across 56+ models from 20 providers. Sort by cost, context window, or capabilities. Use the cost calculator to estimate your monthly spend.

Models

Providers

$0.05/M

Cheapest input

Free models

Showing 56 of 56 models

Prices in USD per 1M tokens

Model	Provider	Category	Input / 1M	Output / 1M	Context	Vision	Tools
glm-4.7-flash	Z.AI	free	Free	Free	128K	--		Use
glm-4.5-flash	Z.AI	free	Free	Free	128K	--		Use
gpt-5-nano	OpenAI	budget	$0.05	$0.40	128K	--		Use
llama-3.1-8b	Groq	budget	$0.05	$0.08	128K	--		Use
gemini-2.0-flash	Google	budget	$0.10	$0.40	1M			Use
mistral-small-4	Mistral	budget	$0.10	$0.30	128K	--		Use
step-3.5-flash	StepFun	budget	$0.10	$0.30	128K	--		Use
gpt-4o-mini	OpenAI	budget	$0.15	$0.60	128K			Use
gemini-2.5-flash	Google	budget	$0.15	$0.60	1M			Use
command-r	Cohere	budget	$0.15	$0.60	128K	--		Use
gpt-5.4-nano	OpenAI	budget	$0.20	$1.25	128K			Use
grok-4.1-fast	xAI	budget	$0.20	$0.50	131K	--		Use
llama-3.1-8b (Fireworks)	Fireworks	budget	$0.20	$0.20	128K	--		Use
jamba-1.5-mini	AI21	budget	$0.20	$0.40	256K	--	--	Use
mixtral-8x7b	Groq	budget	$0.24	$0.24	33K	--		Use
gpt-5-mini	OpenAI	budget	$0.25	$2.00	128K			Use
gemini-3.1-flash-lite	Google	budget	$0.25	$1.50	1M			Use
MiniMax-M2	MiniMax	budget	$0.26	$1.00	245K	--		Use
deepseek-chat	DeepSeek	budget	$0.27	$1.10	128K	--		Use
codestral	Mistral	code	$0.30	$0.90	256K	--	--	Use
MiniMax-M2.5	MiniMax	budget	$0.30	$1.20	1M			Use
qwen-turbo	Qwen	budget	$0.30	$0.60	128K	--		Use
llama-3.1-70b (DeepInfra)	DeepInfra	standard	$0.52	$0.75	128K	--		Use
deepseek-reasoner	DeepSeek	reasoning	$0.55	$2.19	128K	--	--	Use
llama-3.3-70b	Groq	standard	$0.59	$0.79	128K	--		Use
kimi-k2.5	Moonshot	standard	$0.60	$2.50	256K			Use
gpt-5.4-mini	OpenAI	standard	$0.75	$4.50	256K			Use
claude-haiku-3.5	Anthropic	budget	$0.80	$4.00	200K			Use
llama-3.3-70b (Cerebras)	Cerebras	standard	$0.85	$1.20	128K	--		Use
llama-3.1-70b (Together)	Together	standard	$0.88	$0.88	128K	--		Use
sonar	Perplexity	search	$1.00	$5.00	128K	--	--	Use
sonar-reasoning	Perplexity	reasoning	$1.00	$5.00	128K	--	--	Use
glm-5	Z.AI	standard	$1.00	$3.20	128K			Use
o4-mini	OpenAI	reasoning	$1.10	$4.40	200K			Use
o3-mini	OpenAI	reasoning	$1.10	$4.40	200K	--		Use
gpt-5.1	OpenAI	frontier	$1.25	$10.00	256K			Use
gemini-2.5-pro	Google	standard	$1.25	$10.00	1M			Use
qwen-max	Qwen	standard	$1.60	$6.40	128K			Use
gemini-3.1-pro	Google	frontier	$2.00	$12.00	2M			Use
mistral-large	Mistral	standard	$2.00	$6.00	128K	--		Use
sonar-deep-research	Perplexity	search	$2.00	$8.00	128K	--	--	Use
sonar-reasoning-pro	Perplexity	reasoning	$2.00	$8.00	128K	--	--	Use
jamba-1.5-large	AI21	standard	$2.00	$8.00	256K	--	--	Use
gpt-5.4	OpenAI	frontier	$2.50	$15.00	256K			Use
gpt-4o	OpenAI	standard	$2.50	$10.00	128K			Use
command-r-plus	Cohere	standard	$2.50	$10.00	128K	--		Use
claude-sonnet-4	Anthropic	standard	$3.00	$15.00	200K			Use
claude-sonnet-4.5	Anthropic	standard	$3.00	$15.00	200K			Use
grok-4	xAI	frontier	$3.00	$15.00	256K			Use
grok-3	xAI	standard	$3.00	$15.00	131K			Use
sonar-pro	Perplexity	search	$3.00	$15.00	200K	--	--	Use
llama-3.1-405b (Fireworks)	Fireworks	frontier	$3.00	$3.00	128K	--		Use
llama-3.1-405b (Together)	Together	frontier	$3.50	$3.50	128K	--		Use
llama-3.1-405b (SambaNova)	SambaNova	frontier	$5.00	$10.00	128K	--		Use
o3	OpenAI	reasoning	$10.00	$40.00	200K			Use
claude-opus-4	Anthropic	frontier	$15.00	$75.00	200K			Use

glm-4.7-flash

Z.AI

free

Input / 1M

Free

Output / 1M

Free

Context

128K

Features

Use through Curate-Me

glm-4.5-flash

Z.AI

free

Input / 1M

Free

Output / 1M

Free

Context

128K

Features

Use through Curate-Me

gpt-5-nano

OpenAI

budget

Input / 1M

$0.05

Output / 1M

$0.40

Context

128K

Features

Use through Curate-Me

llama-3.1-8b

Groq

budget

Input / 1M

$0.05

Output / 1M

$0.08

Context

128K

Features

Use through Curate-Me

gemini-2.0-flash

Google

budget

Input / 1M

$0.10

Output / 1M

$0.40

Context

Features

Use through Curate-Me

mistral-small-4

Mistral

budget

Input / 1M

$0.10

Output / 1M

$0.30

Context

128K

Features

Use through Curate-Me

step-3.5-flash

StepFun

budget

Input / 1M

$0.10

Output / 1M

$0.30

Context

128K

Features

Use through Curate-Me

gpt-4o-mini

OpenAI

budget

Input / 1M

$0.15

Output / 1M

$0.60

Context

128K

Features

Use through Curate-Me

gemini-2.5-flash

Google

budget

Input / 1M

$0.15

Output / 1M

$0.60

Context

Features

Use through Curate-Me

command-r

Cohere

budget

Input / 1M

$0.15

Output / 1M

$0.60

Context

128K

Features

Use through Curate-Me

gpt-5.4-nano

OpenAI

budget

Input / 1M

$0.20

Output / 1M

$1.25

Context

128K

Features

Use through Curate-Me

grok-4.1-fast

xAI

budget

Input / 1M

$0.20

Output / 1M

$0.50

Context

131K

Features

Use through Curate-Me

llama-3.1-8b (Fireworks)

Fireworks

budget

Input / 1M

$0.20

Output / 1M

$0.20

Context

128K

Features

Use through Curate-Me

jamba-1.5-mini

AI21

budget

Input / 1M

$0.20

Output / 1M

$0.40

Context

256K

Features

Use through Curate-Me

mixtral-8x7b

Groq

budget

Input / 1M

$0.24

Output / 1M

$0.24

Context

33K

Features

Use through Curate-Me

gpt-5-mini

OpenAI

budget

Input / 1M

$0.25

Output / 1M

$2.00

Context

128K

Features

Use through Curate-Me

gemini-3.1-flash-lite

Google

budget

Input / 1M

$0.25

Output / 1M

$1.50

Context

Features

Use through Curate-Me

MiniMax-M2

MiniMax

budget

Input / 1M

$0.26

Output / 1M

$1.00

Context

245K

Features

Use through Curate-Me

deepseek-chat

DeepSeek

budget

Input / 1M

$0.27

Output / 1M

$1.10

Context

128K

Features

Use through Curate-Me

codestral

Mistral

code

Input / 1M

$0.30

Output / 1M

$0.90

Context

256K

Features

Use through Curate-Me

MiniMax-M2.5

MiniMax

budget

Input / 1M

$0.30

Output / 1M

$1.20

Context

Features

Use through Curate-Me

qwen-turbo

Qwen

budget

Input / 1M

$0.30

Output / 1M

$0.60

Context

128K

Features

Use through Curate-Me

llama-3.1-70b (DeepInfra)

DeepInfra

standard

Input / 1M

$0.52

Output / 1M

$0.75

Context

128K

Features

Use through Curate-Me

deepseek-reasoner

DeepSeek

reasoning

Input / 1M

$0.55

Output / 1M

$2.19

Context

128K

Features

Use through Curate-Me

llama-3.3-70b

Groq

standard

Input / 1M

$0.59

Output / 1M

$0.79

Context

128K

Features

Use through Curate-Me

kimi-k2.5

Moonshot

standard

Input / 1M

$0.60

Output / 1M

$2.50

Context

256K

Features

Use through Curate-Me

gpt-5.4-mini

OpenAI

standard

Input / 1M

$0.75

Output / 1M

$4.50

Context

256K

Features

Use through Curate-Me

claude-haiku-3.5

Anthropic

budget

Input / 1M

$0.80

Output / 1M

$4.00

Context

200K

Features

Use through Curate-Me

llama-3.3-70b (Cerebras)

Cerebras

standard

Input / 1M

$0.85

Output / 1M

$1.20

Context

128K

Features

Use through Curate-Me

llama-3.1-70b (Together)

Together

standard

Input / 1M

$0.88

Output / 1M

$0.88

Context

128K

Features

Use through Curate-Me

sonar

Perplexity

Input / 1M

$1.00

Output / 1M

$5.00

Context

128K

Features

Use through Curate-Me

sonar-reasoning

Perplexity

reasoning

Input / 1M

$1.00

Output / 1M

$5.00

Context

128K

Features

Use through Curate-Me

glm-5

Z.AI

standard

Input / 1M

$1.00

Output / 1M

$3.20

Context

128K

Features

Use through Curate-Me

o4-mini

OpenAI

reasoning

Input / 1M

$1.10

Output / 1M

$4.40

Context

200K

Features

Use through Curate-Me

o3-mini

OpenAI

reasoning

Input / 1M

$1.10

Output / 1M

$4.40

Context

200K

Features

Use through Curate-Me

gpt-5.1

OpenAI

frontier

Input / 1M

$1.25

Output / 1M

$10.00

Context

256K

Features

Use through Curate-Me

gemini-2.5-pro

Google

standard

Input / 1M

$1.25

Output / 1M

$10.00

Context

Features

Use through Curate-Me

qwen-max

Qwen

standard

Input / 1M

$1.60

Output / 1M

$6.40

Context

128K

Features

Use through Curate-Me

gemini-3.1-pro

Google

frontier

Input / 1M

$2.00

Output / 1M

$12.00

Context

Features

Use through Curate-Me

mistral-large

Mistral

standard

Input / 1M

$2.00

Output / 1M

$6.00

Context

128K

Features

Use through Curate-Me

sonar-deep-research

Perplexity

Input / 1M

$2.00

Output / 1M

$8.00

Context

128K

Features

Use through Curate-Me

sonar-reasoning-pro

Perplexity

reasoning

Input / 1M

$2.00

Output / 1M

$8.00

Context

128K

Features

Use through Curate-Me

jamba-1.5-large

AI21

standard

Input / 1M

$2.00

Output / 1M

$8.00

Context

256K

Features

Use through Curate-Me

gpt-5.4

OpenAI

frontier

Input / 1M

$2.50

Output / 1M

$15.00

Context

256K

Features

Use through Curate-Me

gpt-4o

OpenAI

standard

Input / 1M

$2.50

Output / 1M

$10.00

Context

128K

Features

Use through Curate-Me

command-r-plus

Cohere

standard

Input / 1M

$2.50

Output / 1M

$10.00

Context

128K

Features

Use through Curate-Me

claude-sonnet-4

Anthropic

standard

Input / 1M

$3.00

Output / 1M

$15.00

Context

200K

Features

Use through Curate-Me

claude-sonnet-4.5

Anthropic

standard

Input / 1M

$3.00

Output / 1M

$15.00

Context

200K

Features

Use through Curate-Me

grok-4

xAI

frontier

Input / 1M

$3.00

Output / 1M

$15.00

Context

256K

Features

Use through Curate-Me

grok-3

xAI

standard

Input / 1M

$3.00

Output / 1M

$15.00

Context

131K

Features

Use through Curate-Me

sonar-pro

Perplexity

Input / 1M

$3.00

Output / 1M

$15.00

Context

200K

Features

Use through Curate-Me

llama-3.1-405b (Fireworks)

Fireworks

frontier

Input / 1M

$3.00

Output / 1M

$3.00

Context

128K

Features

Use through Curate-Me

llama-3.1-405b (Together)

Together

frontier

Input / 1M

$3.50

Output / 1M

$3.50

Context

128K

Features

Use through Curate-Me

llama-3.1-405b (SambaNova)

SambaNova

frontier

Input / 1M

$5.00

Output / 1M

$10.00

Context

128K

Features

Use through Curate-Me

OpenAI

reasoning

Input / 1M

$10.00

Output / 1M

$40.00

Context

200K

Features

Use through Curate-Me

claude-opus-4

Anthropic

frontier

Input / 1M

$15.00

Output / 1M

$75.00

Context

200K

Features

Use through Curate-Me

Cost Calculator

Calculate Your Monthly Cost

Select your expected monthly token volume and workload type to see the cheapest models for your use case.

Monthly tokens

10K500K tokens/mo10M

Workload type

Top 5 cheapest models for your workload

llama-3.1-8b

Groq

$0.03

/month

mistral-small-4

Mistral

$0.10

/month

step-3.5-flash

StepFun

$0.10

/month

llama-3.1-8b (Fireworks)

Fireworks

$0.10

/month

gpt-5-nano

OpenAI

$0.11

/month

Start using these models through Curate-Me

Free tier includes 1,000 gateway requests/day. No credit card required.

Access all 56+ models through one gateway

Point your AI SDK at Curate-Me and get cost tracking, personal data scanning, rate limiting, and HITL approvals across every provider. Zero code changes.

# Before (direct to OpenAI):

OPENAI_BASE_URL=https://api.openai.com/v1

# After (through Curate-Me):

OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai

X-CM-API-Key: cm_sk_xxx

Start for Free View Plans

AI Model PricingComparison 2026

Calculate Your Monthly Cost

Top 5 cheapest models for your workload

Access all 56+ models through one gateway

AI Model Pricing
Comparison 2026