| Provider | Model | Monthly cost | vs cheapest |
|---|---|---|---|
| OpenAI | — | $0/mo | — |
| Anthropic | — | $0/mo | — |
| Google | Gemini 2.5 Pro | $0/mo | — |
List-price math only. Prompt caching and batch discounts can shift the verdict, and their impact is workload-specific. The deep audit models your real bill with discounts included.
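To see why discounts matter, here is a minimal sketch of an effective-rate calculation. The discount factors below are illustrative assumptions, not quoted prices; check each provider's current caching and batch pricing before relying on them.

```ts
// Effective input rate after prompt caching and batch discounts.
// All discount factors are illustrative placeholders, not quoted prices.
interface DiscountModel {
  cacheHitRate: number;     // fraction of input tokens served from cache (0..1)
  cachedReadFactor: number; // cached-read price as a fraction of the base input rate
  batchFactor: number;      // batch-API price as a fraction of list price
}

function effectiveInputRate(listRate: number, d: DiscountModel): number {
  const blended =
    listRate * (1 - d.cacheHitRate) +           // uncached tokens at list price
    listRate * d.cachedReadFactor * d.cacheHitRate; // cached tokens at the discounted rate
  return blended * d.batchFactor;
}

// Example: $3.00/MTok list price, 70% cache hit rate, cached reads at 0.1x,
// everything routed through a half-price batch tier.
console.log(effectiveInputRate(3.0, { cacheHitRate: 0.7, cachedReadFactor: 0.1, batchFactor: 0.5 }));
// => 0.555 ($/MTok), roughly 5.4x cheaper than list price. Enough to flip a verdict.
```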
| Model | Input ($/MTok) | Output ($/MTok) | Tier / typical use |
|---|---|---|---|
| GPT-4o-mini | $0.15 | $0.60 | Cheap, classification |
| Claude Haiku 4.5 | $1.00 | $5.00 | Cheap, structured extraction |
| Gemini 2.5 Pro | $1.25 | $5.00 | Mid, huge context |
| GPT-4o | $2.50 | $10.00 | Mid, general agent |
| Claude Sonnet 4 | $3.00 | $15.00 | Mid, general agent |
| o1-mini | $3.00 | $12.00 | Mid, reasoning |
| o1 | $15.00 | $60.00 | Top, hard reasoning |
| Claude Opus 4.1 | $15.00 | $75.00 | Top, hard reasoning |
One-page provider comparison + one-page "which model for which workload" decision tree. The cheat sheet most engineering teams actually pin to the wall. PDF sent to your inbox.
For each model: monthly_cost = (input_tokens × input_rate / 1,000,000) + (output_tokens × output_rate / 1,000,000)
Example: 10 million input + 2 million output tokens compared across the three providers' mid-tier models. Claude Sonnet 4: (10 × $3.00) + (2 × $15.00) = $60/mo. GPT-4o: (10 × $2.50) + (2 × $10.00) = $45/mo. Gemini 2.5 Pro: (10 × $1.25) + (2 × $5.00) = $22.50/mo.
For this workload, GPT-4o is 25% cheaper than Sonnet ($45 vs $60), and Gemini 2.5 Pro is 50% cheaper than GPT-4o (and 63% cheaper than Sonnet). The verdict flips with different input/output ratios: output-heavy workloads tilt toward whichever provider has cheaper output tokens at your tier.
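A minimal sketch of that math in TypeScript, with rates hard-coded from the table above (this is not the calculator's actual source, just the same arithmetic):

```ts
// List-price rates in $ per million tokens, taken from the pricing table above.
const RATES = {
  "gpt-4o":          { input: 2.50, output: 10.00 },
  "claude-sonnet-4": { input: 3.00, output: 15.00 },
  "gemini-2.5-pro":  { input: 1.25, output: 5.00 },
} as const;

// monthly_cost = (input_tokens x input_rate + output_tokens x output_rate) / 1,000,000
function monthlyCost(model: keyof typeof RATES, inputTokens: number, outputTokens: number): number {
  const r = RATES[model];
  return (inputTokens * r.input + outputTokens * r.output) / 1_000_000;
}

// The worked example: 10M input + 2M output tokens per month.
for (const model of Object.keys(RATES) as (keyof typeof RATES)[]) {
  console.log(model, monthlyCost(model, 10_000_000, 2_000_000));
}
// gpt-4o 45, claude-sonnet-4 60, gemini-2.5-pro 22.5
```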
It depends on which models. Cheap tier: GPT-4o-mini ($0.15/$0.60) undercuts Claude Haiku 4.5 ($1/$5). Mid tier: GPT-4o ($2.50/$10) is roughly comparable to Claude Sonnet 4 ($3/$15). Top tier: Claude Opus 4.1 ($15/$75) costs more than o1 ($15/$60), with the same input rate but a higher output rate. The right answer is workload-specific: run your real input/output token ratio through this calculator.
Yes, until cost dominates. For most 2026 production agents, Sonnet 4 and GPT-4o are interchangeable on common workloads — the cheaper call wins. The cheap-tier models are NOT interchangeable: Haiku is better at structured extraction with long context, GPT-4o-mini wins on raw speed and price. Test on your own evals first.
Because the OpenAI-vs-Anthropic frame leaves money on the table. Gemini 2.5 Pro is competitive at the mid tier and serious for high-context workloads (2M token context, 10x the others). We don't recommend it for every workload, but a 3-way comparison gives you the floor on what "best price for this workload" actually means.
Often yes. Cost-optimized 2026 stacks use 2-3 providers: GPT-4o-mini or Haiku 4.5 for cheap high-volume tasks, Sonnet 4 or GPT-4o for the main agent loop, Opus 4.1 or o1 only for the hardest reasoning. The infrastructure cost of multi-provider routing (LiteLLM, OpenRouter) is rounding error against the 30-60% bill reduction it typically delivers.
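As a shape for what that routing looks like, here is a hedged sketch: a tier-to-model map in front of an OpenAI-compatible chat completions endpoint. The OpenRouter URL, model IDs, and tier assignments below are placeholders for illustration, not recommendations; substitute whatever routing layer you actually run.

```ts
// Hypothetical tier-based router against an OpenAI-compatible API.
// Endpoint and model IDs are assumptions for illustration only.
type Tier = "cheap" | "mid" | "top";

const MODEL_FOR_TIER: Record<Tier, string> = {
  cheap: "openai/gpt-4o-mini",        // high-volume classification/extraction
  mid:   "anthropic/claude-sonnet-4", // main agent loop
  top:   "openai/o1",                 // hardest reasoning, used sparingly
};

async function route(tier: Tier, prompt: string): Promise<string> {
  const res = await fetch("https://openrouter.ai/api/v1/chat/completions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${process.env.OPENROUTER_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: MODEL_FOR_TIER[tier],
      messages: [{ role: "user", content: prompt }],
    }),
  });
  const data = await res.json();
  return data.choices[0].message.content; // OpenAI-compatible response shape
}
```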
No. Token volumes, model picks, and the rate math run locally in your browser. The page fires an anonymous pageview beacon and CTA-click events; it stores no calculator inputs, no email (unless you submit the cheat-sheet form), and no raw IPs.