List-price math only. Prompt caching can cut real bills 30-70% on workloads with a stable system block. The deep audit models your actual usage with caching included.
| Model | Input ($/MTok) | Output ($/MTok) | Best for |
|---|---|---|---|
| Claude Sonnet 4 | $3.00 | $15.00 | General agents, code, default production |
| Claude Opus 4.1 | $15.00 | $75.00 | Hardest reasoning, multi-step planning |
| Claude Haiku 4.5 | $1.00 | $5.00 | Classification, extraction, high-volume Q&A |
One-page rate reference + one-page "5 ways your Claude bill goes 3x over list price" — the most common waste patterns we see in real audits. PDF sent to your inbox.
For each model: monthly_cost = (input_tokens × input_rate / 1,000,000) + (output_tokens × output_rate / 1,000,000)
Example: 5 million input tokens + 1 million output tokens on Sonnet 4 = (5 × $3) + (1 × $15) = $30/month at list price. Same workload on Opus 4.1 = (5 × $15) + (1 × $75) = $150/month. Same workload on Haiku 4.5 = (5 × $1) + (1 × $5) = $10/month.
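The formula and worked example above can be sketched in a few lines. Rates come straight from the table; the model keys are just labels for this sketch, not official API identifiers.

```python
# List-price rates from the table above, in $/MTok: (input_rate, output_rate).
RATES = {
    "sonnet-4": (3.00, 15.00),
    "opus-4.1": (15.00, 75.00),
    "haiku-4.5": (1.00, 5.00),
}

def monthly_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """List-price monthly cost in dollars (no caching or batch discounts)."""
    in_rate, out_rate = RATES[model]
    return (input_tokens * in_rate / 1_000_000
            + output_tokens * out_rate / 1_000_000)

# The worked example: 5M input + 1M output tokens per month.
print(monthly_cost("sonnet-4", 5_000_000, 1_000_000))   # 30.0
print(monthly_cost("opus-4.1", 5_000_000, 1_000_000))   # 150.0
print(monthly_cost("haiku-4.5", 5_000_000, 1_000_000))  # 10.0
```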
The 15x cost gap between Haiku and Opus is real and rarely justified for the same task — most production agents overpay 2-4x by defaulting to the highest-tier model when the workload doesn't need it. That's the single biggest line item in most Claude bills we audit.
Published per-million-token rates as of 2026-05 from Anthropic's pricing page. Sonnet 4: $3 input / $15 output. Opus 4.1: $15 / $75. Haiku 4.5: $1 / $5. Anthropic occasionally updates these — always cross-check against anthropic.com/pricing before signing a contract.
From console.anthropic.com, open the Usage tab and export your last 30 days. The export breaks down input vs output tokens per model. If you don't have a billing relationship yet, a rough heuristic: 1,000 tokens is about 750 English words. A typical chat turn is 200-500 input tokens and 100-300 output tokens.
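If you only have word counts, the rough heuristic above (about 750 English words per 1,000 tokens) can be turned into a quick estimator — a sketch, not a tokenizer:

```python
def estimate_tokens(word_count: int) -> int:
    # Rough heuristic from above: ~1,000 tokens per 750 English words.
    # Real token counts vary with language, formatting, and code content.
    return round(word_count * 1000 / 750)

print(estimate_tokens(750))    # 1000
print(estimate_tokens(10_000)) # ~13,333 tokens for a 10k-word corpus
```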
No. List-price math only. Prompt caching offers up to a 90% discount on cache reads — for prompts with a stable system block, real bills are often 30-70% below this estimate. If you want a workload-specific cached-price model, run the $299 LLM Bill Triage on your actual usage export.
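For a back-of-envelope cached estimate, a minimal sketch: it assumes cache reads bill at 10% of the base input rate (the "up to 90% discount" above), treats `cache_hit_fraction` as the share of input tokens served from cache, and ignores the one-time cache-write surcharge — the triage models all of this precisely.

```python
def cached_input_cost(input_tokens: int, in_rate: float,
                      cache_hit_fraction: float) -> float:
    """Estimated monthly input cost in dollars with prompt caching.

    Assumes cache reads cost 10% of the base input rate and ignores
    cache-write surcharges; a rough sketch, not a billing model.
    """
    cached = input_tokens * cache_hit_fraction
    fresh = input_tokens - cached
    return (fresh * in_rate + cached * in_rate * 0.10) / 1_000_000

# 5M input tokens on Sonnet 4 ($3/MTok), 60% of tokens read from cache:
print(cached_input_cost(5_000_000, 3.00, 0.6))  # 6.9 vs 15.0 uncached
```

At a 60% hit rate the input line drops 54% — inside the 30-70% range quoted above.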
Haiku 4.5 for high-volume classification, extraction, and short Q&A — fastest and cheapest. Sonnet 4 for general agents, code generation, and most production use — the default in 2026. Opus 4.1 for the hardest reasoning, multi-step planning, and tasks where one wrong answer is expensive — 5x Sonnet, only worth it where it earns the markup.
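The tier guidance above can be encoded as a simple routing default. The task labels here are hypothetical names for illustration, not an official taxonomy:

```python
# Hypothetical task labels mapped to the tier guidance above.
TIER_FOR_TASK = {
    "classification": "haiku-4.5",
    "extraction": "haiku-4.5",
    "short_qa": "haiku-4.5",
    "general_agent": "sonnet-4",
    "code_generation": "sonnet-4",
    "multi_step_planning": "opus-4.1",
    "hard_reasoning": "opus-4.1",
}

def pick_model(task: str) -> str:
    # Unknown workloads fall back to Sonnet 4, the production default above.
    return TIER_FOR_TASK.get(task, "sonnet-4")

print(pick_model("extraction"))      # haiku-4.5
print(pick_model("something_else"))  # sonnet-4
```

Routing by task class before reaching for Opus is exactly the 2-4x overpay fix described earlier.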
No. Token volumes and rate math run locally in your browser. The page fires an anonymous pageview beacon and CTA-click events so we can measure whether the calculator is useful — no inputs, no email (unless you submit one to the cheat-sheet form), no raw IP stored.