Milo Antaeus · LLM Bill X-Ray $79 · $299 deep triage

Anthropic Prompt Library Audit

Drop the URL of a GitHub repo containing your Anthropic API call sites. Within 1 hour, get a report covering 4 deterministic patterns: missing cache_control on static blocks, system prompts duplicated across files, oversized example blocks, and role inconsistency. Findings are ranked by $/mo of recurring Anthropic API savings, with before/after diffs you can paste into a PR.

$39
one-time · 1-hour delivery
30-day money-back
→ Synthetic sample report ($1,890/mo, 4 findings)
→ LIVE analyzer output: anthropic-cookbook (1 finding, $0 — clean-repo honesty proof)
→ LIVE analyzer output: litellm (10 findings, $1,267/mo — asymmetry-proof counterpart)
The two live demos form an asymmetry-proof pair: same engine, 1 finding on clean reference code (anthropic-cookbook) vs 10 findings on real production code (litellm). The 10× difference shows the analyzer doesn't manufacture findings to justify the $39.
Same architecture we use for LLM Bill X-Ray. Four deterministic patterns. Zero LLM-in-the-loop. 30-day money-back if Anthropic bill doesn't drop by $39/mo (verifiable in console.anthropic.com).

What's in the audit

1. Executive one-pager

All findings ranked by $/mo risk reduction. Severity (CRITICAL / HIGH / MEDIUM). Read in 60 seconds, decide what to fix this sprint.

2. Before/after code diffs

Actual patch snippets for the top 3 findings using the Anthropic Python SDK. Paste into a PR. No "go talk to a consultant" handwaving.

3. Risk classification by severity

CRITICAL = $200-800/mo recurring Anthropic API leak per occurrence. HIGH = systematic cache miss across files. MEDIUM = cache-key fragmentation or low-volume role-misuse.

4. Implementation effort + confidence

For each finding: confidence rating (0.0-1.0), $/mo risk-reduction estimate, implementation effort (LOC), and a rollout-safety strategy.

5. 30-day re-audit voucher

Implement the fixes, then re-submit the same repo. We re-run the analysis. If the remaining risk surface isn't measurably reduced, full refund.
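Concretely, the per-finding fields from item 4 could be serialized like this. This is a hypothetical sketch: the field names and values are illustrative, not actual analyzer output.

```python
# Hypothetical shape of one finding record; field names mirror the report
# attributes listed above (severity, confidence, $/mo risk reduction, effort,
# rollout safety) but are illustrative, not the analyzer's real schema.
finding = {
    "pattern": "cache_control_missing_on_static_block",
    "severity": "CRITICAL",
    "confidence": 0.92,                 # 0.0-1.0 scale from the report spec
    "monthly_risk_reduction_usd": 640,  # estimated recurring savings
    "effort_loc": 12,                   # estimated lines of code to change
    "rollout_safety": "Ship behind a flag; compare one billing day in console.anthropic.com.",
}
```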

+ Vendor-specific tactics

Anthropic prompt-caching idiom · cache_control ephemeral wrapping · per-file system prompt centralization (prompts/system.py pattern) · few-shot example caching · system-parameter vs messages-role placement.
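The "prompts/system.py pattern" named above can be sketched as a single module that every call site imports. The prompt text and function name here are illustrative assumptions, not a prescribed layout:

```python
# prompts/system.py — single source of truth for the system prompt.
# Sketch only: the SYSTEM_PROMPT content and helper name are placeholders.

SYSTEM_PROMPT = """You are a support assistant for Acme.
Answer from the provided context only."""

def system_blocks() -> list[dict]:
    """Return the system prompt as a cacheable block list.

    Every call site imports this instead of pasting its own copy, so the
    byte-for-byte identical prefix hits the same Anthropic cache entry
    (whitespace drift between hand-pasted copies would split the cache key).
    """
    return [
        {
            "type": "text",
            "text": SYSTEM_PROMPT,
            "cache_control": {"type": "ephemeral"},
        }
    ]
```

A call site then passes `system=system_blocks()` to `messages.create` instead of its own string literal.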

The 4 patterns we check

Pattern · Why it matters · Typical severity
cache_control missing on static block · Static system prompt or RAG context is re-sent at the full input rate on every call. Adding ephemeral cache_control cuts the cached portion's cost by 90% (cache reads at $0.30/M vs $3/M input on Sonnet). · CRITICAL
System prompt duplicated across files · The same 2K-token system prompt is copy-pasted in 3+ files, so stale copies drift. With cache_control, subtle whitespace differences between files split the cache key (Anthropic dedupes prefixes by exact content). · HIGH
Oversized example block uncached · 3-5K-char few-shot examples sit in the user message without cache_control; this is exactly the high-token static block prompt caching was designed for. Wrap them with ephemeral cache_control. · MEDIUM
Role inconsistency · System content is placed in the messages array as role="system" instead of the top-level system parameter. This reduces the cache hit rate; Anthropic treats the two placements differently for caching. · MEDIUM
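For the first pattern, the before/after shape looks roughly like this. A sketch only: it builds the request kwargs for the Anthropic Python SDK's messages.create without sending anything, and the model name and prompt text are placeholders.

```python
# Before/after for pattern 1, expressed as request kwargs (no network call here;
# pass either dict to anthropic.Anthropic().messages.create(**request) to send).
STATIC_CONTEXT = "...2K tokens of product docs, unchanged between calls..."

# Before: a plain-string system prompt is re-billed at the full input rate
# on every call, because a bare string can't carry cache_control.
uncached_request = {
    "model": "claude-sonnet-4-20250514",  # placeholder model id
    "max_tokens": 1024,
    "system": STATIC_CONTEXT,
    "messages": [{"role": "user", "content": "How do refunds work?"}],
}

# After: system becomes a block list and the static block is marked ephemeral,
# so repeat calls read it at the cache-read rate instead.
cached_request = {
    **uncached_request,
    "system": [
        {
            "type": "text",
            "text": STATIC_CONTEXT,
            "cache_control": {"type": "ephemeral"},
        }
    ],
}
```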

How it works

  1. Pay $39 via PayPal (top of page). You're redirected to an intake page that asks for your GitHub repo URL + email.
  2. Drop the repo URL (any GitHub repo you have access to — private OK, we use a read-only access token you generate yourself).
  3. Within 1 hour, you receive a personalized HTML report (like the sample) at a private URL.
  4. Implement the fixes. Most customers ship the top 3 within a sprint.
  5. 30 days later, redeem the re-audit voucher. We re-run and confirm the risk surface dropped.

What this isn't

This is... · This is not...
A one-shot, static-analysis audit of your Anthropic call sites · A monthly SaaS subscription with seat pricing
Code-level findings you can paste into a PR · Runtime observability requiring prod-API integration
Deterministic regex + AST (no LLM-in-the-loop) · "AI told me your code is bad" handwaving
Framework-agnostic (Express, Next.js, FastAPI, Django, Flask, Hono, Cloudflare Workers) · Locked to one framework or one runtime
Anonymous (we never touch your Anthropic API key) · A SOC 2 audit or Anthropic-blessed compliance certification

First-3-customers beta pricing

This is a brand-new product. The 4-pattern analyzer ships with a 22/22 passing pytest suite, but Anthropic Prompt Library Audit has delivered zero paid audits yet.

Honest first-customer offer: the first 3 customers pay $49 via manual invoice instead of the public $39, and in exchange get a 90-day follow-up audit; we ask permission to anonymize learnings into the pattern library. Email miloantaeus@gmail.com with subject "Anthropic Prompt Library Audit — first-3 beta" and your repo URL. We'll send a $49 PayPal invoice directly and run the audit the same hour.

Why honest pricing: consultants inflate "potential risk" projections to justify $5K engagement fees. There's no sponsor here, no funnel to upsell into a retainer. If the audit doesn't surface at least one CRITICAL or HIGH severity finding, refund. If you implement the fixes and the re-audit doesn't show measurable risk reduction, refund. The 30-day re-audit voucher is structural accountability, not marketing copy.

FAQ

Do you need access to my prod environment or Anthropic API key?
No. Static analysis only. You generate a GitHub read-only access token (we walk you through it on the intake page), we clone the repo, run the analyzer, and discard the clone. No prod traffic, no Anthropic API keys, no observability tooling.
How are you finding integration risks without LLM-in-the-loop?
The analyzer is 4 deterministic regex + AST patterns for known Anthropic prompt anti-patterns: missing cache_control: {"type": "ephemeral"} on static system/user blocks >1000 chars, same multi-paragraph prompt duplicated across 3+ files, <example> or few-shot block >2000 chars not cache-wrapped, system role content placed in messages array instead of system parameter. Deterministic means: 0% hallucination rate, 100% reproducible findings.
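A toy version of the first check illustrates what "deterministic" means here. This is not the production rule set, just an assumed simplification that flags long plain-string system prompts (which can never carry cache_control):

```python
import re

# Matches system=<triple-quoted string> longer than 1000 chars. A plain string
# can never carry cache_control, so any match is a deterministic finding.
LONG_SYSTEM = re.compile(
    "system\\s*=\\s*(?P<q>\"\"\"|''')(?P<body>.{1000,}?)(?P=q)",
    re.DOTALL,
)

def flag_missing_cache_control(source: str) -> list[str]:
    """Return a 60-char preview of each long plain-string system prompt."""
    return [m.group("body")[:60] for m in LONG_SYSTEM.finditer(source)]
```

The same input always yields the same output, which is what the 0%-hallucination, 100%-reproducible claim above amounts to.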
What if my repo is private?
You generate a fine-grained GitHub personal access token (PAT) scoped to read-only on the single repo. Add it to the intake form. We clone, analyze, delete. The PAT can be revoked the moment you receive the report.
What frameworks / runtimes do you support?
v1 supports Express, Next.js (App + Pages router), FastAPI, Django, Flask, Hono, and Cloudflare Workers. Languages: TypeScript / JavaScript / Python. If your Anthropic call sites are in a language we don't support, refund.
What if you don't find at least one CRITICAL or HIGH finding?
Refund. We've never run the analyzer on a production repo and found zero findings — but if your call sites are already textbook-clean, you get your $39 back and a one-line note confirming the pass.
How is "$/mo risk reduction" calculated? It's not literal $/mo savings, right?
It is literal savings. Anthropic Prompt Library Audit measures recurring Anthropic API savings, and every fix is verifiable in console.anthropic.com on the next billing cycle. Example: adding cache_control to a 2,100-token system prompt called 200K times/mo saves roughly $1,200/mo. Unlike incident-based audits, the math is deterministic and the savings are auditable.
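The example's arithmetic can be reproduced with the Sonnet rates quoted in the pattern table ($3/M input, $0.30/M cache read). A back-of-envelope sketch: it ignores the one-time cache-write premium and assumes every call after the first hits the cache, so it lands a bit under the page's rounded figure.

```python
tokens_per_call = 2_100              # static system prompt size
calls_per_month = 200_000
input_rate = 3.00 / 1_000_000        # $/token at the uncached Sonnet input rate
cache_read_rate = 0.30 / 1_000_000   # $/token when read from cache

monthly_tokens = tokens_per_call * calls_per_month   # 420M tokens/mo
uncached_cost = monthly_tokens * input_rate          # $1,260/mo without caching
cached_cost = monthly_tokens * cache_read_rate       # $126/mo with caching
savings = uncached_cost - cached_cost                # about $1,134/mo
```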
How is this different from a paid optimization consultation?
An Anthropic-sponsored optimization consultation typically requires the Enterprise tier ($5K+/mo). Anthropic Prompt Library Audit is $39 and runs in 1 hour. It's the right tool for "I want my Anthropic bill to drop 30-60% via prompt caching this week, without scheduling meetings."

Related

→ See the synthetic sample report first ($1,890/mo recurring Anthropic API savings across 4 findings)