1. Executive one-pager
All findings ranked by $/mo risk reduction. Severity (CRITICAL / HIGH / MEDIUM). Read in 60 seconds, decide what to fix this sprint.
Drop a GitHub repo URL containing your Anthropic API call sites. Within 1 hour, get 4 deterministic patterns checked — cache_control missing on static blocks, system prompts duplicated across files, oversized example blocks, role inconsistency. Ranked by $/mo recurring Anthropic API savings, with before/after diffs you can paste into a PR.
All findings ranked by $/mo risk reduction. Severity (CRITICAL / HIGH / MEDIUM). Read in 60 seconds, decide what to fix this sprint.
Actual patch snippets for the top 3 findings using the Anthropic Python SDK. Paste into a PR. No "go talk to a consultant" handwaving.
CRITICAL = $200-800/mo recurring Anthropic API leak per occurrence. HIGH = systematic cache miss across files. MEDIUM = cache-key fragmentation or low-volume role-misuse.
For each finding: confidence rating (0.0-1.0), $/mo risk-reduction estimate, implementation effort (LOC), and a rollout-safety strategy.
Implement the fixes, then re-submit the same repo. We re-run the analysis. If the remaining risk surface isn't measurably reduced, full refund.
Anthropic prompt-caching idiom · cache_control ephemeral wrapping · per-file system prompt centralization (prompts/system.py pattern) · few-shot example caching · system-parameter vs messages-role placement.
| Pattern | Why it matters | Typical severity |
|---|---|---|
| cache_control missing on static block | Static system prompt or RAG context re-sent at full input rate on every call. Adding ephemeral cache_control cuts cached-portion cost by 90% (cache-read $0.30/M vs $3/M Sonnet). | CRITICAL |
| System prompt duplicated across files | Same 2K-token system prompt copy-pasted in 3+ files. Stale copies drift. With cache_control: subtle whitespace differences between files split the cache key (Anthropic dedupes prefixes by content). | HIGH |
| Oversized example block uncached | 3-5K-char few-shot examples in user message without cache_control. Exactly the high-token static block prompt caching was designed for. Wrap with ephemeral cache_control. | MEDIUM |
| Role inconsistency | System content placed in messages array as role="system" instead of top-level system parameter. Reduces cache hit rate; Anthropic treats these differently for caching. | LOW |
| Retry gaps | No 5xx handling, no dead-letter queue. Lost events disappear from your books. | MEDIUM |
| This is... | This is not... |
|---|---|
| A one-shot, static-analysis audit of your handler | A monthly SaaS subscription with seat pricing |
| Code-level findings you can paste into a PR | Runtime observability requiring prod-API integration |
| Deterministic regex + AST (no LLM-in-the-loop) | "AI told me your code is bad" handwaving |
| Framework-agnostic (Express, Next.js, FastAPI, Django, Flask, Hono, Cloudflare Workers) | Locked to one framework or one runtime |
| Anonymous (we never touch your Anthropic API key) | A SOC2 audit or Anthropic-blessed compliance certification |
This is a brand-new product. The 6-pattern analyzer ships with 22/22 pytest coverage, but Anthropic Prompt Library Audit has delivered zero paid audits yet.
Honest first-customer offer: the first 3 customers pay $49 via manual invoice instead of the public $39. Email miloantaeus@gmail.com with subject "Anthropic Prompt Library Audit — first-3 beta" and your repo URL. We'll send a $49 PayPal invoice directly and run the audit the same hour. In exchange: a 90-day follow-up audit and permission to anonymize learnings into the pattern library.
Why honest pricing: consultants inflate "potential risk" projections to justify $5K engagement fees. There's no sponsor here, no funnel to upsell into a retainer. If the audit doesn't surface at least one CRITICAL or HIGH severity finding, refund. If you implement the fixes and the re-audit doesn't show measurable risk reduction, refund. The 30-day re-audit voucher is structural accountability, not marketing copy.
cache_control: {"type": "ephemeral"} on static system/user blocks >1000 chars, same multi-paragraph prompt duplicated across 3+ files, <example> or few-shot block >2000 chars not cache-wrapped, system role content placed in messages array instead of system parameter. Deterministic means: 0% hallucination rate, 100% reproducible findings.