Agent Failure Forensics — AI Log Diagnosis That Actually Tells You What Broke

How it works

Three steps from "something's wrong" to "here's exactly what to fix"

No onboarding calls. No configuration. No guesswork about where to start.

Upload sanitised logs

Drag and drop your agent logs, API traces, or cron output. Sanitise first — we don't need credentials, just the execution record.
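A redaction pass before upload can be quite small. The sketch below is one way to do it, assuming secrets show up as bearer tokens, `sk-`-prefixed keys, or `key=value` pairs, and that email addresses are the only PII in the logs. The patterns and the names `sanitise_line` and `sanitise_log` are illustrative assumptions, not part of the product.

```python
import re

# Illustrative patterns only (assumptions); extend them for whatever
# secrets and identifiers your own stack writes to its logs.
REDACTIONS = [
    # "Bearer <token>" in Authorization headers
    (re.compile(r"(?i)\b(bearer)\s+[A-Za-z0-9._~+/=-]+"), r"\1 [REDACTED]"),
    # Keys that start with "sk-"
    (re.compile(r"\bsk-[A-Za-z0-9_-]{8,}"), "[REDACTED_KEY]"),
    # key=value or key: value pairs with a sensitive-looking key name
    (re.compile(r"(?i)\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+"),
     r"\1\2[REDACTED]"),
    # Email addresses
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
     "[REDACTED_EMAIL]"),
]


def sanitise_line(line: str) -> str:
    """Apply every redaction pattern to a single log line."""
    for pattern, replacement in REDACTIONS:
        line = pattern.sub(replacement, line)
    return line


def sanitise_log(text: str) -> str:
    """Sanitise a whole log, keeping timestamps and structure intact."""
    return "\n".join(sanitise_line(line) for line in text.splitlines())
```

Timestamps, status codes, and tool names pass through untouched, so the execution record stays readable.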

→

Get a structured report

Within minutes, receive a diagnosis report with severity-ranked findings, each traced back to specific log entries — not vibes or hunches.

→

Fix with confidence

Use the included replay fixture to reproduce the failure in CI, and the regression checklist to confirm the fix before you ship.

Sample output

This is what your report looks like

Every finding is traceable to a specific line in your logs. You verify; you don't trust.

agent-failure-forensics-report-2026-05-08.html 3 findings · 1 fixture · 1 checklist

CASCADING HALLUCINATION — downstream data corruption Critical

An LLM-generated tool parameter passed a type-only schema check with no semantic validation. The downstream tool call then succeeded silently with the wrong input, producing corrupted vector-store entries that propagated to the next three agent tasks.

Evidence chain

[09:14:02] LLM output → {"tool": "upsert_vector", "param": {"id": "usr_0091", "score": "0.91", ...}}
[09:14:02] Schema validation: PASSED (param.id is string, LLM returned string — type match only, semantic mismatch)
[09:14:03] Tool call: upsert_vector → SUCCESS
[09:14:08] Downstream read: returned score=0.91 instead of expected 0.73
→ Next 3 agent tasks built on wrong confidence score.
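A semantic check closes the gap that a type-only check leaves open. The sketch below is one possible shape for it, assuming a trusted upstream source for each record's score is available at validation time. The `validate_upsert` function, the `trusted_scores` mapping, and the tolerance are hypothetical and are not taken from the report.

```python
from dataclasses import dataclass


class SemanticValidationError(ValueError):
    """Raised when a parameter is well-formed but contradicts known data."""


@dataclass(frozen=True)
class UpsertParams:
    id: str
    score: float


def validate_upsert(raw: dict, trusted_scores: dict[str, float],
                    tolerance: float = 0.01) -> UpsertParams:
    # Type-level check on the id, similar to what the log shows passing.
    record_id = raw.get("id")
    if not isinstance(record_id, str):
        raise SemanticValidationError("id must be a string")

    # Reject numeric strings such as "0.91" instead of coercing them silently.
    score = raw.get("score")
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise SemanticValidationError(f"score must be a number, got {score!r}")
    if not 0.0 <= score <= 1.0:
        raise SemanticValidationError(f"score {score} is outside [0, 1]")

    # Semantic check: the value must agree with the trusted upstream score.
    expected = trusted_scores.get(record_id)
    if expected is None:
        raise SemanticValidationError(f"no trusted score for {record_id!r}")
    if abs(score - expected) > tolerance:
        raise SemanticValidationError(
            f"score {score} for {record_id!r} disagrees with expected {expected}"
        )
    return UpsertParams(id=record_id, score=float(score))
```

With the values from the evidence chain, both the string "0.91" and the number 0.91 would be rejected for usr_0091 before the upsert runs.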

SILENT TOOL CALL FAILURE — no exception thrown High

Tool call returned HTTP 429 (rate limit) but the agent scaffold swallowed the error and retried with identical parameters, burning 4× the expected token budget before moving on without the intended result.

Evidence chain

[11:02:11] POST /api/embed → HTTP 429, retry=1
[11:02:13] POST /api/embed → HTTP 429, retry=2
[11:02:15] POST /api/embed → HTTP 429, retry=3
[11:02:17] Agent continued without embedding — no exception raised, no user notification.
Token waste: ~$0.38 at current rate × 3 retries × 4 similar incidents today = ~$4.56/day.
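One common remedy for this pattern is to back off between retries and raise once they are exhausted, so the agent cannot continue silently. The sketch below illustrates that idea under stated assumptions: `send` is a hypothetical callable that performs the request and returns the HTTP status code, and the retry count and delays are placeholders rather than recommended values.

```python
import time
from typing import Callable


class RateLimitExhausted(RuntimeError):
    """Raised when the call is still rate-limited after all retries."""


def call_with_backoff(send: Callable[[], int],
                      max_retries: int = 3,
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send(); on HTTP 429, wait with exponential backoff and retry.

    Returns the first non-429 status code. Raises RateLimitExhausted
    instead of letting the caller carry on without a result.
    """
    for attempt in range(max_retries + 1):
        status = send()
        if status != 429:
            return status
        if attempt < max_retries:
            # Delays grow as 1x, 2x, 4x the base delay.
            sleep(base_delay * 2 ** attempt)
    raise RateLimitExhausted(
        f"still rate-limited after {max_retries} retries"
    )
```

The `sleep` parameter is injectable so the behaviour can be tested without waiting in real time.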

ORCHESTRATION LOOP — unbounded re-plan cycle Medium

The agent entered a 7-step re-plan loop triggered by a low-confidence classification. No guardrail was in place to break the cycle after N failed attempts, resulting in 38 identical LLM calls and $1.12 in token waste.

Evidence chain

[14:33:01] Classification confidence: 0.31 (below 0.40 threshold)
[14:33:04] Re-plan triggered → confidence: 0.29
[14:33:08] Re-plan triggered → confidence: 0.33
[14:33:11] ... (4 more iterations, confidence: 0.28, 0.31, 0.30, 0.29)
[14:33:28] Loop broken by external timeout after 27 seconds.
Fix: Add max_replan_attempts=2 guardrail; fallback to human escalation.
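The suggested guardrail could look roughly like the sketch below. The 0.40 threshold and max_replan_attempts=2 come from the sample finding; the `classify` callable, the function name, and the outcome labels are hypothetical stand-ins for whatever the agent scaffold actually exposes.

```python
from typing import Callable


def classify_with_guardrail(classify: Callable[[], float],
                            threshold: float = 0.40,
                            max_replan_attempts: int = 2) -> tuple[str, list[float]]:
    """Run the classifier, re-planning at most max_replan_attempts times.

    Returns ("proceed", confidences) once confidence reaches the threshold,
    or ("escalate_to_human", confidences) when the re-plan budget runs out.
    """
    confidences = [classify()]  # initial attempt
    replans = 0
    while confidences[-1] < threshold and replans < max_replan_attempts:
        confidences.append(classify())  # one bounded re-plan
        replans += 1
    if confidences[-1] >= threshold:
        return "proceed", confidences
    return "escalate_to_human", confidences
```

Replayed against the confidences in the evidence chain (0.31, 0.29, 0.33, ...), this stops after three classifier calls and escalates instead of looping until an external timeout.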

Want your own report? Once the product launches, upload your logs and get a diagnosis within minutes. Get early access →

Pricing

Early access — locked in for life

Join before launch. Your rate never goes up. Cancel anytime.

Monthly

$29

per month

Unlimited log uploads
Full diagnosis report per upload
Replay fixture download
Regression checklist
Error-budget metric
Email support
Annual billing saves $99/yr

Start monthly

🔒 Early access rate

Best value

Annual

$249

per year — saves $99

Everything in Monthly
Locked at $20.75/month
Priority processing queue
Direct Slack access to Milo
Feature requests prioritised
12 months, cancel anytime
No refunds on annual plans

Get annual access

🔒 Early access rate — lock it in

Early access begins when the product launches. You'll be notified by email and given first access before the product opens to the public.

🔒

No credentials needed

You sanitise your logs before uploading. No API keys, no production access, no PII in our system.

⚡

Minutes, not days

Structured diagnosis within minutes of upload. Full report, fixture, and checklist ready to act on.

📋

You verify, not trust

Every finding is traceable to a specific log entry. Dispute, extend, or confirm — your call.

FAQ

Questions, answered

What does the diagnosis report actually contain?

A structured report with: severity-ranked failure modes, traceable evidence chains linking each finding to your raw log entries, a replay fixture (deterministic test case), a regression checklist, and an error-budget metric. Everything is scoped to the logs you upload.

What counts as "AI agent logs"?

Any text output from an autonomous AI operator setup: LLM API call logs, tool-use traces, cron output, agent scaffold logs, or exported conversation histories. If it records what your agent did, it can be diagnosed.

Is my data handled securely?

Yes. You sanitise your own logs before uploading, so no credentials, production access, or PII are required. Logs are processed and discarded after your report is delivered.

How does early access pricing work?

Early access is $29/month or $249/year, locked in for life. After the full launch, the price increases. Join the waitlist now to secure your rate.

What if no failures are found?

If the diagnostic surfaces zero actionable findings, you'll receive a clean bill of health with recommendations for ongoing monitoring. You still get the full report.

Can I cancel early access at any time?

Yes. Cancel monthly anytime. Annual plans are non-refundable but remain active for 12 months from signup.

Upload your logs. Get a diagnosis.
Know exactly what broke.

Stop guessing why your agent failed.
Start with the report.
