Question 1

What logs do you need from me to begin the audit?

Accepted Answer

Three artefacts, each in JSONL format: (1) your bandit updater outcome log, including NOOP records, (2) trajectory analyzer session grades, and (3) dispatch trigger events. The export should cover at least 200 sessions with a mix of NOOP and non-NOOP outcomes so that the findings are statistically meaningful. Anonymized or sampled logs are acceptable.
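Before exporting, you can check whether an outcome log meets the thresholds described in this answer. The following is a minimal sketch, not part of the audit tooling: the field names `session_id` and `outcome`, and the literal value `"NOOP"`, are assumptions about your log schema and should be replaced with whatever your bandit updater actually writes.

```python
import json
from collections import Counter

MIN_SESSIONS = 200  # session threshold stated in the answer above


def summarize_outcomes(jsonl_lines, session_key="session_id", outcome_key="outcome"):
    """Count distinct sessions and NOOP vs non-NOOP records.

    The default field names are hypothetical; pass your own schema's keys.
    """
    sessions = set()
    counts = Counter()
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the export
        record = json.loads(line)
        sessions.add(record[session_key])
        # Assumes NOOP records carry the literal string "NOOP" as their outcome.
        kind = "noop" if record[outcome_key] == "NOOP" else "non_noop"
        counts[kind] += 1
    return {
        "sessions": len(sessions),
        "noop": counts["noop"],
        "non_noop": counts["non_noop"],
    }


def is_sufficient(summary, min_sessions=MIN_SESSIONS):
    """True when the export has enough sessions and both outcome kinds."""
    return (
        summary["sessions"] >= min_sessions
        and summary["noop"] > 0
        and summary["non_noop"] > 0
    )
```

To use it, pass an open file handle (or any iterable of lines) to `summarize_outcomes` and feed the result to `is_sufficient`. The check only covers session count and outcome mix; it says nothing about whether the grades or trigger events line up with those sessions.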

Question 2

My team is mid-sprint. Can this audit run in parallel without disrupting production?

Accepted Answer

Yes. The audit is purely analytical and operates only on exported log snapshots, so it never touches your production pipeline. The deliverables (replay fixture, reconciliation playbook) are read-only tools, and each is validated in a staging environment before any production change is made.

Question 3

The replay fixture is in Python — what if our stack uses a different runtime?

Accepted Answer

The Python fixture is a plain-logic reference implementation portable to Node.js, Go, or Rust within hours. The implementation guide includes pseudo-code for the synchronous handshake pattern so you can adapt to your language of choice without waiting for a custom implementation.
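For orientation, a synchronous handshake in its generic form can be sketched as below. This is not the guide's pseudo-code or the fixture itself: the `Handshake` class, the `updater` worker, and the session names are all hypothetical, and the sketch assumes a single sender and a single receiver. It shows only the core idea, that the sender does not move on until the receiver confirms it has applied the previous item.

```python
import queue
import threading


class Handshake:
    """One-slot synchronous handoff (hypothetical; single sender, single receiver).

    send() blocks until the receiver has called acknowledge().
    """

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._ack = threading.Event()

    def send(self, item, timeout=5.0):
        self._ack.clear()
        self._slot.put(item, timeout=timeout)
        # Wait for the receiver to confirm it has finished with the item.
        if not self._ack.wait(timeout):
            raise TimeoutError("receiver did not acknowledge")

    def receive(self, timeout=5.0):
        return self._slot.get(timeout=timeout)

    def acknowledge(self):
        self._ack.set()


def run_demo():
    """Send three items and record what had been applied after each send."""
    hs = Handshake()
    applied = []

    def updater():
        for _ in range(3):
            item = hs.receive()
            applied.append(item)  # stand-in for the real update step
            hs.acknowledge()

    worker = threading.Thread(target=updater)
    worker.start()
    observed = []
    for session in ("s1", "s2", "s3"):
        hs.send(session)
        # Because send() waited for the acknowledgement, `applied` is up to date.
        observed.append((session, list(applied)))
    worker.join()
    return observed
```

The same shape maps onto a channel plus an acknowledgement channel in Go, or a promise that resolves on acknowledgement in Node.js, which is why the pattern ports without much language-specific work.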

Question 4

What if the root cause turns out to be something other than the race condition?

Accepted Answer

You receive the artefacts regardless of which failure mode is confirmed. If the data points to block-throttle misclassification or trajectory grade inflation instead, the reconciliation playbook and implementation guide target that confirmed mechanism. The incident report documents the actual root cause.

Autonomous Agent Session Fidelity Audit Sprint
