Run health: a label-free verdict on every agent run
Observability shows you traces. Run health gives you a verdict. Every agent run now gets a 0 to 100 health score and a letter grade, read straight from the signals already in your receipts. No eval set, no labels, no LLM.
A score you can act on. Clean runs auto-pass. Clearly broken runs (a coordination loop, a blocked call, a high or critical finding) auto-flag. Only the ambiguous middle is routed to a human. You triage by exception instead of reading every trace.
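As a rough illustration of that routing, here is a minimal Python sketch. It is not the SDK's API: the signal names, the `triage` function, and the three outcome strings are hypothetical, and it assumes "clean" means no signal fired at all.

```python
# Hypothetical signal names for the conditions the text says auto-flag a run.
AUTO_FLAG_SIGNALS = frozenset({
    "coordination_loop",
    "blocked_call",
    "high_finding",
    "critical_finding",
})


def triage(signals):
    """Route a run to 'pass', 'flag', or 'review' from its fired signals.

    Assumption: a clean run is one where no signal fired.
    """
    fired = set(signals)
    if not fired:
        return "pass"    # clean run: auto-pass
    if fired & AUTO_FLAG_SIGNALS:
        return "flag"    # clearly broken: auto-flag
    return "review"      # ambiguous middle: route to a human


if __name__ == "__main__":
    print(triage([]))
    print(triage(["blocked_call", "medium_finding"]))
    print(triage(["repeated_tool_call"]))
```

Under these assumptions, only runs with minor signals and no auto-flag condition end up in the human queue.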
Every point off is explained. The score is 100 minus the sum of named signal penalties: coordination loop (45), blocked call (25), severe finding (20 to 30), no-progress cycle (20), step blowup (15), repeated tool call (12), medium finding (10). A poor grade tells you what went wrong, not just that something did. The dashboard shows each signal with its penalty and detail.
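The arithmetic can be sketched in a few lines of Python. The penalty values come from the list above; everything else is an assumption made for illustration: the signal key names, the `score_run` function, counting each signal once per run, splitting the 20 to 30 severe range as 20 for high and 30 for critical, and flooring the result at 0 so it stays in the 0 to 100 range.

```python
# Penalties as listed in the text. The key names are hypothetical.
PENALTIES = {
    "coordination_loop": 45,
    "blocked_call": 25,
    "no_progress_cycle": 20,
    "step_blowup": 15,
    "repeated_tool_call": 12,
    "medium_finding": 10,
    # Assumption: the text gives only a 20 to 30 range for severe findings;
    # this sketch maps high to 20 and critical to 30.
    "high_finding": 20,
    "critical_finding": 30,
}


def score_run(signals):
    """Return (score, breakdown) for the signals that fired on a run.

    Assumptions: each distinct signal is penalised once, and the score
    is floored at 0.
    """
    # Sort so the breakdown order is stable regardless of input order.
    fired = sorted(set(signals))
    breakdown = [(name, PENALTIES[name]) for name in fired]
    total = sum(penalty for _, penalty in breakdown)
    return max(0, 100 - total), breakdown


if __name__ == "__main__":
    score, breakdown = score_run(["repeated_tool_call", "medium_finding"])
    print(score)       # 78, i.e. 100 - (12 + 10)
    print(breakdown)
```

The breakdown is the point: each entry pairs a named signal with the points it cost, which mirrors the per-signal view the dashboard shows.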
Deterministic, and the same everywhere. The scorer is pure: no network, no clock, no model call. It runs offline in the SDK with no account (prova-local), and on /dashboard/health once you ingest your receipts. The Python and Node ports match the server. Free on every plan.
Run health answers whether the run was healthy, not whether the final answer was semantically correct. That boundary is deliberate: a semantic judge would need an LLM, and this layer stays deterministic. Full detail in the run health guide.