Progressive delivery

Don't let the bad deploy take traffic.

The release compare answers “did the deploy make it worse?” on demand. A rollout makes it a control loop. It watches a candidate release, ramps a canary on a clean verdict, and rolls back the moment a gated metric regresses. Every promote and rollback is a signed receipt in the Audit Vault.

What a rollout is

Prova does not route your traffic. A rollout is the verdict authority your CD system is gated on. The deterministic engine is the same one behind prova-eval: Wilson and Newcombe intervals, a seeded bootstrap, and a minimum run count so a thin sample never reads as a regression. No LLM and no network on the critical path, so a promote or rollback is reproducible.
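To make the statistics concrete, here is a minimal sketch of the three ingredients named above: a Wilson interval, a Newcombe interval on the difference between two pass rates, and a seeded bootstrap. It is not Prova's engine. The function names, the 95% level, the default of 30 runs, and the "upper bound below zero" rule for a higher-is-better metric are all assumptions made for illustration.

```python
import math
import random
from statistics import mean


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return max(0.0, center - half), min(1.0, center + half)


def newcombe_diff_interval(s_cand, n_cand, s_base, n_base, z=1.96):
    """Newcombe hybrid score interval for (candidate rate - baseline rate)."""
    p_c, p_b = s_cand / n_cand, s_base / n_base
    l_c, u_c = wilson_interval(s_cand, n_cand, z)
    l_b, u_b = wilson_interval(s_base, n_base, z)
    diff = p_c - p_b
    lower = diff - math.sqrt((p_c - l_c) ** 2 + (u_b - p_b) ** 2)
    upper = diff + math.sqrt((u_c - p_c) ** 2 + (p_b - l_b) ** 2)
    return lower, upper


def pass_rate_verdict(s_cand, n_cand, s_base, n_base, min_runs=30):
    """Hypothetical gate for a higher-is-better pass rate."""
    floor = max(min_runs, 1)
    if n_cand < floor or n_base < floor:
        return "insufficient"  # a thin sample never reads as a regression
    _, upper = newcombe_diff_interval(s_cand, n_cand, s_base, n_base)
    # Regressed only when the whole interval sits below zero.
    return "regressed" if upper < 0 else "clean"


def bootstrap_diff_interval(cand, base, seed=0, resamples=2000, alpha=0.05):
    """Percentile bootstrap for a difference in means; the fixed seed makes it repeatable."""
    rng = random.Random(seed)
    diffs = sorted(
        mean(rng.choices(cand, k=len(cand))) - mean(rng.choices(base, k=len(base)))
        for _ in range(resamples)
    )
    lo_idx = int(alpha / 2 * resamples)
    hi_idx = int((1 - alpha / 2) * resamples) - 1
    return diffs[lo_idx], diffs[hi_idx]
```

Because the bootstrap draws from a generator seeded with a fixed value, two calls on the same inputs return the same interval. That is the property that makes a verdict reproducible.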

  • Shadow mode (the default) is advisory. It signs a verdict on every tick and takes no action. Watch the auto-pilot make the right calls before you trust it.
  • Canary mode fires the rollout.promote and rollout.rollback webhooks on each transition, so your CD system shifts the traffic weight on the verdict.

The control loop

Each tick re-scores the cohorts and decides one thing:

  • Roll back the instant a gated metric regresses against the baseline (the confidence interval excludes zero in the wrong direction), at any weight. Terminal.
  • Promote when the verdict is clean and the current step has accumulated min_runs fresh candidate runs. That advances the canary to the next weight; reaching the last step is full promotion.
  • Hold otherwise: not enough data yet, or not enough fresh runs at this step.

A stepped ramp (the default is 5,25,50,100) re-proves the candidate at each weight before widening it. Set require_pairwise to also gate on the LLM pairwise judge; it fails safe, so a required-but-unavailable judge holds the rollout rather than auto-promoting.
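The tick logic above can be sketched as a pure function. This is an illustration of the rules as described, not Prova's implementation. The Decision type, the verdict strings ("regressed", "clean", anything else meaning not enough data), and the choice to hold whenever a required judge is anything other than clean are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Decision:
    action: str             # "rollback", "promote", or "hold"
    weight: Optional[int]   # canary weight after the decision; None on rollback
    terminal: bool          # True once the rollout is finished


def decide(steps, step_index, fresh_runs, min_runs, verdict,
           require_pairwise=False, pairwise_clean=None):
    """Decide one tick for a rollout sitting at steps[step_index]."""
    current = steps[step_index]

    # A regression rolls back at any weight, regardless of run counts.
    if verdict == "regressed":
        return Decision("rollback", None, True)

    # Not enough data yet, or not enough fresh runs at this step.
    if verdict != "clean" or fresh_runs < min_runs:
        return Decision("hold", current, False)

    # Fail safe: a required judge that is unavailable (None) holds the rollout.
    # Treating a non-clean judge result as a hold is an assumption of this sketch.
    if require_pairwise and pairwise_clean is not True:
        return Decision("hold", current, False)

    # Clean verdict with enough fresh runs: advance to the next weight.
    next_index = min(step_index + 1, len(steps) - 1)
    return Decision("promote", steps[next_index], next_index == len(steps) - 1)
```

With steps (5, 25, 50, 100), a clean tick at 5% with 30 fresh runs and min_runs of 30 promotes to 25%, and a clean tick at 50% promotes to 100% and ends the rollout.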

Tag your releases

A rollout compares two releases, so every run has to carry one. Set PROVA_RELEASE (a commit SHA or version) in your deploy and the SDK tags every receipt with it. To pin per-input pairwise comparisons, also set a stable probe_id on the runs you replay.
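A small sketch of both halves of that advice, assuming your deploy is launched from a Python script. PROVA_RELEASE is the variable named above; the helper names and the hashing scheme for probe_id are hypothetical, and how a probe_id is attached to a run depends on the SDK and is not shown.

```python
import hashlib
import json
import os


def deploy_env(release, base_env=None):
    """Return a copy of the environment with PROVA_RELEASE set."""
    if not release or not release.strip():
        raise ValueError("release must be a non-empty commit SHA or version")
    env = dict(os.environ if base_env is None else base_env)
    env["PROVA_RELEASE"] = release.strip()
    return env


def stable_probe_id(probe_input):
    """Hypothetical scheme: hash the canonical JSON of the replayed input.

    The same input always maps to the same id, whatever the key order.
    """
    canonical = json.dumps(probe_input, sort_keys=True, separators=(",", ":"))
    return "probe-" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
```

You would pass the result as the env argument when starting the service, for example subprocess.run(["./deploy.sh"], env=deploy_env(sha)), where ./deploy.sh stands in for your own deploy step.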

Gate a deploy in CI

The prova-rollout CLI (in @cobound/prova-sdk) creates a canary and blocks until it settles. The watch subcommand exits 0 on a full promotion, 1 on a rollback, and 2 on a timeout, so a rollback or a timeout fails the pipeline step and gates the deploy automatically.

# Declare a canary for the new build against the last good release.
ID=$(prova-rollout create \
  --app-id claims-agent \
  --baseline v36 \
  --candidate "$GIT_SHA" \
  --mode canary \
  --steps 5,25,50,100 \
  --min-runs 30 \
  --json | jq -r .id)

# Block until Prova promotes or rolls it back. A rollback (exit 1) or a timeout (exit 2) fails the job.
prova-rollout watch --id "$ID" --interval 30 --timeout 3600

Auth is PROVA_API_KEY with the rollout.manage permission. Host override is PROVA_BASE_URL.
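If your pipeline step is a script rather than a shell one-liner, the exit codes of watch can be mapped to named outcomes. A minimal sketch, assuming prova-rollout is on the PATH. The flags are the ones shown above; the outcome labels and function names are made up for this example.

```python
import subprocess

# Exit codes documented for `prova-rollout watch`.
OUTCOMES = {0: "promoted", 1: "rolled_back", 2: "timed_out"}


def watch_command(rollout_id, interval=30, timeout=3600):
    """Build the argv for `prova-rollout watch`."""
    return [
        "prova-rollout", "watch",
        "--id", rollout_id,
        "--interval", str(interval),
        "--timeout", str(timeout),
    ]


def classify_exit(code):
    """Map an exit code to an outcome label; anything unexpected is an error."""
    return OUTCOMES.get(code, "error")


def gate(rollout_id):
    """Block on the rollout and return (outcome, exit_code) for the CI step."""
    code = subprocess.run(watch_command(rollout_id)).returncode
    return classify_exit(code), code
```

Exit the step with the returned code so that anything other than a full promotion still fails the job.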

Drive it from your own controller

There is no background worker. A rollout advances only when you tick it, so it runs the same on a managed deploy and in an air-gapped one. Wire one of these into your CD reconcile loop or a cron:

  • POST /api/v1/rollouts/:id/evaluate ticks one rollout and returns the fresh state, the decision, and the full compare verdict. HTTP 422 when this tick rolled the candidate back, 200 otherwise.
  • POST /api/v1/rollouts/evaluate ticks every active rollout for the org in one call. The single cron entrypoint.
  • GET /api/v1/rollouts/:id reads the current state without a tick, for a cheap poll.
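A minimal sketch of ticking one rollout from a controller, using only the standard library. The path and the 422 and 200 semantics come from the list above. The Bearer authorization scheme, the empty request body, and the assumption that the response body is JSON are guesses for illustration; check them against your deployment.

```python
import json
import urllib.error
import urllib.request


def build_tick_request(base_url, api_key, rollout_id):
    """Build the POST for /api/v1/rollouts/:id/evaluate."""
    url = f"{base_url.rstrip('/')}/api/v1/rollouts/{rollout_id}/evaluate"
    return urllib.request.Request(
        url,
        data=b"",
        method="POST",
        # Assumption: the API key is sent as a Bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
    )


def rolled_back(status):
    """HTTP 422 means this tick rolled the candidate back."""
    return status == 422


def tick(base_url, api_key, rollout_id, timeout=30):
    """Tick once and return (status, parsed_body)."""
    request = build_tick_request(base_url, api_key, rollout_id)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, json.loads(response.read() or b"{}")
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx; a 422 is an expected outcome here, not a failure.
        if err.code == 422:
            return 422, json.loads(err.read() or b"{}")
        raise
```

Calling tick on a schedule is enough to drive a rollout. The 422 case is handled separately because urllib surfaces it as an exception even though it carries a valid decision.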

In canary mode each transition fires a signed webhook your CD receiver acts on:

// POST to your rollout.promote webhook
{
  "event": "rollout.promote",
  "data": {
    "rollout_id": "…",
    "candidate_release": "abc123",
    "baseline_release": "v36",
    "weight": 25,             // set the canary to this %
    "terminal": false,        // true when fully promoted (100%)
    "decision_event_id": "…", // the signed receipt to verify
    "reason": "clean verdict at 5%, advancing canary to 25%"
  }
}

Subscribe a webhook to rollout.promote and rollout.rollback in your webhook settings. Deliveries are HMAC-signed with the hook secret. Verify the X-Prova-Signature header before acting.
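A sketch of a receiver that verifies the signature before acting. The source states only that deliveries are HMAC-signed with the hook secret and that the signature arrives in X-Prova-Signature. The use of SHA-256, a hex-encoded digest over the raw request body, and a rollback payload shaped like the promote payload are assumptions of this example.

```python
import hashlib
import hmac
import json


def verify_signature(secret, raw_body, header_value):
    """Constant-time check of X-Prova-Signature (assumed hex HMAC-SHA256 of the raw body)."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), header_value.strip().encode("utf-8"))


def handle_delivery(secret, raw_body, header_value):
    """Return the action for the CD system, or raise if the delivery is not trusted."""
    if not verify_signature(secret, raw_body, header_value):
        raise PermissionError("bad X-Prova-Signature, ignoring delivery")

    # Parse only after the signature checks out.
    payload = json.loads(raw_body)
    event, data = payload["event"], payload["data"]

    if event == "rollout.promote":
        return ("set_weight", data["weight"])
    if event == "rollout.rollback":
        # Assumption: the rollback payload also names the candidate release.
        return ("rollback", data.get("candidate_release"))
    return ("ignore", None)
```

Verify against the raw bytes as received. Re-serializing the parsed JSON can change whitespace or key order and break the comparison.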

The receipt is the proof

Every promote and rollback is written to the Audit Vault as a signed rollout_decision receipt: the decision, the reason, the gated metrics, and the full compare verdict, with the same Ed25519 signature any other receipt carries. So “Prova rolled this deploy back” is not a log line you have to trust. It is a tamper-evident record you can verify offline, the same way an auditor would.

Manage rollouts in the dashboard.