Gateway check
Block bad calls before they execute.
The Audit Vault is the after-the-fact record. The gateway-check endpoint is the before-the-fact decision. Same engine, same policies, same detectors -- flipped from observe to enforce.
How it works
Call POST /api/v1/gateway/check with the same event shape you'd send to /api/v1/audit/ingest, but BEFORE you make the model call. The response includes an explicit action field:
- allow: no enabled policy matched. Proceed.
- alert: a policy matched but its action is alert, not block. Proceed and let the receipt and alert flow handle it.
- block: at least one enabled policy returned action=block. Do not proceed. Return the findings to the upstream caller.
Allowed and blocked decisions are both persisted to your Audit Vault, so you have a complete trail of what was attempted, not just what executed.
Drop-in pattern
import os

import openai
import requests


class PolicyBlocked(Exception):
    """Raised when the gateway check returns action == "block"."""


def safe_chat_completion(messages, model="gpt-4o"):
    # Ask Prova for a decision BEFORE making the model call.
    check = requests.post(
        "https://prova.cobound.dev/api/v1/gateway/check",
        headers={"Authorization": f"Bearer {os.environ['PROVA_API_KEY']}"},
        json={
            "kind": "model_call",
            "model": {"provider": "openai", "name": model},
            "source": {"app_id": "support-bot", "environment": "production"},
            "payload": {"messages": messages},
        },
        timeout=2.0,
    ).json()
    # "allow" and "alert" both proceed; only "block" stops the call.
    if check["action"] == "block":
        raise PolicyBlocked(check["findings"])
    return openai.chat.completions.create(model=model, messages=messages)

Total added latency for the allow path is ~80ms (one round-trip to Prova plus the synchronous policy and detector pass). Blocking responses have the same shape, with action: "block" plus the matched findings. You decide how to surface them to your upstream caller.
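Since surfacing block findings is left to the caller, one option is to turn a blocking response into a compact error body for the upstream client. The sketch below is illustrative only: the function name blocked_error_body and the "blocked_by_policy" label are hypothetical, and dropping the details field (which can echo the matched pattern) is an assumption about what you may not want to expose, not a Prova requirement.

```python
def blocked_error_body(check):
    """Build a JSON-serializable error body from a gateway-check response.

    Hypothetical helper: only the field names read from `check` come from
    the documented response shape; the output layout is an assumption.
    """
    findings = [
        {
            "detector": finding.get("detector"),
            "severity": finding.get("severity"),
            "summary": finding.get("summary"),
            "remediation": finding.get("remediation"),
        }
        for finding in check.get("findings", [])
    ]
    return {
        "error": "blocked_by_policy",  # hypothetical label, not part of the API
        "event_id": check.get("event_id"),
        "findings": findings,
    }
```

A caller might return this body with an HTTP 4xx status when PolicyBlocked is raised, keeping the event_id so the decision can be looked up in the Audit Vault.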
What gets evaluated
The gateway-check endpoint runs the same engine as ingest:
- All enabled policies for your org (built-in + customer-authored).
- All enabled inline detectors (prompt injection, PII leak).
- The highest action across matches wins: block > alert > allow.
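The precedence rule in the last bullet can be sketched in a few lines. This is a client-side illustration of the ordering block > alert > allow, not Prova's actual implementation; ACTION_RANK and resolve_action are hypothetical names.

```python
# Rank of each action; a higher rank overrides a lower one.
ACTION_RANK = {"allow": 0, "alert": 1, "block": 2}


def resolve_action(matched_actions):
    """Return the highest-ranked action, or "allow" when nothing matched."""
    return max(matched_actions, key=ACTION_RANK.__getitem__, default="allow")
```

For example, two alert matches and one block match resolve to "block", and an empty list of matches resolves to "allow".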
External-API detectors (coordination loops) do not run on the gateway path because they require multi-step history. Use them on the post-execution ingest surface instead.
Response shape
{
  "mode": "authenticated",
  "action": "block",
  "blocked": true,
  "event_id": "9f4a2e71d03bcafe...",
  "findings": [
    {
      "detector": "policy:secret_in_prompt",
      "verdict": "policy_violation",
      "severity": "critical",
      "summary": "Secret credential detected in event payload.",
      "details": { "pattern": "\\bsk-[A-Za-z0-9]{20,}\\b" },
      "remediation": "Rotate the exposed credential immediately..."
    }
  ],
  "receipt": { /* full signed AIDecisionEvent */ }
}

Tuning what blocks vs alerts
Each policy ships with a default action. Override it per-org from the policy dashboard. Common patterns:
- Run in alert-only mode for the first week to baseline. Find out what your system would actually have blocked, without breaking traffic.
- Promote to block per policy once you trust it (data-protection ones tend to graduate fastest; safety ones can stay alert-mode in some workloads).
- Run a staging environment with all policies in block mode and production in alert mode. Diff the receipts to find regressions before they ship.
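To make the alert-only baseline concrete, here is a rough sketch that tallies findings per detector across gateway-check responses collected during the baseline period. It assumes you have kept those responses yourself (for example, by logging them in your client); tally_findings is a hypothetical helper, and only the action, findings, and detector field names come from the response shape documented above.

```python
from collections import Counter


def tally_findings(responses):
    """Count findings per detector across non-allow gateway responses."""
    counts = Counter()
    for response in responses:
        if response.get("action") == "allow":
            continue  # nothing matched, so there is nothing to tally
        for finding in response.get("findings", []):
            counts[finding.get("detector", "unknown")] += 1
    return counts
```

Detectors with high counts and few false positives are candidates to promote to block first.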
Latency budget + failure mode
The gateway-check endpoint runs entirely synchronously inside your request path. Typical latency:
- Network: ~30-60ms round-trip to Vercel edge.
- Policy + detector evaluation: < 5ms for the full library against a typical payload.
- Receipt signing: < 1ms (Ed25519).
- Audit persistence: best-effort, async-tolerant.
Decide upfront how to handle Prova being unavailable: fail-open (proceed with the model call) or fail-closed (refuse). Most teams start fail-open in dev and fail-closed in regulated production paths. Either way, set a tight timeout (~2 seconds) on your client.
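The fail-open versus fail-closed choice can be made explicit in a small wrapper. The following is a minimal sketch using only the standard library, assuming the endpoint and Bearer-token header shown earlier; post_check and check_with_fallback are hypothetical names, and treating any network error, timeout, or malformed JSON as "Prova unavailable" is an assumption you may want to narrow.

```python
import json
import os
import urllib.request

GATEWAY_URL = "https://prova.cobound.dev/api/v1/gateway/check"


def post_check(event, timeout=2.0):
    """POST the event to the gateway-check endpoint and return the parsed JSON."""
    request = urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(event).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['PROVA_API_KEY']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def check_with_fallback(event, fail_open, check=post_check):
    """Return (proceed, decision); decision is None when the check failed.

    fail_open=True proceeds with the model call when Prova is unreachable;
    fail_open=False refuses it.
    """
    try:
        decision = check(event)
    except (OSError, ValueError):
        # OSError covers URLError, HTTPError, and socket timeouts;
        # ValueError covers a response body that is not valid JSON.
        return fail_open, None
    return decision.get("action") != "block", decision
```

The check parameter exists so the network call can be swapped for a stub in tests. A missing PROVA_API_KEY is deliberately not caught, so a misconfiguration surfaces as an error instead of silently failing open.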