Gateway check

Block bad calls before they execute.

The Audit Vault is the after-the-fact record. The gateway-check endpoint is the before-the-fact decision. Same engine, same policies, same detectors -- flipped from observe to enforce.

How it works

Call POST /api/v1/gateway/check with the same event shape you'd send to /api/v1/audit/ingest, but BEFORE you make the model call. The response includes an explicit action field:

allow: no enabled policy matched. Proceed.
alert: a policy matched but its action is alert, not block. Proceed and let the receipt and alert flow handle it.
block: at least one enabled policy returned action=block. Do not proceed. Return the findings to the upstream caller.

Allowed and blocked decisions are both persisted to your Audit Vault, so you have a complete trail of what was attempted, not just what executed.

Drop-in pattern

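import os

import openai
import requests


class PolicyBlocked(Exception):
    """Raised when the gateway check returns action == "block"."""

    def __init__(self, findings):
        super().__init__("Blocked by policy")
        self.findings = findings


def safe_chat_completion(messages, model="gpt-4o"):
    # Ask Prova for a decision before making the model call.
    check = requests.post(
        "https://prova.cobound.dev/api/v1/gateway/check",
        headers={"Authorization": f"Bearer {os.environ['PROVA_API_KEY']}"},
        json={
            "kind": "model_call",
            "model": {"provider": "openai", "name": model},
            "source": {"app_id": "support-bot", "environment": "production"},
            "payload": {"messages": messages},
        },
        timeout=2.0,
    ).json()

    # Only "block" stops the call; "allow" and "alert" both proceed.
    if check["action"] == "block":
        raise PolicyBlocked(check["findings"])

    return openai.chat.completions.create(model=model, messages=messages)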

Total added latency for the allow path is ~80ms (one round-trip to Prova + the synchronous policy + detector pass). Blocking responses have the same shape with action: "block" plus the matched findings. You decide how to surface them to your upstream caller.

What gets evaluated

The gateway-check endpoint runs the same engine as ingest:

All enabled policies for your org (built-in + customer-authored).
All enabled inline detectors (prompt injection, PII leak).
The highest action across matches wins: block > alert > allow.
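
The precedence rule can be illustrated in a few lines. This is a sketch of the stated ordering only, not Prova's engine code; `ACTION_RANK` and `resolve_action` are hypothetical names.

```python
# Precedence as stated above: block > alert > allow.
ACTION_RANK = {"allow": 0, "alert": 1, "block": 2}


def resolve_action(matched_actions):
    """Return the highest-precedence action among the matches.

    An empty list means nothing matched, which resolves to "allow".
    """
    return max(matched_actions, key=ACTION_RANK.__getitem__, default="allow")
```

So a single block-mode match outranks any number of alert-mode matches on the same event.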

External-API detectors (coordination loops) do not run on the gateway path because they require multi-step history. Use them on the post-execution ingest surface instead.

Response shape

{
  "mode": "authenticated",
  "action": "block",
  "blocked": true,
  "event_id": "9f4a2e71d03bcafe...",
  "findings": [
    {
      "detector": "policy:secret_in_prompt",
      "verdict": "policy_violation",
      "severity": "critical",
      "summary": "Secret credential detected in event payload.",
      "details": { "pattern": "\\bsk-[A-Za-z0-9]{20,}\\b" },
      "remediation": "Rotate the exposed credential immediately..."
    }
  ],
  "receipt": { /* full signed AIDecisionEvent */ }
}
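
One way to surface findings upstream is to flatten them into log-friendly lines. The sketch below assumes the response has already been parsed into a dict with the fields shown above; `summarize_findings` is a hypothetical helper, not part of any Prova SDK.

```python
def summarize_findings(check: dict) -> list[str]:
    """Build one log-friendly line per finding in a gateway-check response."""
    lines = []
    for finding in check.get("findings", []):
        lines.append(
            f"[{finding['severity']}] {finding['detector']}: {finding['summary']}"
        )
    return lines


# Sample trimmed from the response shape above.
sample = {
    "action": "block",
    "blocked": True,
    "findings": [
        {
            "detector": "policy:secret_in_prompt",
            "verdict": "policy_violation",
            "severity": "critical",
            "summary": "Secret credential detected in event payload.",
        }
    ],
}

for line in summarize_findings(sample):
    print(line)
```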

Tuning what blocks vs alerts

Each policy ships with a default action. Override it per-org from the policy dashboard. Common patterns:

Run in alert-only mode for the first week to baseline. Find out what your system would actually have blocked, without breaking traffic.
Promote to block per policy once you trust it (data-protection ones tend to graduate fastest; safety ones can stay alert-mode in some workloads).
Run a staging environment with all policies in block mode and production in alert mode. Diff the receipts to find regressions before they ship.
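
During an alert-only baseline, a simple tally shows which policies would have fired most often. The sketch below assumes you have collected gateway-check responses as a list of dicts in the response shape documented above; `baseline_by_detector` is a hypothetical helper, and how you collect the responses is up to you.

```python
from collections import Counter


def baseline_by_detector(responses):
    """Count findings per detector across collected gateway-check responses."""
    counts = Counter()
    for check in responses:
        if check.get("action") == "allow":
            continue  # nothing matched, so there is nothing to count
        for finding in check.get("findings", []):
            counts[finding["detector"]] += 1
    return counts
```

Calling `.most_common()` on the result gives a ranked list to review before promoting any policy to block.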

Latency budget + failure mode

The gateway-check endpoint runs entirely synchronously inside your request path. Typical latency:

Network: ~30-60ms round-trip to Vercel edge.
Policy + detector evaluation: < 5ms for the full library against a typical payload.
Receipt signing: < 1ms (Ed25519).
Audit persistence: best-effort, async-tolerant.

Decide upfront how to handle Prova being unavailable: fail-open (proceed with the model call) or fail-closed (refuse). Most teams start fail-open in dev and fail-closed in regulated production paths. Either way, set a tight timeout (~2 seconds) on your client.
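
The fail-open versus fail-closed choice can live in one small wrapper. This is a minimal sketch under stated assumptions: `gateway_action` is a hypothetical name, and `check_fn` stands for whatever function you write to perform the POST with a tight timeout and return the parsed JSON as a dict.

```python
def gateway_action(check_fn, event, fail_open):
    """Return the gateway action, applying a fail-open or fail-closed policy.

    check_fn(event) is a caller-supplied function that performs the POST with
    a tight timeout and returns the parsed JSON response as a dict.
    """
    try:
        return check_fn(event)["action"]
    except (OSError, ValueError, KeyError):
        # OSError covers timeouts and connection failures (requests'
        # RequestException derives from it); ValueError covers malformed
        # JSON; KeyError covers a response without an "action" field.
        return "allow" if fail_open else "block"
```

Passing `fail_open` as an explicit argument keeps the decision visible at each call site, so a dev path and a regulated production path can share the same wrapper with different settings.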
