Trust protocol for the AI-assisted SDLC

Trust is a trajectory, not a checkpoint.

Coding agents earn autonomy from evidence, one verified change at a time — and human review recedes exactly where the evidence says it can, with a receipt for every merge you no longer read. An open, language- and transport-agnostic protocol. Zero runtime dependencies.

Read the quickstart → github.com/yuvalraz/recede SPEC.md

trust(code-agent, code.fix) · ~30 verified changes

trust score checkpoint fires reverted outcome receded / autonomous

One coding agent on code.fix the whole way across. Day 1, every change is reviewed — Verify (CI, tests, types green) and Validate (it did what the ticket asked, at quality) — and the diamonds thin out as clean fixes compound until review has receded and low-risk fixes merge autonomously. A schema.migrate still gates every time — irreversible actions never recede. Then an autonomous fix is reverted in staging, trust drops below the tier floor, and review snaps back automatically. No one edited a rule; the evidence moved.

The inversion

Everyone else fights AI-PR review fatigue by shipping more to watch.

You put an agent in the dev loop — it plans, writes code, adds tests, opens PRs. No human can meaningfully read 40 agent PRs a day, so review collapses into rubber-stamping (review theater) or bottlenecking (the agent's speed is wasted). The root cause is a trust-calibration bug: trust today is mis-attributed — one global "do I trust the AI?" verdict, when trusted to fix a flaky test and trusted to run a migration are different questions — and mis-calibrated — granted by feeling, not evidence. The usual answer is a bigger dashboard, a 0–1000 score, more alerts. Wrong direction.

Fix attribution (per Actor × TaskType) and calibration (evidence + confidence), and small daily verified wins compound into earned, bounded autonomy — so review recedes exactly where warranted.

The reflex

Measure trust → build a dashboard → watch more.

invert →

Recede

Measure trust per capability → let review recede → read fewer PRs, on purpose — and snap back the instant a merge regresses.

The model

Five bullets, and you have the whole thing.

Trust is scoped

Held per (Actor, TaskType) — never one global agent score. Trusted on code.fix ≠ trusted on code.migrate. Review recedes in one lane while staying tight in another.

Every action emits a Warrant

An append-only, hash-linked chain: intent → action → checks → outcome. Trust is a sum over receipts you can open. No Warrant, no trust movement.

V&V is first-class and split

Verify = did it do the thing right (CI, tests, types green). Validate = did it do the right thing — the change matches the ticket, at quality. Conflating "tests are green" with "it did what I asked" is how confidently-wrong code merges.

The Gate is a pure function

gate(trust, risk, policy) → checkpoint or autonomous. Same inputs, same decision, always replayable. That makes "review recedes as trust is earned" a provable property, not a vibe.

Asymmetric & bounded

Earned slowly, lost fast. Decays with staleness and drift. Irreversible actions — code.migrate, prod deploys — keep a human checkpoint at every tier. Earned autonomy is bounded, never unbounded.

Replay proves it

replay(warrants, policy) reconstructs the exact trust state from the receipts + pinned policy. "Why did this merge unattended?" is answered by pointing at the chain.

I1 scope isolation I2 replay reconstructability I3 irreversible floor · never_recede I4 trust can decrease I5 confidence cap I6 policy digest on every decision I7 gate / update / replay purity

Why it's different

It's not another scorecard. It's a layer above them.

Recede sits above interop (MCP/A2A), eval/observability tools, and static guardrails — consuming their signals as evidence rather than replacing them.

Incumbent	What it does	Recede's distinct axis
Eval / observability tools	Score each run in isolation	Trust has memory — carried forward per capability
Static guardrails / control standards	Apply the same checkpoints uniformly, forever	Review is proportional to earned evidence
Governance promotion-ladders	Earned, but coarse HR-style tiers + calendar time + sign-off	Continuous & machine-verifiable, per-action
Agent identity / A2A	Establish who the agent is	Not who it is — what it has earned

Quickstart

The whole framework is one call: wrap the function you already have.

Reference implementation — TypeScript primary, Python mirror. The gate is implicit: there is no if (needsReview) in your code. run() decides. Your existing CI, tests, and PR reviews become the evidence.

code-agent.ts

const r = new Recede({ ledger: new MemoryLedger(), checkpoint: consoleCheckpoint(), policy });

// Verify = did it right (CI green).  Validate = did the right thing (intent-fit).
const ciGreen  = check.verify("ci", io => io.output.ci === "green");
const intentOK = check.validate("intent-fit", async io => ({ ok: await reviewMatchesIntent(io.intent, io.diff), confidence: 0.8 }));

const outcome = await r.run(() => agent.implement(ticket), {
  actor:    "code-agent",
  taskType: "code.fix",
  intent:   `Fix ${ticket.id}: ${ticket.title}`,
  risk:     "reversible.low",
  checks:   [ciGreen, intentOK],
});

// The gate is IMPLICIT — run() decides whether a human is asked.
outcome.result;      // the change (or the human-edited version)
outcome.trust;       // { before, after, delta } for (code-agent, code.fix)
outcome.checkpoint;  // undefined once review has receded for low-risk fixes
outcome.warrant;     // the hash-linked chain: intent -> diff -> checks -> outcome

As the ledger accrues verified, validated changes, that same call site graduates from "always ask a human" to "merge autonomously" — and reverts the moment the agent regresses. You don't rewire anything. The trajectory does it.

The same protocol scales to a higher-stakes frontier — an agent issuing refunds, moving money — riding the identical records, gate, and invariants (examples/refund). But the everyday win is your SDLC.

Status & scope

v0.1 DRAFT — the protocol is the deliverable; the code is proof.

Breaking changes expected before 1.0. Designed clean-room from first principles and public prior art only — append-only logs, content addressing, risk matrices, calibration, human-in-the-loop gating, and verification-vs-validation from systems engineering.

v0.1 ships

Normative record schemas + trust-state model, tiers T0–T4, invariants I1–I7
Pure gate() + declarative Policy matrix
Pure update() / replay() reducers
First-class Verify / Validate checks
Reference weighting: asymmetric + decay + near-miss ratchet + confidence cap
TS reference + Python mirror, in-memory + append-only-file store
One CLI checkpoint surface, a cross-language conformance suite, runnable examples: sdlc (everyday) + refund (frontier)

Explicitly deferred

Cryptographic identity / PKI / DIDs (the sig shape is reserved)
ML / statistical scoring beyond the reference weighting
Distributed ledgers & consensus
A web dashboard — shipping one first would betray the anti-fatigue thesis
Multi-agent delegation, framework plugins
Compliance-framework mapping

Star it on GitHub → Read the full SPEC