Cryptographic Receipts for AI Agent Actions

Every tool call, every decision, every piece of evidence — signed, hash-chained, and independently verifiable. Here's how we built an audit trail that proves what an AI agent actually did.

Tags: receipt-chain, cryptography, audit-trail, vaos, elixir

When an AI agent says it verified a claim, how do you know it actually did? When it says it called a tool with certain arguments and got a certain result, how do you prove that happened?

Most agent frameworks answer this with logs. Ordinary logs are mutable and unauthenticated: anyone with write access to the log store can alter or forge entries without leaving a trace. They are useful for debugging. They are not sufficient for accountability.

VAOS answers this with a receipt chain: a hash-linked sequence of Ed25519-signed attestations that records every tool invocation, every evidence assessment, and every decision an agent makes. Each receipt is independently verifiable. The chain is tamper-evident. If a single receipt is altered, its hash no longer matches the prev_hash stored in the next receipt, so the chain fails verification from that point on.

This is not theoretical. It runs in production today as four Elixir modules totaling roughly 800 lines.

Why This Matters

Three concrete scenarios where logs fail and receipts succeed:

Regulatory compliance. If your agent makes medical claims, financial recommendations, or legal assessments, you need to demonstrate what information it actually consulted. "The logs say so" does not survive an audit. A signed receipt chain with ALCOA+ metadata does.

Multi-agent disputes. When two agents disagree about what happened during a collaborative task, a shared receipt chain provides a single source of truth. Agent A claims it sent data to Agent B. The receipt either exists or it does not.

Post-incident forensics. After an agent produces a harmful output, you need to reconstruct the exact sequence of actions that led to it. Receipts give you a hash-linked timeline that cannot be retroactively edited.

The Architecture

The receipt chain has four components:

Tool Invocation
      |
      v
  [Bundle]  ──  Constructs the receipt (tool, args, result, timestamps)
      |
      v
  [Emitter] ──  Batches receipts, flushes to signing authority
      |
      v
  [Kernel]  ──  Signs attestation with Ed25519 private key
      |
      v
  [Verifier] ── Validates signatures and hash-chain integrity

Each component is independently deployable and independently testable.

Bundle: Structuring the Attestation

Receipt.Bundle takes a raw tool invocation and produces a structured map:

%{
  receipt_id: "rcpt_a1b2c3d4",
  tool_name: "investigate",
  agent_id: "daemon-primary",
  arguments_hash: "sha256:e3b0c44298fc...",
  result_hash: "sha256:7d793037a076...",
  timestamp: ~U[2026-03-25 14:32:01Z],
  prev_hash: "sha256:9f86d081884c...",
  alcoa: %{
    attributable: true,
    legible: true,
    contemporaneous: true,
    original: true,
    accurate: true
  }
}

The prev_hash field is the SHA-256 of the previous receipt in the chain. The first receipt in a session uses a genesis hash derived from the agent's identity and session start time. This creates an append-only linked structure where inserting, removing, or modifying any receipt invalidates everything after it.
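The linking rule can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the VAOS implementation: the receipt is hashed over its JSON encoding, the genesis hash is SHA-256 of the agent ID and session start joined by "|", and the names hashReceipt, genesisHash, and appendReceipt are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Receipt holds a subset of the fields shown above. Struct field order is
// fixed, so json.Marshal yields the same bytes for the same values.
type Receipt struct {
	ReceiptID string `json:"receipt_id"`
	ToolName  string `json:"tool_name"`
	AgentID   string `json:"agent_id"`
	Timestamp string `json:"timestamp"`
	PrevHash  string `json:"prev_hash"`
}

// sha256Tag returns "sha256:<hex digest>" for b.
func sha256Tag(b []byte) string {
	sum := sha256.Sum256(b)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// hashReceipt hashes the JSON encoding of r (an assumed serialization).
func hashReceipt(r Receipt) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err) // a struct of strings always encodes
	}
	return sha256Tag(b)
}

// genesisHash is a hypothetical derivation from agent identity and session start.
func genesisHash(agentID, sessionStart string) string {
	return sha256Tag([]byte(agentID + "|" + sessionStart))
}

// appendReceipt sets r.PrevHash to the hash of the current tail (or to the
// genesis hash for the first receipt) and appends r to the chain.
func appendReceipt(chain []Receipt, genesis string, r Receipt) []Receipt {
	if len(chain) == 0 {
		r.PrevHash = genesis
	} else {
		r.PrevHash = hashReceipt(chain[len(chain)-1])
	}
	return append(chain, r)
}

func main() {
	genesis := genesisHash("daemon-primary", "2026-03-25T14:32:00Z")
	var chain []Receipt
	chain = appendReceipt(chain, genesis, Receipt{ReceiptID: "rcpt_1", ToolName: "investigate", AgentID: "daemon-primary", Timestamp: "2026-03-25T14:32:01Z"})
	chain = appendReceipt(chain, genesis, Receipt{ReceiptID: "rcpt_2", ToolName: "investigate", AgentID: "daemon-primary", Timestamp: "2026-03-25T14:32:05Z"})

	fmt.Println(chain[1].PrevHash == hashReceipt(chain[0])) // true

	// Editing the first receipt changes its hash, so the link from the second breaks.
	chain[0].ToolName = "tampered"
	fmt.Println(chain[1].PrevHash == hashReceipt(chain[0])) // false
}
```

Whatever serialization is used, it has to be deterministic: two encodings of the same receipt that differ by a single byte produce different hashes and would break the link.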

Arguments and results are hashed rather than stored inline. The full payloads live in a separate evidence store. This keeps receipts small enough to transmit and verify quickly while still binding them to specific data.
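To illustrate how a hash binds a receipt to a payload held elsewhere, here is a minimal content-addressed store in Go. The EvidenceStore type and its Put and Matches methods are hypothetical; the article does not describe the real evidence store's interface.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// EvidenceStore is a hypothetical in-memory store keyed by payload digest.
type EvidenceStore map[string][]byte

// digest returns "sha256:<hex digest>" for payload.
func digest(payload []byte) string {
	sum := sha256.Sum256(payload)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// Put stores a copy of payload under its digest and returns the digest,
// which is what a receipt would carry in arguments_hash or result_hash.
func (s EvidenceStore) Put(payload []byte) string {
	d := digest(payload)
	s[d] = append([]byte(nil), payload...)
	return d
}

// Matches reports whether the store holds a payload for ref and that
// payload still hashes to ref.
func (s EvidenceStore) Matches(ref string) bool {
	p, ok := s[ref]
	return ok && digest(p) == ref
}

func main() {
	store := EvidenceStore{}
	argsHash := store.Put([]byte(`{"query":"example claim"}`))
	fmt.Println(store.Matches(argsHash)) // true

	// Corrupting the stored payload is detected by re-hashing it.
	store[argsHash][0] = 'X'
	fmt.Println(store.Matches(argsHash)) // false
}
```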

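The ALCOA flags (attributable, legible, contemporaneous, original, accurate) come from the data-integrity guidance used by the FDA and other regulators, and are commonly applied alongside 21 CFR Part 11, the FDA regulation on electronic records and electronic signatures. ALCOA+ extends the list with complete, consistent, enduring, and available. The flags are not decorative: they signal to downstream consumers whether this receipt meets specific data integrity requirements.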

Emitter: Batching and Delivery

Receipt.Emitter is an OTP GenServer that accumulates receipts in an in-memory queue, backed by pending/ storage on disk, and flushes them to the signing authority on a configurable interval. Batching exists because a signing round trip for every individual receipt would add unacceptable latency during high-throughput tool execution.

The emitter maintains a pending queue and a flush timer. When the timer fires or the queue reaches a size threshold, it:

  1. Serializes the batch as JSON
  2. POSTs to the kernel's /api/v1/receipts/attest endpoint
  3. On success, moves each receipt from pending/ to signed/ storage
  4. On failure, retries with exponential backoff

The pending-to-signed transition is atomic per receipt. If the process crashes mid-flush, receipts still in pending/ survive on disk and are retried on restart, so a crash does not drop receipts that were already persisted.
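The flush-and-retry logic can be sketched as follows. This is a simplified, single-goroutine illustration in Go rather than the Elixir GenServer itself: the Emitter struct, its fields, the attest callback (standing in for the POST to the kernel), and the one-second base and one-minute cap for backoff are all assumptions.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

var errUnavailable = errors.New("signing authority unavailable")

// Emitter is a hypothetical stand-in for the batching logic.
type Emitter struct {
	pending   []string                   // receipt IDs awaiting a signature
	signed    []string                   // receipt IDs the authority has signed
	threshold int                        // flush when the queue reaches this size
	attest    func(batch []string) error // stands in for the POST to the kernel
	failures  int                        // consecutive failed flushes
}

// backoff doubles base once per prior failure and caps the result at max.
func backoff(base, max time.Duration, failures int) time.Duration {
	d := base
	for i := 0; i < failures && d < max; i++ {
		d *= 2
	}
	if d > max {
		d = max
	}
	return d
}

// Emit queues a receipt and flushes once the size threshold is reached.
func (e *Emitter) Emit(id string) {
	e.pending = append(e.pending, id)
	if len(e.pending) >= e.threshold {
		e.Flush()
	}
}

// Flush sends the pending batch. On failure the batch stays pending and the
// returned duration is how long to wait before retrying; on success it is 0.
func (e *Emitter) Flush() time.Duration {
	if len(e.pending) == 0 {
		return 0
	}
	if err := e.attest(e.pending); err != nil {
		e.failures++
		return backoff(time.Second, time.Minute, e.failures-1)
	}
	e.signed = append(e.signed, e.pending...)
	e.pending = nil
	e.failures = 0
	return 0
}

func main() {
	up := false
	e := &Emitter{threshold: 2, attest: func(batch []string) error {
		if !up {
			return errUnavailable
		}
		return nil
	}}

	e.Emit("rcpt_1")
	e.Emit("rcpt_2") // threshold reached; the flush fails and both stay pending
	fmt.Println(len(e.pending), len(e.signed))

	up = true
	e.Flush() // the authority is back, so the batch moves to signed
	fmt.Println(len(e.pending), len(e.signed))
}
```

The sketch keeps both queues in memory and moves the whole batch at once. The per-receipt atomic move between pending/ and signed/ on disk is the part that provides crash safety and is not modeled here.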

Kernel: Signing

The VAOS Kernel is a Go binary that holds the Ed25519 private key. It receives attestation requests over HTTP, signs the receipt hash, and returns the signature.

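The signing operation is deliberately minimal: ed25519.Sign(privateKey, []byte(attestation_hex)). The kernel signs the ASCII bytes of the hex-encoded attestation string, not the raw hash bytes that string decodes to. This is an intentional design choice: the signature covers the human-readable representation, which makes manual verification straightforward with standard tools. The consequence is that a verifier must check the signature against the same hex string; checking it against the decoded bytes fails.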

The kernel never sees the full receipt contents. It signs a hash. This means the signing authority cannot read the agent's tool arguments or results, providing a separation between the audit function and the operational data.
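A minimal sketch of the signing step, using Go's standard crypto/ed25519 package. It assumes the attestation is the hex-encoded SHA-256 of the serialized receipt; the helper names (newKeyPair, attestationFor, signAttestation, verifyAttestation) are hypothetical, and a real kernel would load its key from secure storage rather than generate one at startup.

```go
package main

import (
	"crypto/ed25519"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// newKeyPair generates a fresh Ed25519 key pair for the demonstration.
func newKeyPair() (ed25519.PublicKey, ed25519.PrivateKey) {
	pub, priv, err := ed25519.GenerateKey(rand.Reader)
	if err != nil {
		panic(err)
	}
	return pub, priv
}

// attestationFor returns the hex-encoded SHA-256 of the serialized receipt.
// This derivation is an assumption made for the sketch.
func attestationFor(receipt []byte) string {
	sum := sha256.Sum256(receipt)
	return hex.EncodeToString(sum[:])
}

// signAttestation signs the ASCII bytes of the hex string, not the 32 raw
// hash bytes the string decodes to.
func signAttestation(priv ed25519.PrivateKey, attestationHex string) []byte {
	return ed25519.Sign(priv, []byte(attestationHex))
}

// verifyAttestation checks the signature over the same hex string bytes.
func verifyAttestation(pub ed25519.PublicKey, attestationHex string, sig []byte) bool {
	return ed25519.Verify(pub, []byte(attestationHex), sig)
}

func main() {
	pub, priv := newKeyPair()
	attestationHex := attestationFor([]byte(`{"receipt_id":"rcpt_a1b2c3d4"}`))
	sig := signAttestation(priv, attestationHex)

	fmt.Println(verifyAttestation(pub, attestationHex, sig)) // true

	// Verifying against the decoded hash bytes fails: the signed message was
	// the 64-character hex string, not the 32 bytes it represents.
	raw, _ := hex.DecodeString(attestationHex)
	fmt.Println(ed25519.Verify(pub, raw, sig)) // false
}
```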

Verifier: Validation

Receipt.Verifier performs two checks:

  1. Signature verification. Given a receipt, its attestation, and the kernel's public key, verify the Ed25519 signature. This proves that the holder of the kernel's private key attested to this specific receipt; it does not by itself prove when the signature was made.

  2. Chain integrity. Walk the receipt chain from newest to oldest, verifying that each receipt's prev_hash matches the SHA-256 of the preceding receipt. If any hash mismatches, the chain is broken and the break point is identified.
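The chain-integrity walk can be sketched like this. It assumes receipts are hashed over their JSON encoding and that the caller supplies the genesis hash; the Receipt struct is trimmed to the fields the walk needs, and verifyChain is a hypothetical name. Signature checking is left out to keep the example focused.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// Receipt is trimmed to the fields needed for the chain walk.
type Receipt struct {
	ReceiptID string `json:"receipt_id"`
	ToolName  string `json:"tool_name"`
	PrevHash  string `json:"prev_hash"`
}

// hashReceipt hashes the JSON encoding of r (an assumed serialization).
func hashReceipt(r Receipt) string {
	b, err := json.Marshal(r)
	if err != nil {
		panic(err)
	}
	sum := sha256.Sum256(b)
	return "sha256:" + hex.EncodeToString(sum[:])
}

// verifyChain walks from newest to oldest. It returns -1 if every prev_hash
// matches, otherwise the index of the first receipt found whose prev_hash
// does not match the hash of its predecessor (or the genesis hash at index 0).
func verifyChain(chain []Receipt, genesis string) int {
	for i := len(chain) - 1; i >= 0; i-- {
		want := genesis
		if i > 0 {
			want = hashReceipt(chain[i-1])
		}
		if chain[i].PrevHash != want {
			return i
		}
	}
	return -1
}

func main() {
	genesis := "sha256:genesis-placeholder"
	a := Receipt{ReceiptID: "rcpt_1", ToolName: "investigate", PrevHash: genesis}
	b := Receipt{ReceiptID: "rcpt_2", ToolName: "investigate", PrevHash: hashReceipt(a)}
	c := Receipt{ReceiptID: "rcpt_3", ToolName: "investigate", PrevHash: hashReceipt(b)}
	chain := []Receipt{a, b, c}

	fmt.Println(verifyChain(chain, genesis)) // -1: chain intact

	// Altering rcpt_2 is detected at rcpt_3, whose prev_hash no longer matches.
	chain[1].ToolName = "tampered"
	fmt.Println(verifyChain(chain, genesis)) // 2
}
```

Note that the reported index is the receipt whose link failed, which is the one immediately after the altered receipt.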

There is also a Replayer module that can reconstruct the full history of an agent session from its receipt chain, producing a human-readable timeline of every action taken.

What This Does Not Do

Receipts prove sequence and integrity. They do not prove correctness.

A receipt proves that the investigate tool was called with specific arguments and returned a specific result. It does not prove that the result was factually accurate. That is the job of the epistemic governance layer (vaos-ledger) which tracks claim confidence, evidence quality, and adversarial challenges.

Receipts also do not prevent an agent from taking a bad action. They make a recorded action infeasible to deny after the fact without breaking the chain or forging a signature. This is accountability, not prevention.

The current implementation signs against a single kernel. There is no multi-party signing, no threshold signatures, and no decentralized attestation. These are future work items for scenarios requiring consensus among multiple authorities.

The Independence Question

Until this week, the receipt chain was gated behind a VAS_SWARM_ENABLED flag -- a leftover from an earlier architecture where receipts were bundled with a gRPC integration layer. This meant you could not run receipts without also enabling a network client to an external kernel service.

We fixed this by giving receipts their own RECEIPT_CHAIN_ENABLED flag and removing 1,300 lines of dead VAS Swarm code that the receipt chain never actually depended on. The receipt modules have zero imports from any networking or integration layer. They always were independent; now the configuration reflects that.

This matters because it means you can run a receipt chain in a completely offline agent. The emitter stores pending receipts to disk and flushes them when a signing authority becomes available. The hash links can be checked at any time without network access; the kernel signatures are added once a flush succeeds.

Running It

Enable receipts with a single environment variable:

RECEIPT_CHAIN_ENABLED=true mix phx.server

Verify the chain:

curl localhost:8089/api/v1/receipts/verify
# {"status": "valid", "verified_count": 47, "chain_intact": true}

Replay a session:

curl 'localhost:8089/api/v1/receipts/replay?session_id=sess_abc123'
# Returns ordered timeline of every tool invocation with timestamps

What Comes Next

The immediate next step is wiring emit_async calls into the tool executor, so every tool invocation automatically generates a receipt without the tool author needing to think about it. This is infrastructure -- the receipt chain should be invisible to tool developers and unavoidable for auditors.

Longer term, we are exploring receipt aggregation across multi-agent systems, where agents operating in different processes or on different machines contribute to a shared receipt chain with causal ordering.

The code is open. The receipt chain modules are part of an upstream PR to OSA, the agent framework VAOS extends.