Accountability
Verifiable identity, audit-grade liability records, reputation, and agent-to-agent settlement. The framework's answer to: who is responsible for what an agent did, can it be re-verified later, and can it transact across trust boundaries?
The com.tnsai.identity, com.tnsai.accountability, and com.tnsai.payment packages in tnsai-core ship the primitives:
- AgentPrincipal + AgentPrincipalProvider: verifiable identity with version, runtime fingerprint, and optional attestation
- AgentLiabilityRecord + LiabilitySink: append-only audit log of every dispatch
- AuthorityScope: pre-declared bounds (allowed actions, target systems, spend ceiling, validity window)
- ReputationLedger + ReputationScore: class-weighted aggregate score derived from liability history
- PaymentBroker + Quote + Settlement: agent-to-agent settlement with idempotency
Pairs with Long-Running Runs (every checkpoint can carry the principal that produced it) and Cost Governance (the PaymentBroker and AuthorityScope.spendCeilingUSD cooperate with CostBudget to bound spend).
Why a separate layer
The framework already has rich observability — LLMCallLog, OpenTelemetry spans, hook events. That layer answers what happened. Accountability answers a different question: who is responsible, with attribution strong enough that an external auditor (insurance underwriter, compliance review, regulator) trusts the record.
Three forces are pushing this:
- Insurance / institutional governance: AIUC-1 certified agent providers need agent identity, action audit, liability scoping, and a certification standard.
- Decentralized agent economy: ERC-8004 (on-chain identity) and x402 (HTTP 402 micropayments) let agents prove identity and transact peer-to-peer without a central authority.
- Regulated deployments: security-domain harnesses (XBOW-style benchmarks, code-review automation) produce findings with real-world consequences. Reproducibility and attribution become non-optional.
Without a first-class accountability SPI, every consumer rebuilds the same bookkeeping ad-hoc.
Quick start
import com.tnsai.identity.*;
import com.tnsai.accountability.*;
import com.tnsai.agents.AgentBuilder;
import java.nio.file.Path;
import java.util.List;
// 1. Mint a principal for the agent. LocalIdentityProvider is the
// process-local default; external providers (ERC-8004, AIUC-1, X.509)
// plug in via the same SPI.
AgentPrincipalProvider provider = new LocalIdentityProvider();
AgentPrincipal principal = provider.issue(
    AgentDescriptor.builder()
        .agentClass("com.example.ResearchAgent")
        .systemPrompt(systemPromptText)
        .toolNames(List.of("web.search", "github.create_pr"))
        .model("openai:gpt-5")
        .buildHash(buildHashFromManifest())
        .build());
// 2. Pick an audit sink. JSON-Lines on disk is a good default; swap in
// a custom adapter for AIUC-1 / S3-with-object-lock when needed.
LiabilitySink sink = new FilesystemLiabilitySink(Path.of("/var/audit/agent.jsonl"));
// 3. Optionally narrow the scope. Auditors compare attempted actions
// against this declared bound to flag scope violations.
AuthorityScope scope = new AuthorityScope(
    java.util.Set.of(com.tnsai.enums.ActionType.LLM,
                     com.tnsai.enums.ActionType.MCP_TOOL),
    java.util.Set.of("github.com/org/repo", "openai.com/v1"),
    java.util.Optional.of(new java.math.BigDecimal("50.00")),
    java.time.Duration.ofHours(1),
    java.time.Instant.now());
// 4. Wire on the AgentBuilder.
Agent agent = AgentBuilder.create()
    .id("research-agent-1")
    .llm(...)
    .role(myRole)
    .principal(principal)
    .liabilitySink(sink)
    .authorityScope(scope)
    .build();

From this point on, every ActionExecutor.execute(...) dispatch produces one AgentLiabilityRecord published to the configured sink: successful actions carry a result hash, failing actions carry an ERROR:-prefixed outcome.
Identity
AgentPrincipal is the audit primitive. Four fields:
| Field | Meaning |
|---|---|
| id() | Stable identifier across every dispatch from the same logical agent instance. UUID for LocalIdentityProvider; DID / on-chain address for external providers. |
| version() | Stamp that changes when the agent's behaviour changes (system prompt, tool list, model swap). Audit findings against a prior version don't apply once the version flips. |
| attestation() | Optional cryptographic / external-system proof binding the identity to an outside trust root (ERC-8004, X.509, AIUC-1, JWT). Empty for LocalIdentityProvider. |
| runtimeFingerprint() | Hash of build artefact + config. Two physical processes running the exact same binary with the exact same config produce the same fingerprint. |
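The fingerprint property in the last row (same binary plus same config gives the same value) can be illustrated with a short sketch. This is not the framework's algorithm: the class name, the choice of SHA-256, and the idea of hashing config entries in sorted key order are all assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in; the real fingerprint derivation is not documented here.
final class FingerprintSketch {

    /** Hashes the build hash, then each config entry in sorted key order. */
    static String fingerprint(String buildHash, Map<String, String> config) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            sha256.update(buildHash.getBytes(StandardCharsets.UTF_8));
            // TreeMap sorts by key, so map iteration order cannot change the result.
            for (Map.Entry<String, String> e : new TreeMap<>(config).entrySet()) {
                sha256.update((byte) 0); // separator byte between entries
                sha256.update(e.getKey().getBytes(StandardCharsets.UTF_8));
                sha256.update((byte) '=');
                sha256.update(e.getValue().getBytes(StandardCharsets.UTF_8));
            }
            return HexFormat.of().formatHex(sha256.digest());
        } catch (NoSuchAlgorithmException ex) {
            throw new IllegalStateException("SHA-256 unavailable", ex);
        }
    }

    public static void main(String[] args) {
        String first = fingerprint("build-1", Map.of("model", "m1", "region", "eu"));
        String second = fingerprint("build-1", Map.of("region", "eu", "model", "m1"));
        System.out.println(first.equals(second)); // true: same build, same config
    }
}
```

Sorting the config before hashing is what makes two processes with identical settings agree even if they loaded those settings in a different order.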
AgentPrincipalProvider is the SPI for issuing and verifying. LocalIdentityProvider is the framework default — UUID-based, HMAC-SHA-256 signing, no external trust root. Suitable for single-tenant deployments where the framework is the trust boundary.
// Mint and sign an identity.
LocalIdentityProvider p = new LocalIdentityProvider();
AgentPrincipal me = p.issue(spec);
byte[] sig = p.sign(me);
// Re-verify later (round-trips through the same instance only;
// cross-process verification needs an external attestation backend).
Optional<AgentPrincipal> verified = p.verify(me.id(), sig);
AgentPrincipal vs AgentIdentity. The pre-existing com.tnsai.models.agent.AgentIdentity is a personality trait ("analytical", "friendly") used to shape system prompts. AgentPrincipal is a verifiable security primitive. They coexist because they address different concerns.
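To make the "same instance only" caveat on verification concrete, here is a minimal sketch of HMAC-SHA-256 signing with an in-process key, using the standard javax.crypto API. The class name, the signed payload (id and version joined by a newline), and the per-instance random key are assumptions; LocalIdentityProvider's actual internals may differ.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical stand-in for an HMAC-based local provider.
final class HmacIdentitySketch {
    private final SecretKeySpec key;

    HmacIdentitySketch() {
        // Assumption: a fresh random key per instance, never persisted or shared.
        byte[] secret = new byte[32];
        new SecureRandom().nextBytes(secret);
        this.key = new SecretKeySpec(secret, "HmacSHA256");
    }

    byte[] sign(String agentId, String version) {
        try {
            Mac mac = Mac.getInstance("HmacSHA256");
            mac.init(key);
            return mac.doFinal((agentId + "\n" + version).getBytes(StandardCharsets.UTF_8));
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("HmacSHA256 unavailable", e);
        }
    }

    boolean verify(String agentId, String version, byte[] signature) {
        // MessageDigest.isEqual compares in constant time.
        return MessageDigest.isEqual(sign(agentId, version), signature);
    }

    public static void main(String[] args) {
        HmacIdentitySketch issuer = new HmacIdentitySketch();
        byte[] sig = issuer.sign("agent-1", "v1");
        System.out.println(issuer.verify("agent-1", "v1", sig));                   // true
        System.out.println(new HmacIdentitySketch().verify("agent-1", "v1", sig)); // false: different key
    }
}
```

Because the key lives only inside one instance, a second process cannot check the signature, which is why cross-process verification needs an external attestation backend.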
Liability records
Every dispatch produces an AgentLiabilityRecord:
public record AgentLiabilityRecord(
    String agentId,              // from AgentPrincipal.id()
    String agentVersion,         // from AgentPrincipal.version()
    Instant invokedAt,
    Action action,               // typed audit-grade descriptor
    AuthorityScope scope,        // bounds in force at invocation
    LiabilityClass liability,    // NONE / LOW / MEDIUM / HIGH / REGULATORY
    String correlationId,        // links to broader OTel trace / LLMCallLog
    Optional<String> outcome,    // result hash or ERROR: message
    Optional<byte[]> signature   // when the provider is signing records
) {}

The record is append-only by contract; implementations should enforce write-once semantics. RecordingLiabilitySink (in-memory, for tests) and FilesystemLiabilitySink (JSON-Lines on disk) ship as defaults; external adapters (AIUC1LiabilitySink, S3LiabilitySink with object-lock) follow the same SPI.
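As a rough picture of what an append-only JSON-Lines sink does, the sketch below writes one JSON object per line and only ever opens the file in append mode. The class name, the publish signature, and the three-field schema are assumptions for illustration; the on-disk format of FilesystemLiabilitySink is not specified in this document.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.List;

// Hypothetical stand-in for a JSON-Lines audit sink.
final class JsonLinesSinkSketch {
    private final Path file;

    JsonLinesSinkSketch(Path file) {
        this.file = file;
    }

    /** Appends exactly one line per record; existing lines are never rewritten. */
    synchronized void publish(String agentId, String agentVersion, String outcome) {
        String line = "{\"agentId\":\"" + escape(agentId)
                + "\",\"agentVersion\":\"" + escape(agentVersion)
                + "\",\"outcome\":\"" + escape(outcome) + "\"}\n";
        try {
            Files.writeString(file, line, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    List<String> readAll() {
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Minimal escaping so a value cannot break out of its JSON string or its line.
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        Path tmp = Path.of(System.getProperty("java.io.tmpdir"),
                "liability-sketch-" + System.nanoTime() + ".jsonl");
        JsonLinesSinkSketch sink = new JsonLinesSinkSketch(tmp);
        sink.publish("agent-1", "v1", "ERROR: timeout");
        System.out.println(sink.readAll());
    }
}
```

Append mode alone does not stop another process from truncating the file, so stronger write-once guarantees would have to come from the storage layer, as the object-lock adapters mentioned above suggest.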
Liability tiers
| Tier | Use case |
|---|---|
| NONE | Internal plumbing: log emit, counter increment. Auditors typically skip. |
| LOW | Cheap-to-reverse side effect: read-only external lookup, cache write. |
| MEDIUM | Mutates state inside the framework or a system the framework controls: DB write, internal queue put. |
| HIGH | External-system mutation with material consequence: public post, GitHub PR, build trigger. |
| REGULATORY | Regulated reach: payment, regulated data egress, anything covered by a compliance regime. Auditors review every record. |
The framework's default classification by dispatch type:
| ActionType | Default tier |
|---|---|
| LOCAL | LOW |
| LLM / MCP_TOOL / WEB_SERVICE | MEDIUM |
Operators with stricter classification needs override at the sink boundary (write a wrapping sink that reclassifies before persisting) or by attaching tier metadata via a custom action descriptor.
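The wrapping-sink approach can be sketched as a decorator that rewrites the tier before delegating. All types here are simplified stand-ins, not the framework's: the Sink interface, its publish method, the three-field Rec record, and the "regulated targets" rule are assumptions. Only the default tier mapping mirrors the table above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified stand-ins for the framework types.
final class ReclassifyingSinkSketch {
    enum ActionType { LOCAL, LLM, MCP_TOOL, WEB_SERVICE }
    enum LiabilityClass { NONE, LOW, MEDIUM, HIGH, REGULATORY }

    record Rec(ActionType type, String target, LiabilityClass liability) {}

    interface Sink {
        void publish(Rec record);
    }

    /** Default classification from the table: LOCAL is LOW, everything else MEDIUM. */
    static LiabilityClass defaultTier(ActionType type) {
        return type == ActionType.LOCAL ? LiabilityClass.LOW : LiabilityClass.MEDIUM;
    }

    /** Escalates records aimed at regulated targets, then hands them to the wrapped sink. */
    static final class EscalatingSink implements Sink {
        private final Sink delegate;
        private final Set<String> regulatedTargets;

        EscalatingSink(Sink delegate, Set<String> regulatedTargets) {
            this.delegate = delegate;
            this.regulatedTargets = Set.copyOf(regulatedTargets);
        }

        @Override
        public void publish(Rec record) {
            LiabilityClass tier = regulatedTargets.contains(record.target())
                    ? LiabilityClass.REGULATORY
                    : record.liability();
            delegate.publish(new Rec(record.type(), record.target(), tier));
        }
    }

    public static void main(String[] args) {
        List<Rec> stored = new ArrayList<>();
        Sink sink = new EscalatingSink(stored::add, Set.of("payments.example.com"));
        sink.publish(new Rec(ActionType.WEB_SERVICE, "payments.example.com",
                defaultTier(ActionType.WEB_SERVICE)));
        System.out.println(stored.get(0).liability()); // REGULATORY
    }
}
```

Keeping the reclassification in a decorator leaves the persisting sink unchanged, so the same policy can sit in front of any backend.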
Authority scope
AuthorityScope declares what the agent is allowed to do at dispatch time:
AuthorityScope scope = new AuthorityScope(
    Set.of(ActionType.LLM, ActionType.MCP_TOOL),      // allowed action types
    Set.of("github.com/org/repo", "openai.com/v1"),   // allowed targets
    Optional.of(new BigDecimal("50.00")),             // spend ceiling USD
    Duration.ofHours(1),                              // validity window
    Instant.now());
scope.permits(ActionType.LLM);                             // true
scope.permits(ActionType.LOCAL);                           // false (not in whitelist)
scope.permitsTarget("evil.com");                           // false
scope.isExpired(Instant.now().plus(Duration.ofHours(2)));  // true

Empty whitelist = permissive (caller chose not to narrow). Populated whitelist = explicit membership check. Auditors compare each record's action against scope to flag violations; the default InMemoryReputationLedger counts scope violations as misses.
AuthorityScope.unrestricted(validFor) is the maximally-permissive default — auditable but unconstraining.
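The "empty means permissive" rule is easy to get backwards, so here is a minimal sketch of how such checks could be written. The Scope record is a stand-in, not the framework's AuthorityScope: it omits the spend ceiling, and exact-string target matching and the expiry comparison are assumptions.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Set;

// Hypothetical stand-in showing one way the whitelist semantics could work.
final class ScopeSketch {
    enum ActionType { LOCAL, LLM, MCP_TOOL, WEB_SERVICE }

    record Scope(Set<ActionType> allowedActions, Set<String> allowedTargets,
                 Duration validFor, Instant issuedAt) {

        /** Empty whitelist permits everything; otherwise membership is required. */
        boolean permits(ActionType type) {
            return allowedActions.isEmpty() || allowedActions.contains(type);
        }

        /** Same rule for targets; exact string match is an assumption. */
        boolean permitsTarget(String target) {
            return allowedTargets.isEmpty() || allowedTargets.contains(target);
        }

        /** Expired once 'now' is past issuedAt + validFor. */
        boolean isExpired(Instant now) {
            return now.isAfter(issuedAt.plus(validFor));
        }
    }

    public static void main(String[] args) {
        Instant issued = Instant.parse("2024-01-01T00:00:00Z");
        Scope open = new Scope(Set.of(), Set.of(), Duration.ofHours(1), issued);
        Scope narrow = new Scope(Set.of(ActionType.LLM), Set.of("openai.com/v1"),
                Duration.ofHours(1), issued);
        System.out.println(open.permits(ActionType.LOCAL));   // true: nothing was narrowed
        System.out.println(narrow.permits(ActionType.LOCAL)); // false: not in whitelist
    }
}
```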
Reputation
ReputationLedger derives a normalised [0.0, 1.0] score from an agent's liability history:
LiabilitySink sink = new FilesystemLiabilitySink(Path.of("audit.jsonl"));
ReputationLedger ledger = new InMemoryReputationLedger(sink);
ReputationScore score = ledger.scoreOf("research-agent-1");
// score.score() = 1.0 - (weighted_misses / weighted_total)
// score.recordCount() = how many records the score was computed from

The default InMemoryReputationLedger weighting:
- Pass: outcome is non-error AND action is permitted by its declared scope (action type + target).
- Miss: anything else (error outcome, scope-violating action, target violation).
- Class weights: NONE/LOW = 1, MEDIUM = 3, HIGH = 9, REGULATORY = 27. One regulatory miss outweighs many low-tier misses.
Unknown agents (no records) get 1.0: innocent until proven otherwise. Operators with stricter onboarding override this in a custom ledger.
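A worked example helps show how strongly the class weights skew the score. The sketch below applies the formula 1.0 - (weighted_misses / weighted_total) with the weights listed above; the class and record names are stand-ins, and pass/miss is reduced to a boolean rather than derived from real liability records.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the weighted scoring described above.
final class ReputationSketch {
    enum LiabilityClass {
        NONE(1), LOW(1), MEDIUM(3), HIGH(9), REGULATORY(27);

        final int weight;

        LiabilityClass(int weight) {
            this.weight = weight;
        }
    }

    record Outcome(LiabilityClass tier, boolean pass) {}

    /** score = 1.0 - weightedMisses / weightedTotal; an empty history scores 1.0. */
    static double score(List<Outcome> history) {
        if (history.isEmpty()) {
            return 1.0;
        }
        long total = 0;
        long misses = 0;
        for (Outcome o : history) {
            total += o.tier().weight;
            if (!o.pass()) {
                misses += o.tier().weight;
            }
        }
        return 1.0 - (double) misses / total;
    }

    public static void main(String[] args) {
        List<Outcome> history = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            history.add(new Outcome(LiabilityClass.LOW, true));
        }
        history.add(new Outcome(LiabilityClass.REGULATORY, false));
        // weighted total = 10 * 1 + 27 = 37, weighted misses = 27
        System.out.println(score(history)); // 1 - 27/37, about 0.27
    }
}
```

Ten clean low-tier dispatches followed by a single regulatory miss leave the agent at roughly 0.27, whereas a single low-tier miss among eleven low-tier records would leave it at about 0.91.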
// Walk recent history for a stale-score check.
List<AgentLiabilityRecord> last24h = ledger.historyOf(
    "research-agent-1", Duration.ofHours(24));

Payment
PaymentBroker is the SPI for agent-to-agent settlement. Two-phase contract: quote prices the work, settle moves funds.
PaymentBroker broker = new NoOpPaymentBroker(); // 0.00 USD, instant settlement
Service work = new Service("github.create_pr", "request",
    BigDecimal.ONE, Optional.empty(), Map.of());
Quote q = broker.quote(payerPrincipal, payeePrincipal, work);
Settlement result = broker.settle(q);
switch (result) {
    case Settlement.Settled s        -> { /* funds moved; s.transactionId() */ }
    case Settlement.AlreadySettled a -> { /* idempotency replay */ }
    case Settlement.Rejected r       -> { /* broker policy refused */ }
    case Settlement.Expired e        -> { /* quote past expiresAt */ }
}

Idempotency is built in: Quote.idempotencyKey is opaque to the framework and derived by the broker. A second settle call with the same key returns Settlement.AlreadySettled carrying the original transactionId, so retrying a flaky network call doesn't double-charge.
NoOpPaymentBroker is the default for internal multi-agent (no real money flow). External brokers — X402PaymentBroker (HTTP 402 micropayment), StripePaymentBroker, ledger-backed adapters — plug into the same SPI.
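The idempotency contract described above can be sketched with an in-memory map keyed by idempotency key. This is an illustration only: the class is a stand-in that models just the Settled and AlreadySettled variants, takes the key as a plain string instead of a Quote, and uses random UUIDs as transaction ids, none of which is claimed to match a real broker.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an idempotent broker; not a framework class.
final class IdempotentBrokerSketch {
    sealed interface Settlement permits Settled, AlreadySettled {}
    record Settled(String transactionId) implements Settlement {}
    record AlreadySettled(String transactionId) implements Settlement {}

    private final Map<String, String> transactionByKey = new ConcurrentHashMap<>();

    /** First call for a key settles; later calls replay the original transaction id. */
    Settlement settle(String idempotencyKey) {
        String fresh = UUID.randomUUID().toString();
        // putIfAbsent is atomic, so two concurrent retries cannot both win.
        String existing = transactionByKey.putIfAbsent(idempotencyKey, fresh);
        return existing == null ? new Settled(fresh) : new AlreadySettled(existing);
    }

    public static void main(String[] args) {
        IdempotentBrokerSketch broker = new IdempotentBrokerSketch();
        Settlement first = broker.settle("quote-123");
        Settlement retry = broker.settle("quote-123");
        System.out.println(first instanceof Settled);        // true
        System.out.println(retry instanceof AlreadySettled); // true
    }
}
```

A production broker would also need to persist the key-to-transaction mapping, since an in-memory map forgets settled keys on restart.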
What's not in this layer
Out of scope for the framework primitives, tracked separately:
- ERC-8004 / x402 / AIUC-1 actual adapter implementations: open per-adapter follow-up issues; the SPI is stable today
- UI for liability record viewing / reputation dashboard: LiabilitySink.query is the data plane; UI is downstream
- KMS integration for cryptographic key management: LocalIdentityProvider uses an in-process HMAC key; KMS-backed providers ship as separate adapters
- Capability-level liability declaration (@Capability(liability = HIGH)): annotation surface enhancement; tracked separately
See Also
- Long-Running Runs: checkpoint records can carry AgentPrincipal for audit continuity across resumes
- Cost Governance: AuthorityScope.spendCeilingUSD is the per-scope counterpart to CostBudget's tenant-wide spend ledger
- Approvals and Annotations: @ApprovalRequired works alongside accountability; approvals gate access, liability records track the gated action
- Redaction: Action.parameters snapshots must be redaction-safe; the redaction layer is the source of truth for what can leave the trust boundary