Tutorial: Declarative resilience with @Traced, @Metered, and @Fallback
Wrap every action dispatch with structured tracing, latency + counter metrics, and automatic recovery on failure — through annotations alone. The framework's ActionExecutor applies the decorators in canonical order; your action body stays focused on the business logic.
Prerequisites
- Installation
- A Role with at least one @ActionSpec method that calls something flaky (external API, slow disk, untrusted input)
The role
Full source lives at tnsai-integration/src/main/java/com/tnsai/integration/scop/examples/PaymentRole.java and is exercised by PaymentRoleResilienceIntegrationTest. The shape:
@RoleSpec(
    name = "PaymentAgent",
    description = "Charges accounts via a flaky external API; falls back to queueing on failure.",
    responsibilities = {
        @Responsibility(name = "Payments", actions = {"charge"})
    }
)
public class PaymentRole extends Role {
    // ... primary action + fallback method below
}

The decorator stack
For each annotated action, ActionExecutor.executeInternal wraps the dispatch like this:
open Traced span        ← MDC trace id + structured span.start log
    ↓
start Metered timer     ← record latency + counter on close
    ↓
PRIMARY EXECUTION       ← your action body (or LOCAL/WEB_SERVICE/LLM/MCP executor)
    ↓
on success → record success on metered + traced, return result
on failure → invoke @Fallback (with retries), record outcome,
             return fallback's value or rethrow
    ↓
finally: close metered (record entry) + traced (emit span.end)

Every annotation is independent: declare any subset.
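The ordering above can be sketched in plain Java. This is only an illustration under stated assumptions, not the code in ActionExecutor: DecoratorStackSketch, dispatch, and the event strings are hypothetical names, and the events list stands in for the log lines and metric updates the real decorators would produce.

```java
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for the wrapping order; not the framework's ActionExecutor. */
final class DecoratorStackSketch {

    /** Runs primary inside trace + meter brackets and recovers via fallback on failure. */
    static <T> T dispatch(Supplier<T> primary,
                          Function<Throwable, T> fallback,
                          List<String> events) {
        events.add("span.start");   // open the traced span first
        events.add("timer.start");  // then start the metered timer
        try {
            try {
                T result = primary.get();
                events.add("primary.success");
                return result;
            } catch (RuntimeException failure) {
                events.add("primary.failure");
                T recovered = fallback.apply(failure); // if this throws, the caller sees it
                events.add("fallback.success");
                return recovered;
            }
        } finally {
            // Always runs: close in reverse order of opening.
            events.add("metered.close");
            events.add("span.end");
        }
    }
}
```

Calling dispatch with a primary that throws and a fallback that returns a value yields the fallback's value, with the close events recorded last in both the success and the failure path.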
Pattern 1 — @Traced for span-style observability
@ActionSpec(type = ActionType.LOCAL, description = "Charge a customer's account.")
@Traced(operationName = "payment.charge", includeArgs = true)
public String charge(
    @Param(name = "requestId", description = "Idempotency token") String requestId,
    @Param(name = "amount", description = "Amount in minor units (cents)") int amount
) {
    // ...
}

What happens at runtime:
- A trace id (UUID) is pushed onto the SLF4J MDC under the key traceId. Every log line emitted from this dispatch (and any nested logger calls) carries the same id, making post-hoc grep correlation trivial.
- Two structured log lines bracket the dispatch:
  span.start op=payment.charge traceId=... args={requestId=req-1, amount=100}
  span.end op=payment.charge traceId=... duration_ms=42 status=success
- On failure: span.end ... status=failure error=RuntimeException
- Nested traced calls within the same dispatch reuse the outer span's trace id, so neither span's id evicts the other.
includeArgs and includeResult default to false — opt in only when you need the data in the log (PII risk).
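The trace-id reuse for nested calls can be illustrated with a small sketch. It assumes a ThreadLocal as a stand-in for the SLF4J MDC so the example runs on the JDK alone; TraceScopeSketch, traced, and currentTraceId are hypothetical names, not the framework's TracedResolver API.

```java
import java.util.UUID;
import java.util.function.Supplier;

/** Hypothetical sketch of trace-id reuse; a ThreadLocal stands in for the SLF4J MDC. */
final class TraceScopeSketch {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    /** Returns the id of the active span on this thread, or null when none is open. */
    static String currentTraceId() {
        return TRACE_ID.get();
    }

    /** Runs body under a trace id, creating a new id only when no outer span exists. */
    static <T> T traced(Supplier<T> body) {
        boolean outermost = TRACE_ID.get() == null;
        if (outermost) {
            TRACE_ID.set(UUID.randomUUID().toString());
        }
        try {
            return body.get();
        } finally {
            if (outermost) {
                TRACE_ID.remove(); // only the span that set the id clears it
            }
        }
    }
}
```

Because only the outermost call sets and clears the id, an inner traced call sees the same id as its caller, and the outer id is still in place after the inner call returns.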
Pattern 2 — @Metered for counter + latency histogram
@Metered(name = "payment.charge", histogram = true)
public String charge(...) { ... }

The framework records each invocation in an in-memory ResilienceMetrics registry:
- Counters: total, success, failure (LongAdder under the hood for hot-path performance).
- Histogram (when histogram = true): a fixed ladder of latency buckets: 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s, +∞.
- Default name: ClassName.methodName when name is empty.
- Class-level inheritance: @Metered on the Role class applies to every action that doesn't declare its own (method-level wins entirely, with no merging).
Tests can snapshot the registry to assert metrics fired:
ResilienceMetrics.MetricSnapshot snap = ResilienceMetrics.getInstance()
        .snapshot("payment.charge").orElseThrow();
assertEquals(50, snap.total());
assertTrue(snap.successRate() > 0.4);

Phase 2 will add a MetricsSink SPI for routing the same emissions to Micrometer / Prometheus / OpenTelemetry. Phase 1 stays in-memory, the same boundary as the other Phase-1 batches in this sprint.
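To make the counter and histogram bookkeeping concrete, here is a minimal sketch of a single metric entry. It is not the ResilienceMetrics implementation: MetricEntrySketch and its methods are hypothetical, the bucket values are copied from the list in this section, and treating each value as an inclusive upper bound is an assumption.

```java
import java.util.concurrent.atomic.LongAdder;

/** Hypothetical sketch of one metric entry; not the framework's ResilienceMetrics. */
final class MetricEntrySketch {
    // Bucket values in milliseconds, assumed to be inclusive upper bounds.
    // One extra slot past the end of this array is the +infinity bucket.
    private static final long[] BOUNDS_MS =
            {1, 5, 10, 25, 50, 100, 250, 500, 1_000, 2_500, 5_000, 10_000, 30_000};

    private final LongAdder total = new LongAdder();
    private final LongAdder success = new LongAdder();
    private final LongAdder failure = new LongAdder();
    private final LongAdder[] buckets = new LongAdder[BOUNDS_MS.length + 1];

    MetricEntrySketch() {
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new LongAdder();
        }
    }

    /** Index of the first bucket whose bound is >= latencyMs, else the overflow bucket. */
    static int bucketIndex(long latencyMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (latencyMs <= BOUNDS_MS[i]) {
                return i;
            }
        }
        return BOUNDS_MS.length;
    }

    /** Records one invocation: bumps the counters and the matching latency bucket. */
    void record(long latencyMs, boolean ok) {
        total.increment();
        (ok ? success : failure).increment();
        buckets[bucketIndex(latencyMs)].increment();
    }

    long total() { return total.sum(); }

    long failures() { return failure.sum(); }

    double successRate() {
        long n = total.sum();
        return n == 0 ? 0.0 : (double) success.sum() / n;
    }

    long bucketCount(int index) { return buckets[index].sum(); }
}
```

LongAdder spreads contended increments across internal cells, which is why it suits counters that many dispatch threads update concurrently and that are read only occasionally.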
Pattern 3 — @Fallback for automatic recovery
@Fallback works differently from the other two decorators: it doesn't wrap the action — it marks a separate method on the same Role as the recovery handler.
@ActionSpec(type = ActionType.LOCAL, description = "Charge a customer's account.")
@Traced(operationName = "payment.charge", includeArgs = true)
@Metered(name = "payment.charge", histogram = true)
public String charge(
    @Param(name = "requestId", description = "Idempotency token") String requestId,
    @Param(name = "amount", description = "Amount in minor units (cents)") int amount
) {
    // 50% chance of failure (simulated flaky external API)
    if (ThreadLocalRandom.current().nextBoolean()) {
        throw new RuntimeException("Payment API unavailable");
    }
    return "CHARGED:" + requestId + ":" + amount;
}

@Fallback(forAction = "charge", maxRetries = 2)
public String chargeFallback(String requestId, int amount, Throwable failure) {
    return "QUEUED:" + requestId + ":" + amount + " (reason: " + failure.getMessage() + ")";
}

When charge throws, the framework:
- Catches the exception at the dispatcher's edge (it does not propagate to the caller when a fallback recovers).
- Retries by re-invoking charge up to maxRetries = 2 additional times.
- If all retries also fail, invokes chargeFallback exactly once with (requestId, amount, lastFailure).
- Returns the fallback's value as if charge had returned it.
So a single dispatch gets 3 chances at the primary (1 initial attempt + 2 retries) before falling back. With a 50% failure rate on each attempt, about 12.5% of dispatches end up in the fallback (0.5³ = 0.125).
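The retry-then-fallback sequence and the 0.5³ arithmetic can be sketched as follows. This is an illustrative model, not the framework's FallbackResolver: RetryThenFallbackSketch, execute, and fallbackProbability are hypothetical names, retries are immediate (matching Phase 1), and the probability formula assumes attempts fail independently.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical model of "1 + maxRetries attempts, then one fallback call". */
final class RetryThenFallbackSketch {

    /** Tries primary up to 1 + maxRetries times, then calls fallback once with the last failure. */
    static <T> T execute(Supplier<T> primary, int maxRetries, Function<Throwable, T> fallback) {
        int attempts = 1 + Math.max(0, maxRetries);
        RuntimeException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return primary.get(); // first success wins, no fallback
            } catch (RuntimeException e) {
                last = e; // remember the most recent failure and retry immediately
            }
        }
        return fallback.apply(last);
    }

    /** Chance that every attempt fails, assuming independent attempts. */
    static double fallbackProbability(double failureRate, int maxRetries) {
        return Math.pow(failureRate, 1 + Math.max(0, maxRetries));
    }
}
```

With failureRate = 0.5 and maxRetries = 2, fallbackProbability returns 0.125, the figure quoted above; raising maxRetries to 3 would halve it again to 0.0625.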
Fallback method conventions
- Same Role class as the action it covers (the resolver only scans the role's own methods).
- Method name is irrelevant; @Fallback.forAction is the binding key.
- Argument list: original action's @Params in declaration order, followed by a trailing Throwable or Exception parameter for the captured failure.
- Multiple fallbacks: a Role can declare any number of @Fallback(forAction = ...) methods for different actions. Action-named fallbacks beat the global @Fallback (no forAction) catch-all.
- Method-name conflicts: declaring multiple @Fallback(forAction = "X") methods on the same role logs a warning and keeps the first one for stability (Phase 2 will fail-fast at action discovery).
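The binding rules above (annotation value as the key, action-named fallback before the catch-all, only the role's own methods scanned) can be sketched with plain reflection. Everything here is hypothetical: RecoveryFor is a stand-in annotation defined for this example rather than the framework's @Fallback, and resolve is not the FallbackResolver API.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Optional;

/** Hypothetical sketch of fallback lookup by annotation value. */
final class FallbackLookupSketch {

    /** Stand-in for the framework's @Fallback; an empty forAction means catch-all. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.METHOD)
    @interface RecoveryFor {
        String forAction() default "";
    }

    /** Finds the recovery method for actionName; an action-named match beats the catch-all. */
    static Optional<Method> resolve(Class<?> roleClass, String actionName) {
        Method catchAll = null;
        // getDeclaredMethods returns only the class's own methods, not inherited ones.
        for (Method m : roleClass.getDeclaredMethods()) {
            RecoveryFor marker = m.getAnnotation(RecoveryFor.class);
            if (marker == null) {
                continue;
            }
            if (marker.forAction().equals(actionName)) {
                return Optional.of(m); // the method's own name plays no part in the match
            }
            if (marker.forAction().isEmpty() && catchAll == null) {
                catchAll = m;
            }
        }
        return Optional.ofNullable(catchAll);
    }

    /** Demo role: one action-named fallback and one catch-all. */
    static final class DemoRole {
        @RecoveryFor(forAction = "charge")
        String chargeFallback(String requestId, int amount, Throwable failure) {
            return "QUEUED:" + requestId + ":" + amount;
        }

        @RecoveryFor
        String anyFallback(Throwable failure) {
            return "GENERIC";
        }
    }
}
```

One caveat for anything built this way: the JVM does not specify the order of getDeclaredMethods, so a "keep the first duplicate" policy needs an explicit ordering to be deterministic. The sketch does not attempt that.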
Wiring the role into an Agent
Agent agent = AgentBuilder.create()
        .role(new PaymentRole())
        .llm(new OpenAIClient("gpt-4o-mini"))
        .build();

Object result = agent.executeAction(
        "charge",
        Map.of("requestId", "req-001", "amount", 100));
// result is either "CHARGED:req-001:100" (primary or retry success)
// or "QUEUED:req-001:100 (reason: Payment API unavailable)" (fallback recovery).
// The caller never sees the underlying RuntimeException.

No imperative wiring needed: the decorator stack is auto-applied based on the action's annotations.
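Since the exception is absorbed, a caller that cares whether the charge went through or was queued has to inspect the returned value. A minimal sketch, assuming the "CHARGED:" and "QUEUED:" prefixes from the PaymentRole example; ChargeOutcomeSketch and classify are hypothetical helpers, not part of the framework.

```java
/** Hypothetical helper that tells primary success apart from fallback recovery. */
final class ChargeOutcomeSketch {

    enum Outcome { CHARGED, QUEUED, UNKNOWN }

    /** Classifies the value returned by executeAction("charge", ...) by its prefix. */
    static Outcome classify(Object result) {
        String text = String.valueOf(result); // "null" for a null result, so no NPE
        if (text.startsWith("CHARGED:")) {
            return Outcome.CHARGED;
        }
        if (text.startsWith("QUEUED:")) {
            return Outcome.QUEUED;
        }
        return Outcome.UNKNOWN;
    }
}
```

String prefixes are the convention this tutorial's example happens to use; a role that returns a structured result type would make the distinction without parsing.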
What's enforced today
| Annotation field | Phase 1 | Phase 2+ |
|---|---|---|
| @Traced.operationName | ✓ | — |
| @Traced.includeArgs / includeResult | ✓ | — |
| @Metered.name | ✓ | — |
| @Metered.tags (static k=v) | Captured but Phase-1 sink doesn't use them | Routed via Sink SPI |
| @Metered.histogram | ✓ (fixed buckets) | Configurable bucket layout |
| @Fallback.forAction | ✓ | — |
| @Fallback.maxRetries | ✓ (immediate retry) | Exponential backoff + jitter |
| @RateLimited (token bucket) | Phase 2 (needs distributed-state SPI) | |
| @Resilience(circuitBreaker = ...) (sliding window) | Phase 2 (already in ResilienceExecutor at agent level) | |
| @Idempotent (keyed cache) | Phase 2 (coordinates with #91) | |
@Metered registry is in-memory (singleton); @Traced emits via SLF4J only. Phase 2 ships:
- MetricsSink SPI → Micrometer / Prometheus / OpenTelemetry
- OpenTelemetry SPI handoff for @Traced
Where it sits in the pipeline
Inside ActionExecutor.executeInternal:
1. Approval check
2. Security access control + parameter encryption
3. @BeforeAction transformation
4. ActionValidator.validateParameters
5. ActionContract.validateInvariants + preconditions
6. Invariant precondition checks
7. @InputGuardrail enforcement (#171 Phase 1)
8. @Retrieval enforcement (#172 Phase 1, optional SPI)
9. ► open @Traced span + start @Metered timer ← #173 Phase 1
10. Primary action execution (LOCAL / WEB_SERVICE / LLM / MCP_TOOL)
└─ on failure: @Fallback.tryRecover (with retries) ← #173 Phase 1
11. @OutputGuardrail enforcement (#171 Phase 1)
12. Postcondition + post-execution invariants
13. record outcome on @Metered + @Traced
14. close @Metered + @Traced in finally

Reference
- @Traced annotation: tnsai-core/src/main/java/com/tnsai/annotations/Traced.java
- @Metered annotation: tnsai-core/src/main/java/com/tnsai/annotations/Metered.java
- @Fallback annotation: tnsai-core/src/main/java/com/tnsai/annotations/Fallback.java
- TracedResolver / MeteredResolver / FallbackResolver: tnsai-core/src/main/java/com/tnsai/resilience/
- ResilienceMetrics (in-memory registry): tnsai-core/src/main/java/com/tnsai/resilience/ResilienceMetrics.java
- PaymentRole cookbook: tnsai-integration/src/main/java/com/tnsai/integration/scop/examples/PaymentRole.java
- PaymentRoleResilienceIntegrationTest: tnsai-integration/src/test/java/com/tnsai/integration/scop/examples/