Tutorial: Declarative resilience with @Traced, @Metered, and @Fallback

Wrap every action dispatch with structured tracing, latency + counter metrics, and automatic recovery on failure — through annotations alone. The framework's ActionExecutor applies the decorators in canonical order; your action body stays focused on the business logic.

Prerequisites

  • Installation
  • A Role with at least one @ActionSpec method that calls something flaky (external API, slow disk, untrusted input)

The role

Full source lives at tnsai-integration/src/main/java/com/tnsai/integration/scop/examples/PaymentRole.java and is exercised by PaymentRoleResilienceIntegrationTest. The shape:

@RoleSpec(
    name = "PaymentAgent",
    description = "Charges accounts via a flaky external API; falls back to queueing on failure.",
    responsibilities = {
        @Responsibility(name = "Payments", actions = {"charge"})
    }
)
public class PaymentRole extends Role {
    // ... primary action + fallback method below
}

The decorator stack

For each annotated action, ActionExecutor.executeInternal wraps the dispatch like this:

open Traced span      ← MDC trace id + structured span.start log

start Metered timer   ← record latency + counter on close

PRIMARY EXECUTION     ← your action body (or LOCAL/WEB_SERVICE/LLM/MCP executor)

on success → record success on metered + traced, return result
on failure → invoke @Fallback (with retries), record outcome,
             return fallback's value or rethrow

finally: close metered (record entry) + traced (emit span.end)

Every annotation is independent — declare any subset.
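
The bracketing order in the diagram can be sketched in plain Java. This is only an illustration under stated assumptions, not the framework's ActionExecutor: the class name DecoratorOrderSketch, the dispatch signature, and the event strings (including "outcome=fallback") are hypothetical, and a list of strings stands in for the real tracing and metrics calls.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative only: mirrors the ordering in the diagram, not the real ActionExecutor. */
final class DecoratorOrderSketch {

    /** Runs primary inside the traced + metered brackets and records each step in events. */
    static <T> T dispatch(Supplier<T> primary,
                          Function<Throwable, T> fallback, // null when no fallback is declared
                          List<String> events) {
        events.add("span.start");   // open Traced span
        events.add("timer.start");  // start Metered timer
        try {
            T result;
            String outcome = "success";
            try {
                result = primary.get();            // PRIMARY EXECUTION
            } catch (RuntimeException failure) {
                if (fallback == null) {
                    throw failure;                 // nothing to recover with: rethrow
                }
                result = fallback.apply(failure);  // recovery path
                outcome = "fallback";              // hypothetical label for a recovered dispatch
            }
            events.add("outcome=" + outcome);
            return result;
        } catch (RuntimeException e) {
            events.add("outcome=failure");         // primary failed and no recovery succeeded
            throw e;
        } finally {
            events.add("timer.stop");              // close metered (record entry)
            events.add("span.end");                // close traced (emit span.end)
        }
    }

    public static void main(String[] args) {
        List<String> events = new ArrayList<>();
        String value = DecoratorOrderSketch.<String>dispatch(
                () -> { throw new RuntimeException("down"); },
                t -> "QUEUED",
                events);
        System.out.println(value + " " + events);
    }
}
```

The finally block is what guarantees the closing steps run on every path, including the rethrow.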

Pattern 1 — @Traced for span-style observability

@ActionSpec(type = ActionType.LOCAL, description = "Charge a customer's account.")
@Traced(operationName = "payment.charge", includeArgs = true)
public String charge(
    @Param(name = "requestId", description = "Idempotency token") String requestId,
    @Param(name = "amount",    description = "Amount in minor units (cents)") int amount
) {
    // ...
}

What happens at runtime:

  • A trace id (UUID) is pushed onto the SLF4J MDC under the key traceId. Every log line emitted from this dispatch (and any nested logger calls) carries the same id, making post-hoc grep correlation trivial.
  • Two structured log lines bracket the dispatch:
    • span.start op=payment.charge traceId=... args={requestId=req-1, amount=100}
    • span.end op=payment.charge traceId=... duration_ms=42 status=success
  • On failure: span.end ... status=failure error=RuntimeException
  • Nested traced calls within the same dispatch reuse the outer span's trace id, so the trace id stays the same for the whole dispatch.

includeArgs and includeResult default to false. Opt in only when you need the data in the log, because arguments and results can contain PII.
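
The trace-id behaviour described above can be approximated as follows. This is a rough sketch, not the framework's TracedResolver: a ThreadLocal stands in for the SLF4J MDC entry traceId, an in-memory list stands in for the logger, and the names TracedSketch, traced, and currentTraceId are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.UUID;
import java.util.function.Supplier;

/** Illustrative only: a ThreadLocal stands in for the SLF4J MDC "traceId" entry. */
final class TracedSketch {
    private static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();
    static final List<String> LOG = Collections.synchronizedList(new ArrayList<>()); // logger stand-in

    static String currentTraceId() {
        return TRACE_ID.get();
    }

    static <T> T traced(String op, Supplier<T> body) {
        boolean outermost = TRACE_ID.get() == null;
        if (outermost) {
            TRACE_ID.set(UUID.randomUUID().toString()); // only the outer span creates an id
        }
        String traceId = TRACE_ID.get();
        long start = System.nanoTime();
        String status = "success";
        LOG.add("span.start op=" + op + " traceId=" + traceId);
        try {
            return body.get();
        } catch (RuntimeException e) {
            status = "failure error=" + e.getClass().getSimpleName();
            throw e;
        } finally {
            long ms = (System.nanoTime() - start) / 1_000_000;
            LOG.add("span.end op=" + op + " traceId=" + traceId
                    + " duration_ms=" + ms + " status=" + status);
            if (outermost) {
                TRACE_ID.remove(); // nested spans leave the outer id in place
            }
        }
    }

    public static void main(String[] args) {
        TracedSketch.<String>traced("payment.charge",
                () -> TracedSketch.<String>traced("payment.audit", () -> "ok"));
        LOG.forEach(System.out::println); // four lines sharing one traceId
    }
}
```

Only the outermost call sets and clears the id, which is one simple way to get the nested-reuse behaviour the list describes.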

Pattern 2 — @Metered for counter + latency histogram

@Metered(name = "payment.charge", histogram = true)
public String charge(...) { ... }

The framework records each invocation in an in-memory ResilienceMetrics registry:

  • Counters: total, success, failure (LongAdder under the hood for hot-path performance).
  • Histogram (when histogram = true): fixed latency buckets at 1ms, 5ms, 10ms, 25ms, 50ms, 100ms, 250ms, 500ms, 1s, 2.5s, 5s, 10s, 30s, +∞.
  • Default name: ClassName.methodName when name is empty.
  • Class-level inheritance: @Metered on the Role class applies to every action that doesn't declare its own (method-level wins entirely — no merging).
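
To make the counter and bucket bookkeeping concrete, here is a minimal sketch of a single metric entry. It is not the ResilienceMetrics implementation: the class name MeteredSketch and its methods are hypothetical, and it assumes inclusive upper bounds with non-cumulative buckets, which the source does not specify.

```java
import java.util.concurrent.atomic.LongAdder;

/** Illustrative only: one metric entry with counters and fixed latency buckets. */
final class MeteredSketch {
    // Upper bounds in milliseconds, copied from the bucket list above; the extra slot is +infinity.
    static final long[] BOUNDS_MS =
            {1, 5, 10, 25, 50, 100, 250, 500, 1_000, 2_500, 5_000, 10_000, 30_000};

    final LongAdder total = new LongAdder();
    final LongAdder success = new LongAdder();
    final LongAdder failure = new LongAdder();
    final LongAdder[] buckets = new LongAdder[BOUNDS_MS.length + 1];

    MeteredSketch() {
        for (int i = 0; i < buckets.length; i++) {
            buckets[i] = new LongAdder();
        }
    }

    /** Records one invocation: bumps the counters and the matching latency bucket. */
    void record(long latencyMs, boolean ok) {
        total.increment();
        (ok ? success : failure).increment();
        buckets[bucketIndex(latencyMs)].increment();
    }

    /** Index of the first bucket whose upper bound covers the latency (assumed inclusive). */
    static int bucketIndex(long latencyMs) {
        for (int i = 0; i < BOUNDS_MS.length; i++) {
            if (latencyMs <= BOUNDS_MS[i]) {
                return i;
            }
        }
        return BOUNDS_MS.length; // overflow bucket (+infinity)
    }

    double successRate() {
        long n = total.sum();
        return n == 0 ? 0.0 : (double) success.sum() / n;
    }

    public static void main(String[] args) {
        MeteredSketch m = new MeteredSketch();
        m.record(42, true);
        m.record(60_000, false);
        System.out.println(m.total.sum() + " calls, success rate " + m.successRate());
    }
}
```

LongAdder spreads contended increments across cells, which is why it suits counters updated on every dispatch.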

Tests can snapshot the registry to assert metrics fired:

ResilienceMetrics.MetricSnapshot snap = ResilienceMetrics.getInstance()
    .snapshot("payment.charge").orElseThrow();
assertEquals(50, snap.total());
assertTrue(snap.successRate() > 0.4);

Phase 2 will add a MetricsSink SPI for routing the same emissions to Micrometer / Prometheus / OpenTelemetry. Phase 1 stays in-memory — same boundary as the other Phase-1 batches in this sprint.

Pattern 3 — @Fallback for automatic recovery

@Fallback works differently from the other two decorators: it doesn't wrap the action — it marks a separate method on the same Role as the recovery handler.

@ActionSpec(type = ActionType.LOCAL, description = "Charge a customer's account.")
@Traced(operationName = "payment.charge", includeArgs = true)
@Metered(name = "payment.charge", histogram = true)
public String charge(
    @Param(name = "requestId", description = "Idempotency token") String requestId,
    @Param(name = "amount",    description = "Amount in minor units (cents)") int amount
) {
    // 50% chance of failure (simulated flaky external API)
    if (ThreadLocalRandom.current().nextBoolean()) {
        throw new RuntimeException("Payment API unavailable");
    }
    return "CHARGED:" + requestId + ":" + amount;
}

@Fallback(forAction = "charge", maxRetries = 2)
public String chargeFallback(String requestId, int amount, Throwable failure) {
    return "QUEUED:" + requestId + ":" + amount + " (reason: " + failure.getMessage() + ")";
}

When charge throws, the framework:

  1. Catches the exception at the dispatcher's edge (no propagation to caller).
  2. Retries by re-invoking charge up to maxRetries = 2 additional times.
  3. If all retries also fail, invokes chargeFallback exactly once with (requestId, amount, lastFailure).
  4. Returns the fallback's value as if charge had returned it.

So a single dispatch gets 3 chances at the primary (1 initial attempt + 2 retries) before falling back. With a 50% failure rate and independent attempts, about 12.5% of dispatches end up in the fallback (0.5³ = 0.125).
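
The retry-then-fallback sequence and the arithmetic behind the 0.125 figure can be sketched like this. It is a simplified model, not the framework's FallbackResolver: the names RetryThenFallbackSketch, run, and fallbackProbability are hypothetical, retries are immediate as in Phase 1, and maxRetries is assumed to be zero or greater.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Illustrative only: 1 + maxRetries attempts at the primary, then one fallback call. */
final class RetryThenFallbackSketch {

    static <T> T run(Supplier<T> primary, int maxRetries, Function<Throwable, T> fallback) {
        RuntimeException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                return primary.get();   // first success wins
            } catch (RuntimeException e) {
                last = e;               // immediate retry, no backoff
            }
        }
        return fallback.apply(last);    // invoked exactly once, with the last failure
    }

    /** Chance that every primary attempt fails, assuming independent failures. */
    static double fallbackProbability(double failureRate, int maxRetries) {
        return Math.pow(failureRate, 1 + maxRetries);
    }

    public static void main(String[] args) {
        String result = RetryThenFallbackSketch.<String>run(
                () -> { throw new RuntimeException("Payment API unavailable"); },
                2,
                failure -> "QUEUED (reason: " + failure.getMessage() + ")");
        System.out.println(result);
        System.out.println(fallbackProbability(0.5, 2));
    }
}
```

The independence assumption matters: if the external API is down for an extended period, consecutive attempts fail together and the fallback rate will be higher than this estimate.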

Fallback method conventions

  • Same Role class as the action it covers (the resolver only scans the role's own methods).
  • Method name is irrelevant — @Fallback.forAction is the binding key.
  • Argument list: original action's @Params in declaration order, followed by a trailing Throwable or Exception parameter for the captured failure.
  • Multiple fallbacks: a Role can declare any number of @Fallback(forAction = ...) for different actions. Action-named fallbacks beat the global @Fallback (no forAction) catch-all.
  • Duplicate bindings: if more than one method declares @Fallback(forAction = "X") on the same role, the framework logs a warning and keeps the first one for stability (Phase 2 will fail fast at action discovery).
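
The lookup rules above can be sketched with plain reflection. Everything here is an assumption made for illustration: FallbackLike is a hypothetical stand-in for the framework's @Fallback that models only forAction, the classes FallbackLookupSketch and DemoRole are invented, the catch-all signature is a guess, and methods are sorted by name only to make "first match" deterministic, because getDeclaredMethods returns methods in no particular order.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Method;
import java.util.Arrays;
import java.util.Comparator;
import java.util.Optional;

/** Hypothetical stand-in for the framework's @Fallback; only forAction is modelled. */
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.METHOD)
@interface FallbackLike {
    String forAction() default ""; // empty string = catch-all
}

final class FallbackLookupSketch {

    /** Finds the fallback for an action: an action-named match beats the catch-all. */
    static Optional<Method> resolve(Class<?> role, String action) {
        Method[] methods = role.getDeclaredMethods();                 // the role's own methods only
        Arrays.sort(methods, Comparator.comparing(Method::getName)); // deterministic scan order
        Method catchAll = null;
        for (Method m : methods) {
            FallbackLike f = m.getAnnotation(FallbackLike.class);
            if (f == null || !endsWithThrowable(m)) {
                continue;
            }
            if (f.forAction().equals(action)) {
                return Optional.of(m);                                // first action-named match wins
            }
            if (f.forAction().isEmpty() && catchAll == null) {
                catchAll = m;
            }
        }
        return Optional.ofNullable(catchAll);
    }

    /** The trailing parameter must be Throwable or a subtype such as Exception. */
    static boolean endsWithThrowable(Method m) {
        Class<?>[] p = m.getParameterTypes();
        return p.length > 0 && Throwable.class.isAssignableFrom(p[p.length - 1]);
    }
}

/** Hypothetical role used only to exercise the lookup. */
class DemoRole {
    @FallbackLike(forAction = "charge")
    public String chargeFallback(String requestId, int amount, Throwable failure) {
        return "QUEUED:" + requestId + ":" + amount;
    }

    @FallbackLike
    public String anyFallback(Throwable failure) {
        return "FAILED:" + failure.getMessage();
    }
}
```

Calling FallbackLookupSketch.resolve(DemoRole.class, "charge") selects chargeFallback, while an action without its own binding falls through to anyFallback.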

Wiring the role into an Agent

Agent agent = AgentBuilder.create()
    .role(new PaymentRole())
    .llm(new OpenAIClient("gpt-4o-mini"))
    .build();

Object result = agent.executeAction(
    "charge",
    Map.of("requestId", "req-001", "amount", 100));

// result is either "CHARGED:req-001:100" (primary or retry success)
// or "QUEUED:req-001:100 (reason: Payment API unavailable)" (fallback recovery)
// — caller never sees the underlying RuntimeException.

No imperative wiring needed — the decorator stack is auto-applied based on the action's annotations.

What's enforced today

Annotation field | Phase 1 | Phase 2+
@Traced.operationName | ✓ | -
@Traced.includeArgs / includeResult | ✓ | -
@Metered.name | ✓ | -
@Metered.tags (static k=v) | Captured but Phase-1 sink doesn't use them | Routed via Sink SPI
@Metered.histogram | ✓ (fixed buckets) | Configurable bucket layout
@Fallback.forAction | ✓ | -
@Fallback.maxRetries | ✓ (immediate retry) | Exponential backoff + jitter
@RateLimited (token bucket) | - | Phase 2 (needs distributed-state SPI)
@Resilience(circuitBreaker = ...) (sliding window) | - | Phase 2 (already in ResilienceExecutor at agent level)
@Idempotent (keyed cache) | - | Phase 2 (coordinates with #91)

@Metered registry is in-memory (singleton); @Traced emits via SLF4J only. Phase 2 ships:

  • MetricsSink SPI → Micrometer / Prometheus / OpenTelemetry
  • OpenTelemetry SPI handoff for @Traced

Where it sits in the pipeline

Inside ActionExecutor.executeInternal:

1. Approval check
2. Security access control + parameter encryption
3. @BeforeAction transformation
4. ActionValidator.validateParameters
5. ActionContract.validateInvariants + preconditions
6. Invariant precondition checks
7. @InputGuardrail enforcement (#171 Phase 1)
8. @Retrieval enforcement (#172 Phase 1, optional SPI)
9. ► open @Traced span + start @Metered timer    ← #173 Phase 1
10. Primary action execution (LOCAL / WEB_SERVICE / LLM / MCP_TOOL)
    └─ on failure: @Fallback.tryRecover (with retries)         ← #173 Phase 1
11. @OutputGuardrail enforcement (#171 Phase 1)
12. Postcondition + post-execution invariants
13. record outcome on @Metered + @Traced
14. close @Metered + @Traced in finally

Reference

  • @Traced annotation — tnsai-core/src/main/java/com/tnsai/annotations/Traced.java
  • @Metered annotation — tnsai-core/src/main/java/com/tnsai/annotations/Metered.java
  • @Fallback annotation — tnsai-core/src/main/java/com/tnsai/annotations/Fallback.java
  • TracedResolver / MeteredResolver / FallbackResolver — tnsai-core/src/main/java/com/tnsai/resilience/
  • ResilienceMetrics (in-memory registry) — tnsai-core/src/main/java/com/tnsai/resilience/ResilienceMetrics.java
  • PaymentRole cookbook — tnsai-integration/src/main/java/com/tnsai/integration/scop/examples/PaymentRole.java
  • PaymentRoleResilienceIntegrationTest — tnsai-integration/src/test/java/com/tnsai/integration/scop/examples/
