Observability
Quick Start
TnsAI provides structured logging out of the box. Create a logger for your agent and attach key-value metadata to every log entry for easy filtering and searching in your logging backend.
// Structured logging
AgentLogger logger = AgentLogger.forAgent("agent-001");
logger.info("Processing request")
.with("userId", userId)
.with("action", "search")
.log();
OpenTelemetry Tracing
Distributed tracing with automatic span management:
TnsAITelemetry telemetry = TnsAITelemetry.builder()
.serviceName("my-agent-service")
.serviceVersion("1.0.0")
.otlpEndpoint("http://localhost:4317")
.build();
OpenTelemetryTracer tracer = new OpenTelemetryTracer(telemetry.getTracer());
try (SpanScope scope = tracer.startSpan("process-request")) {
scope.setAttribute("user.id", userId);
scope.setAttribute("request.type", "search");
// ... work happens here ...
scope.setSuccess();
} // Span automatically closed
// Convenience wrapper
String result = tracer.trace("fetch-data", () -> fetchData(query));
Prometheus Metrics
TnsAI provides pre-defined counters and histograms that track agent actions, LLM calls, token usage, and latency. These integrate with OpenTelemetry and can be exported to any compatible metrics backend.
OpenTelemetryMetrics metrics = new OpenTelemetryMetrics(meter);
// Record agent actions
metrics.recordActionSuccess("search", 150); // type, duration ms
metrics.recordActionFailure("write", 50, "timeout");
// Record LLM calls
metrics.recordLlmCall("openai", "gpt-4o", 1000, 500, 2300); // provider, model, prompt tokens, completion tokens, latency ms
metrics.recordLlmCallFailure("anthropic", "claude-3", 1500, "rate_limit");
// Gauges
metrics.setActiveAgents(5);
Available metrics:
| Metric | Type | Description |
|---|---|---|
| tnsai.agent.actions.total | Counter | Total actions executed |
| tnsai.agent.actions.success | Counter | Successful actions |
| tnsai.agent.actions.failed | Counter | Failed actions |
| tnsai.agent.action.duration | Histogram | Action execution time (ms) |
| tnsai.llm.calls.total | Counter | Total LLM calls |
| tnsai.llm.tokens.prompt | Counter | Input tokens consumed |
| tnsai.llm.tokens.completion | Counter | Output tokens generated |
| tnsai.llm.latency | Histogram | LLM response time (ms) |
| tnsai.agent.active | Gauge | Currently active agents |
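As a rough mental model of how these instruments differ, counters only ever accumulate, while histograms keep duration samples for percentile math. The sketch below is hypothetical plain Java for illustration only; the real implementation delegates to OpenTelemetry instruments.

```java
import java.util.*;

// Hypothetical sketch of what the counters and histograms above track;
// not the library's implementation.
class MiniMetrics {
    final Map<String, Long> counters = new HashMap<>();
    final Map<String, List<Long>> histograms = new HashMap<>();

    void increment(String name) {
        counters.merge(name, 1L, Long::sum);
    }

    void record(String name, long valueMs) {
        histograms.computeIfAbsent(name, k -> new ArrayList<>()).add(valueMs);
    }

    // Mirrors recordActionSuccess(type, durationMs): bump the totals
    // and record the duration sample.
    void recordActionSuccess(String type, long durationMs) {
        increment("tnsai.agent.actions.total");
        increment("tnsai.agent.actions.success");
        record("tnsai.agent.action.duration", durationMs);
    }
}
```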
Structured Logging
The AgentLogger provides a fluent API for emitting structured log entries with arbitrary key-value metadata. It also includes convenience methods for common agent events like action starts, completions, and LLM calls.
AgentLogger logger = AgentLogger.forAgent("agent-001");
// Fluent API
logger.info("Tool executed")
.with("tool", "brave_search")
.with("duration", 350)
.log();
// Convenience methods
logger.logActionStart("searchPapers", params);
logger.logActionComplete("searchPapers", 1200, true);
logger.logLLMCall("gpt-4o", 500, 2100);
// Timed execution
String result = logger.timed("database-query", () -> db.query(sql));
Themed Logging
Themed loggers wrap SLF4J and add fun, themed prefixes to log messages based on the event type. Seven built-in themes, 24 event types, and custom theme support.
// Create with theme
ThemedLogger logger = ThemedLogger.create("my-agent", LogThemes.STAR_WARS);
// Log events — each gets a themed prefix
logger.actionStart("Searching for documents");
// Output: [LIGHTSABER ON] Searching for documents
logger.actionComplete("Found 5 documents");
// Output: [THE FORCE SUCCEEDS] Found 5 documents
logger.error("Connection failed");
// Output: [SITH LORD DETECTED] Connection failed
// Startup/shutdown banners
logger.logStartup();
// Output: A long time ago in a galaxy far, far away... TnsAI awakens.
Factory methods:
// With explicit theme
ThemedLogger logger = ThemedLogger.create("agent-001", LogThemes.LOTR);
// Default theme (no prefix)
ThemedLogger logger = ThemedLogger.create("agent-001");
// From class name
ThemedLogger logger = ThemedLogger.forClass(MyAgent.class, LogThemes.MATRIX);
Available themes:
| Theme | Name | Example prefix |
|---|---|---|
| LogThemes.DEFAULT | "default" | (no prefix) |
| LogThemes.STAR_WARS | "starwars" | [LIGHTSABER ON], [CONSULTING YODA] |
| LogThemes.LOTR | "lotr" | [QUEST BEGINS], [CONSULTING GANDALF] |
| LogThemes.MATRIX | "matrix" | [ENTERING THE MATRIX], [CALLING ORACLE] |
| LogThemes.PIRATE | "pirate" | [SETTING SAIL], [CONSULTING DAVY JONES] |
| LogThemes.TURKISH | "turkish" | [IS BASLIYOR], [AKIL DANIYOR] |
| LogThemes.EMOJI | "emoji" | Lightning, Robot, Trophy emojis |
24 event types (LogEvent enum):
| Group | Events |
|---|---|
| Agent lifecycle | AGENT_START, AGENT_STOP, AGENT_IDLE |
| Actions | ACTION_START, ACTION_COMPLETE, ACTION_ERROR |
| Planning | PLAN_START, PLAN_COMPLETE, GOAL_ACHIEVED, GOAL_FAILED |
| LLM | LLM_CALL, LLM_RESPONSE, LLM_ERROR, TOOL_CALL |
| Communication | MESSAGE_SENT, MESSAGE_RECEIVED |
| State | STATE_CHANGE, BELIEF_UPDATE |
| General | INFO, WARNING, ERROR, DEBUG, SUCCESS, FAILURE |
Custom themes and runtime switching:
// Implement LogTheme interface
LogTheme custom = new LogTheme() {
public String name() { return "custom"; }
public String format(LogEvent event, String message) {
return "[MY-PREFIX] " + message;
}
};
// Register for lookup by name
LogThemes.register(custom);
LogTheme theme = LogThemes.get("custom");
// Change theme at runtime
logger.setTheme(LogThemes.PIRATE);
// List and check themes
String[] names = LogThemes.listThemes();
boolean exists = LogThemes.hasTheme("starwars");
Prometheus Exporter
If you use Prometheus for monitoring, this exporter spins up a lightweight HTTP server that serves metrics in the Prometheus text format. It syncs with the Metrics singleton and exposes all predefined agent and LLM counters.
// Create and start
PrometheusMetricsExporter exporter = PrometheusMetricsExporter.builder()
.port(9090)
.build();
exporter.start();
// Metrics available at http://localhost:9090/metrics
// Record metrics
exporter.recordActionSuccess("search", 150); // type, duration ms
exporter.recordActionFailure("write", "timeout", 50);
exporter.recordLlmCall("openai", "gpt-4o", 1000, 500, 2300);
exporter.recordLlmCallFailure("anthropic", "claude-3", "rate_limit", 1500);
exporter.setActiveAgents(5);
// One-liner startup
PrometheusMetricsExporter exporter = PrometheusMetricsExporter.createAndStart();
// Shutdown
exporter.stop();
Environment variables:
| Variable | Default | Description |
|---|---|---|
| TNSAI_PROMETHEUS_PORT | 9090 | HTTP server port |
| TNSAI_PROMETHEUS_ENABLED | true | Enable/disable exporter |
// Configure from environment
PrometheusMetricsExporter exporter = PrometheusMetricsExporter.builder()
.fromEnvironment()
.build();
Predefined Prometheus metrics:
| Metric | Type | Labels |
|---|---|---|
| tnsai_agent_actions_total | Counter | status |
| tnsai_agent_actions_success_total | Counter | -- |
| tnsai_agent_actions_failed_total | Counter | error_type |
| tnsai_llm_calls_total | Counter | provider, model, status |
| tnsai_llm_calls_failed_total | Counter | provider, model, error_type |
| tnsai_llm_prompt_tokens_total | Counter | provider, model |
| tnsai_llm_completion_tokens_total | Counter | provider, model |
| tnsai_agent_active | Gauge | -- |
| tnsai_agent_action_duration_seconds | Histogram | action_type |
| tnsai_llm_latency_seconds | Histogram | provider, model |
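Scraping the /metrics endpoint returns the standard Prometheus text exposition format. A plausible excerpt (label and sample values are illustrative, not actual output):

```text
# HELP tnsai_llm_calls_total Total LLM calls
# TYPE tnsai_llm_calls_total counter
tnsai_llm_calls_total{provider="openai",model="gpt-4o",status="success"} 42.0
# HELP tnsai_agent_active Currently active agents
# TYPE tnsai_agent_active gauge
tnsai_agent_active 5.0
```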
Custom metrics:
Counter counter = exporter.getOrCreateCounter("my_counter", "My help text", "label1");
Gauge gauge = exporter.getOrCreateGauge("my_gauge", "My help text", "label1");
Histogram histogram = exporter.getOrCreateHistogram("my_histogram", "My help text", "label1");
LLM Instrumentation
This module provides OpenTelemetry instrumentation specifically for LLM calls. It follows the GenAI semantic conventions, which means your traces are compatible with observability tools like Jaeger, Zipkin, and Grafana Tempo without any custom mapping.
LlmInstrumentation llmInstr = new LlmInstrumentation(telemetry.getTracer());
// Trace a chat completion
ChatResponse response = llmInstr.traceChat("openai", "gpt-4", () -> {
return client.chat(message);
});
// With request parameters
ChatResponse response = llmInstr.traceChat("openai", "gpt-4", 4096, 0.7, () -> {
return client.chat(message);
});
// Trace streaming
Stream<ChatChunk> stream = llmInstr.traceStreamChat("anthropic", "claude-3", () -> {
return client.streamChat(message);
});
// Trace embeddings
List<float[]> embeddings = llmInstr.traceEmbeddings("openai", "text-embedding-3", 10, () -> {
return client.embed(inputs);
});
GenAI semantic convention attributes:
| Attribute | Description |
|---|---|
| gen_ai.system | LLM provider (openai, anthropic, etc.) |
| gen_ai.request.model | Model name |
| gen_ai.request.max_tokens | Max tokens requested |
| gen_ai.request.temperature | Temperature setting |
| gen_ai.request.top_p | Top-p setting |
| gen_ai.usage.prompt_tokens | Prompt tokens used |
| gen_ai.usage.completion_tokens | Completion tokens generated |
| gen_ai.usage.total_tokens | Total tokens |
| gen_ai.response.finish_reasons | Finish reason (stop, length, error) |
| gen_ai.operation.name | Operation type (chat, stream_chat, embeddings) |
TnsAI-specific attributes:
| Attribute | Description |
|---|---|
| tnsai.agent.id | Agent identifier |
| tnsai.agent.name | Agent name |
| tnsai.message.length | Input message length |
| tnsai.tool_use.enabled | Whether tool use is enabled |
| tnsai.tool_calls.count | Number of tool calls |
Recording token usage and tool events on a span:
Span span = Span.current();
llmInstr.recordTokenUsage(span, 150, 50);
llmInstr.recordResponse(span, 150, 50, "stop");
llmInstr.recordToolUse(span, 3);
llmInstr.recordToolCallEvent(span, "brave_search", true);
Span builder for full control:
Span span = llmInstr.spanBuilder("openai", "gpt-4")
.operation("chat")
.maxTokens(4096)
.temperature(0.7)
.topP(0.9)
.agent("agent-001", "ResearchAgent")
.messageLength(500)
.toolUseEnabled(true)
.start();
Agent Instrumentation
Agent instrumentation adds tracing and metrics to agent operations without modifying the Agent class itself. It wraps calls like chat, action execution, and tool calls with OpenTelemetry spans, giving you end-to-end visibility into what your agents are doing.
// Initialize
AgentInstrumentation instrumentation = new AgentInstrumentation(telemetry);
// Or with custom tracer/metrics
AgentInstrumentation instrumentation = new AgentInstrumentation(tracer, metrics);
// Register agents (increments active agent gauge)
instrumentation.registerAgent(agent);
Traced operations:
// Trace chat
String response = instrumentation.traceChat(agent, "Hello", () -> {
return agent.chat("Hello");
});
// Trace action execution
Object result = instrumentation.traceAction(agent, "searchPapers", params, () -> {
return agent.executeAction("searchPapers", params);
});
// Trace tool call
String output = instrumentation.traceToolCall(agent, "brave_search", args, () -> {
return tool.execute(args);
});
// Trace LLM call with GenAI conventions
ChatResponse resp = instrumentation.traceLlmCall(agent, "openai", "gpt-4", () -> {
return llmClient.chat(messages);
});
Span attributes per operation:
| Operation | Attributes |
|---|---|
| agent.chat | agent.id, agent.name, message.length, message.preview |
| agent.action | agent.id, agent.name, action.name, action.parameters.count |
| agent.tool_call | agent.id, tool.name, tool.arguments.count |
| agent.llm_call | agent.id, gen_ai.system, gen_ai.request.model, gen_ai.operation.name |
Recording metrics directly:
instrumentation.recordLlmTokens("openai", "gpt-4", 150, 50, 2300);
instrumentation.recordLlmFailure("openai", "gpt-4", 1500, "rate_limit");
instrumentation.unregisterAgent(agent); // decrements active agent gauge
Agent-Specific Tracing
While the above instrumentation gives you OpenTelemetry-level visibility, agent-specific tracing goes deeper. It captures the full execution trace of a single agent run, including every LLM call, tool invocation, and guardrail check, along with quality scores you attach manually or automatically.
AgentTrace
An AgentTrace represents a complete execution recording for one agent run. You start a trace, add observations as the agent works, and then complete it. This is the foundation for evaluation and debugging.
AgentTrace trace = AgentTrace.start("agent-001", "research-task");
// Add observations during execution
trace.addObservation(Observation.generation("llm-call")
.input("Summarize the document")
.output("The document describes...")
.duration(Duration.ofMillis(2300))
.metadata(Map.of("model", "claude-sonnet-4", "tokens", 1500))
.build());
trace.addObservation(Observation.span("tool-call")
.input(Map.of("tool", "brave_search", "query", "quantum computing"))
.output("Results: ...")
.build());
trace.addObservation(Observation.event("guardrail-check")
.metadata(Map.of("guardrail", "pii-filter", "passed", true))
.build());
trace.complete();
Observation Types
Observations are the building blocks of a trace. Each one records a specific event or operation during the agent's execution, and they come in three types depending on what happened.
| Type | Constant | Description |
|---|---|---|
| SPAN | Observation.span() | Timed execution block (LLM call, tool execution) |
| GENERATION | Observation.generation() | LLM text generation with input/output |
| EVENT | Observation.event() | Point-in-time event (guardrail check, state change) |
Each observation supports: input, output, duration, metadata, parentId (for nesting), and status.
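To illustrate how parentId yields a nested structure, here is a minimal stand-in model (hypothetical types for illustration, not the library's classes): a flat list of observations can be regrouped into a tree by parent id.

```java
import java.util.*;

// Hypothetical stand-in: each observation references its parent by id,
// so a flat list can be regrouped into a tree (null parentId = root).
class Obs {
    final String id, parentId, name;
    Obs(String id, String parentId, String name) {
        this.id = id; this.parentId = parentId; this.name = name;
    }

    static Map<String, List<Obs>> childrenByParent(List<Obs> all) {
        Map<String, List<Obs>> tree = new HashMap<>();
        for (Obs o : all) {
            tree.computeIfAbsent(o.parentId, k -> new ArrayList<>()).add(o);
        }
        return tree;
    }
}
```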
Score
Scores let you attach quality measurements to a trace. You can use them for manual evaluation (did the agent answer correctly?) or automated evaluation (did the output contain PII?). Scores come in three types: numeric, boolean, and categorical.
// Numeric score
trace.addScore(Score.numeric("relevance", 0.92));
// Boolean score
trace.addScore(Score.bool("contains_pii", false));
// Categorical score
trace.addScore(Score.categorical("sentiment", "positive"));
| Factory Method | Score Type | Value |
|---|---|---|
| Score.numeric(name, value) | Numeric | 0.0-1.0 double |
| Score.bool(name, value) | Boolean | true/false |
| Score.categorical(name, value) | Categorical | String category |
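One way to picture the three score types is a single value object with per-type factories; the sketch below is hypothetical (including the 0.0-1.0 range check on numeric scores), not the library's Score class.

```java
// Hypothetical sketch: one value object covering the three score types,
// validating the documented 0.0-1.0 range for numeric scores.
class SimpleScore {
    final String name;
    final Object value;
    private SimpleScore(String name, Object value) { this.name = name; this.value = value; }

    static SimpleScore numeric(String name, double v) {
        if (v < 0.0 || v > 1.0)
            throw new IllegalArgumentException("numeric score must be in [0.0, 1.0]");
        return new SimpleScore(name, v);
    }
    static SimpleScore bool(String name, boolean v) { return new SimpleScore(name, v); }
    static SimpleScore categorical(String name, String v) { return new SimpleScore(name, v); }
}
```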
TraceContext (ThreadLocal Nesting)
TraceContext uses a ThreadLocal to track which trace is active on the current thread. This means you can start nested spans anywhere in your code and they automatically link to the parent span, without passing the trace object through every method call.
// Set current trace for this thread
TraceContext.set(trace);
// Get current trace (returns Optional)
AgentTrace current = TraceContext.current().orElseThrow();
// Nested spans auto-link to parent
try (var scope = TraceContext.startSpan("sub-operation")) {
// This span's parentId is automatically set to the enclosing span
doWork();
}
// Clear when done
TraceContext.clear();
Guardrail SPI
Guardrails are safety checks that evaluate agent behavior during or after execution. Implement the Guardrail interface to define your own checks, such as ensuring the agent does not leak sensitive data or produce harmful content.
public interface Guardrail {
GuardrailResult evaluate(AgentTrace trace);
String name();
}
PiiGuardrail
The PiiGuardrail is a built-in guardrail that scans agent outputs for personally identifiable information such as email addresses, phone numbers, and social security numbers. Use it to prevent your agents from accidentally leaking sensitive data.
PiiGuardrail piiGuard = new PiiGuardrail();
GuardrailResult result = piiGuard.evaluate(trace);
if (!result.passed()) {
System.out.println("PII detected: " + result.findings());
// Findings include: type (EMAIL, PHONE, SSN, etc.), location, severity
}
Wire guardrails into the agent:
Agent agent = AgentBuilder.create()
.model("claude-sonnet-4")
.guardrail(new PiiGuardrail())
.build();
Health Checks
Health checks let you monitor the operational status of your application's components (database, cache, LLM providers, etc.). TnsAI provides an aggregated health registry that runs checks concurrently, handles timeouts gracefully, and produces HTTP-ready summaries suitable for load balancer probes.
HealthStatus levels:
| Status | Severity | Description |
|---|---|---|
| UP | 0 | Component functioning normally |
| DEGRADED | 1 | Functioning with reduced capability |
| UNKNOWN | 2 | Status cannot be determined |
| DOWN | 3 | Not functioning |
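The severity column drives aggregation: when results are combined, the worse (higher-severity) status wins. A plain-Java sketch of that rule (hypothetical, mirroring what combine() is documented to do):

```java
// Hypothetical sketch of worst-status aggregation, mirroring the
// severity ordering in the table above.
enum Status {
    UP(0), DEGRADED(1), UNKNOWN(2), DOWN(3);

    final int severity;
    Status(int severity) { this.severity = severity; }

    // The combined status of two components is whichever is worse.
    Status combine(Status other) {
        return this.severity >= other.severity ? this : other;
    }
}
```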
HealthCheckResult factory methods:
HealthCheckResult.up() // healthy
HealthCheckResult.up("All good") // healthy with message
HealthCheckResult.down() // unhealthy
HealthCheckResult.down("Connection refused") // unhealthy with message
HealthCheckResult.down(exception) // unhealthy from exception
HealthCheckResult.degraded("High latency") // degraded
HealthCheckResult.unknown("Check timed out") // unknown
HealthCheckResult.of(status, message) // any status
// Add details (immutable — returns new instance)
HealthCheckResult result = HealthCheckResult.up()
.withDetail("connections", 10)
.withDetail("latency_ms", 50)
.withDuration(duration);
// Inspect
result.getStatus(); // HealthStatus enum
result.getMessage(); // String or null
result.getDetails(); // Map<String, Object>
result.isHealthy(); // true if UP or DEGRADED
result.getDuration(); // Duration or null
result.toMap(); // Map for JSON serialization
result.combine(other); // returns result with worse status
HealthIndicator interface:
// Implement for custom components
public class DatabaseHealth implements HealthIndicator {
public String getName() { return "database"; }
public HealthCheckResult check() {
return conn.isValid(5) ? HealthCheckResult.up() : HealthCheckResult.down();
}
}
// Lambda shorthand — from boolean
HealthIndicator simple = HealthIndicator.of("cache", () -> cache.isConnected());
// Lambda shorthand — from supplier
HealthIndicator detailed = HealthIndicator.of("cache", () -> {
return HealthCheckResult.up().withDetail("size", cache.size());
});
// Async and timeout support (default methods)
CompletableFuture<HealthCheckResult> future = indicator.checkAsync();
HealthCheckResult result = indicator.checkWithTimeout(Duration.ofSeconds(2));
HealthRegistry:
HealthRegistry registry = HealthRegistry.getInstance(); // singleton
// Register indicators
registry.register(new ResourceHealthIndicator());
registry.register("custom", HealthIndicator.of("custom", () -> true));
// Check health
Optional<HealthCheckResult> single = registry.check("database");
HealthCheckResult all = registry.checkAll(); // default 30s timeout
HealthCheckResult all = registry.checkAll(Duration.ofSeconds(10)); // custom timeout
CompletableFuture<HealthCheckResult> async = registry.checkAllAsync();
HealthCheckResult matched = registry.checkMatching("db-*"); // wildcard
// Quick status (2s timeout per indicator)
HealthStatus status = registry.getQuickStatus(); // UP, DOWN, DEGRADED, or UNKNOWN
// HTTP-ready summary
Map<String, Object> summary = registry.getHealthSummary();
// { "status": "UP", "timestamp": "...", "totalDurationMs": 45,
// "components": { "database": {...}, "cache": {...} } }
// Management
registry.getIndicatorNames(); // Set<String>
registry.getIndicatorCount(); // int
registry.unregister("old");
registry.clear();
registry.shutdown(); // call on app shutdown
ResourceHealthIndicator -- monitors heap memory, CPU load, threads, and deadlocks:
// Default thresholds (85% memory, 90% CPU)
registry.register(new ResourceHealthIndicator());
// Custom thresholds
registry.register(new ResourceHealthIndicator(0.9, 0.9));
// Direct queries
ResourceHealthIndicator res = new ResourceHealthIndicator();
double heapUsage = res.getHeapUsageRatio(); // 0.0-1.0
int threads = res.getThreadCount();
boolean deadlock = res.hasDeadlock();
Reports details: heap.used.mb, heap.max.mb, heap.usage.percent, nonHeap.used.mb, cpu.availableProcessors, cpu.systemLoadAverage, cpu.loadPerProcessor, threads.current, threads.peak, threads.deadlocked. Returns DEGRADED on high memory/CPU, DOWN on deadlock.
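The underlying data comes from the JDK's own management beans. A self-contained sketch of similar checks using plain java.lang.management (illustrative only, not the ResourceHealthIndicator source):

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Plain-JDK sketch of the kind of probes ResourceHealthIndicator runs.
class ResourceProbe {
    // Heap usage as a 0.0-1.0 ratio; getMax() may be -1 if undefined.
    static double heapUsageRatio() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long max = heap.getMax();
        return max > 0 ? (double) heap.getUsed() / max : 0.0;
    }

    static int threadCount() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    // findDeadlockedThreads() returns null when no deadlock exists.
    static boolean hasDeadlock() {
        long[] ids = ManagementFactory.getThreadMXBean().findDeadlockedThreads();
        return ids != null && ids.length > 0;
    }
}
```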
MCP Transports
The MCP module provides 5 transport implementations plus an OAuth decorator and an auto-detection utility for connecting to MCP servers over stdio, HTTP, SSE, and bidirectional streaming.
Security
The Quality module provides a layered security framework: annotation-driven access control and encryption (`SecurityEnforcer`), content moderation (`PatternBasedModerator`), prompt injection detection (`PromptInjectionDetector`), sandboxed execution, audit logging, and input validation (`ValidationService`).
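As a flavor of what pattern-based detection involves, here is a toy sketch (not the PromptInjectionDetector implementation; real detectors combine many signals, not a single regex):

```java
import java.util.regex.Pattern;

// Toy sketch of pattern-based prompt-injection screening;
// the phrases below are illustrative examples only.
class InjectionScreen {
    private static final Pattern SUSPICIOUS = Pattern.compile(
        "(?i)(ignore (all )?previous instructions|disregard the system prompt|you are now)");

    static boolean looksSuspicious(String input) {
        return SUSPICIOUS.matcher(input).find();
    }
}
```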