Error Handling

TnsAI.Core provides a structured exception hierarchy rooted in `TnsAIException`. Every exception carries an error code, retryability flag, and suggested retry parameters, enabling automated recovery decisions across the framework.

TnsAIException (Base Class)

All TnsAI exceptions extend TnsAIException, which itself extends RuntimeException, so every framework exception is unchecked.

public class TnsAIException extends RuntimeException {
    public boolean isRetryable();
    public String getErrorCode();
    public long getSuggestedRetryDelayMs();
    public int getMaxRetryAttempts();
}
  • isRetryable() -- true for transient errors (network, rate limits, server errors)
  • getErrorCode() -- auto-derived code in the format TNSAI-CLASSNAME (e.g., TNSAI-NETWORK, TNSAI-RATELIMIT)
  • getSuggestedRetryDelayMs() -- defaults to 1000ms for retryable errors and 0 for non-retryable ones; subclasses override with specific delays
  • getMaxRetryAttempts() -- defaults to 3 for retryable errors and 0 for non-retryable ones

The error code is derived from the class name using Locale.ROOT:

// TnsAIException -> "TNSAI-TNSAI"
// LLMException   -> "TNSAI-LLM"
// NetworkException -> "TNSAI-NETWORK"
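As a sketch, the derivation rule can be expressed in plain Java. The deriveCode helper below is a hypothetical reimplementation mirroring the documented behavior (strip the "Exception" suffix, uppercase with Locale.ROOT, prefix "TNSAI-"), not the framework's actual code:

```java
import java.util.Locale;

public class ErrorCodeDemo {
    // Hypothetical helper mirroring the documented rule: strip the
    // "Exception" suffix, uppercase using Locale.ROOT, prefix "TNSAI-".
    static String deriveCode(String simpleClassName) {
        String base = simpleClassName.endsWith("Exception")
                ? simpleClassName.substring(0, simpleClassName.length() - "Exception".length())
                : simpleClassName;
        return "TNSAI-" + base.toUpperCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(deriveCode("TnsAIException"));   // TNSAI-TNSAI
        System.out.println(deriveCode("LLMException"));     // TNSAI-LLM
        System.out.println(deriveCode("NetworkException")); // TNSAI-NETWORK
    }
}
```

Using Locale.ROOT keeps the uppercasing stable regardless of the JVM's default locale (for example, Turkish locales map "i" to a dotless "İ"-family character).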

LLMException

When something goes wrong during a call to an LLM provider (invalid API key, context too long, server outage), the framework throws an LLMException. Each exception carries an ErrorType that tells you exactly what happened and whether it makes sense to retry.

public class LLMException extends TnsAIException {
    public String getModel();
    public ErrorType getErrorType();
    public long getSuggestedRetryDelayMs();
}

ErrorType Enum

The ErrorType enum classifies the root cause of an LLM failure. Use it to decide how your application should react -- for example, retrying a transient SERVER_ERROR but surfacing a permanent AUTHENTICATION_FAILED to the user.

  • MODEL_NOT_FOUND (not retryable) -- model not found or unavailable
  • AUTHENTICATION_FAILED (not retryable) -- invalid API key or auth failure
  • CONTENT_FILTERED (not retryable) -- content policy violation
  • INVALID_REQUEST (not retryable) -- invalid request format
  • MODEL_OVERLOADED (retryable) -- model overloaded, try again
  • CONTEXT_TOO_LONG (not retryable) -- context length exceeded
  • SERVER_ERROR (retryable) -- generic server error
  • MALFORMED_TOOL_CALL (not retryable) -- bad JSON or missing fields in a tool call from the model
  • CAPABILITY_MISMATCH (not retryable) -- model does not support a required capability
  • UNKNOWN (retryable) -- unknown error

Retry delays are error-type specific:

  • MODEL_OVERLOADED -- 5000ms
  • SERVER_ERROR -- 2000ms
  • All others -- 1000ms
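The per-error-type delay selection above maps onto a simple switch. This is a sketch only; the enum here is a local stand-in for LLMException.ErrorType listing just the values needed:

```java
public class RetryDelayDemo {
    // Local stand-in for LLMException.ErrorType, listing only the
    // values needed for this sketch.
    enum ErrorType { MODEL_OVERLOADED, SERVER_ERROR, UNKNOWN }

    // Mirrors the documented error-type-specific retry delays.
    static long suggestedDelayMs(ErrorType type) {
        return switch (type) {
            case MODEL_OVERLOADED -> 5000L;
            case SERVER_ERROR -> 2000L;
            default -> 1000L;
        };
    }

    public static void main(String[] args) {
        System.out.println(suggestedDelayMs(ErrorType.MODEL_OVERLOADED)); // 5000
        System.out.println(suggestedDelayMs(ErrorType.SERVER_ERROR));     // 2000
        System.out.println(suggestedDelayMs(ErrorType.UNKNOWN));          // 1000
    }
}
```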

Factory Methods

Instead of calling the constructor directly, use these static factory methods to create LLMException instances with the correct ErrorType and retry parameters already set.

LLMException.modelNotFound("gpt-5")
LLMException.authenticationFailed("claude-sonnet-4", "Invalid API key")
LLMException.contentFiltered("gpt-4o", "Violates content policy")
LLMException.contextTooLong("gpt-4o-mini", 128000, 150000)
LLMException.modelOverloaded("claude-sonnet-4")
LLMException.serverError("gemini-2.5-flash", cause)
LLMException.malformedToolCall("gpt-4o", "search", "Invalid JSON in arguments", cause)
LLMException.malformedToolCall("gpt-4o", "search", "Missing required field 'query'")

RateLimitException

LLM providers enforce request quotas and return HTTP 429 ("Too Many Requests") when you exceed them. RateLimitException wraps these responses and is always retryable, carrying the provider-suggested wait time so your code can back off automatically.

public class RateLimitException extends TnsAIException {
    public Long getRetryAfterMs();
    public String getService();
    public Integer getRemainingQuota();
    public long getSuggestedRetryDelayMs();  // Uses retryAfterMs, defaults to 60000ms
    public int getMaxRetryAttempts();        // Returns 5
}

Factory Methods

These factory methods create RateLimitException instances from common rate-limit scenarios, automatically setting the correct retry delay and max retry count.

// Parse HTTP 429 Retry-After header (seconds -> ms conversion)
RateLimitException.fromHttp429("openai", "30")

// LLM quota exceeded (defaults to 300000ms / 5 minutes)
RateLimitException.llmQuotaExceeded("claude-sonnet-4")

// API endpoint rate limit with explicit delay
RateLimitException.apiRateLimit("/api/v1/chat", 10000L)
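The seconds-to-milliseconds conversion that fromHttp429 is described as performing can be sketched as follows. parseRetryAfterMs is a hypothetical helper, and the 60000ms fallback matches the documented default for RateLimitException:

```java
public class RetryAfterDemo {
    // Hypothetical helper: parse a numeric Retry-After header value
    // (in seconds) into milliseconds, falling back on bad input.
    static long parseRetryAfterMs(String headerValue, long fallbackMs) {
        try {
            return Long.parseLong(headerValue.trim()) * 1000L;
        } catch (NumberFormatException e) {
            return fallbackMs;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseRetryAfterMs("30", 60000L));      // 30000
        System.out.println(parseRetryAfterMs("garbage", 60000L)); // 60000
    }
}
```

Note that Retry-After may also carry an HTTP-date instead of a delay in seconds; this sketch handles only the numeric form.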

ActionExecutionException

When an agent action fails at runtime (a web service call times out, a parameter is missing, an MCP tool returns an error), the framework throws an ActionExecutionException. It includes the action name, type, and error category so you can programmatically decide whether to retry, fix parameters, or escalate.

public class ActionExecutionException extends TnsAIException {
    public String getActionName();
    public ActionType getActionType();
    public ErrorCategory getCategory();
    public String getDetailedMessage();
}

ErrorCategory Enum

Each ActionExecutionException is tagged with an ErrorCategory that groups the failure by root cause. This makes it straightforward to write a switch block that handles transient network errors differently from permanent validation errors.

  • NETWORK (retryable by default) -- connection timeout, DNS failure
  • PARAMETER (not retryable) -- missing parameter, wrong type
  • CLIENT_ERROR (not retryable) -- HTTP 4xx status codes
  • SERVER_ERROR (retryable by default) -- HTTP 5xx status codes
  • VALIDATION (not retryable) -- contract violations
  • LLM (retryable by default) -- model errors, quota exceeded
  • MCP (retryable by default) -- MCP tool errors
  • INVOCATION (not retryable) -- reflection errors, method not found
  • UNKNOWN (not retryable) -- unclassified errors

The getDetailedMessage() method produces a structured log line:

[WEB_SERVICE] Action 'fetchWeather' failed: Network error | Category: Network error | Retryable: true | Cause: SocketTimeoutException (Connect timed out)

Factory Methods

Use these static factories to create ActionExecutionException instances with the correct category, retryability flag, and detailed message already populated.

ActionExecutionException.fromNetworkError("fetchWeather", ActionType.WEB_SERVICE, ioException)
ActionExecutionException.fromParameterError("search", ActionType.LOCAL, "query is required", cause)
ActionExecutionException.fromApiError("createIssue", ActionType.WEB_SERVICE, 503, "Service Unavailable", cause)
ActionExecutionException.fromLLMError("summarize", ActionType.LLM, llmException)
ActionExecutionException.fromMCPError("mcp-tool-name", mcpException)
ActionExecutionException.fromInvocationError("calculate", ActionType.LOCAL, reflectionException)

Other Exceptions

Beyond the main exception types above, TnsAI provides several specialized exceptions for network failures, timeouts, validation errors, capability mismatches, and control-flow signals. The table below summarizes their retry behavior and key fields.

  • NetworkException -- retryable; 2000ms delay, up to 5 retries; key fields: host, port
  • TimeoutException -- retryable; delay min(timeoutMs/2, 5000)ms, up to 3 retries; key fields: timeoutMs, operation
  • ValidationException -- not retryable
  • ApprovalRequiredException -- not retryable; key fields: actionName, reason
  • TaskCompleteException -- not retryable; key fields: summary, result, success, metadata
  • LLMCapabilityException -- not retryable; key fields: provider, capability
  • ToolCallNotSupportedException -- not retryable; key fields: model, provider

NetworkException Factories

Create NetworkException instances for common connectivity failures like refused connections, DNS resolution problems, and connection timeouts.

NetworkException.connectionRefused("api.example.com", 443)
NetworkException.dnsResolutionFailed("api.example.com", cause)
NetworkException.connectionTimeout("api.example.com", 443, 5000L)

TimeoutException Factories

Create TimeoutException instances for operations that exceed their time budget, whether that is an LLM call, an HTTP request, or an action execution.

TimeoutException.llmTimeout(30000L)
TimeoutException.httpTimeout("https://api.example.com/v1/chat", 10000L)
TimeoutException.actionTimeout("fetchData", 5000L)
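The delay formula for TimeoutException, min(timeoutMs/2, 5000), can be sketched as a one-liner; timeoutRetryDelayMs is a hypothetical helper name, not a framework method:

```java
public class TimeoutDelayDemo {
    // Mirrors the documented TimeoutException retry delay:
    // half the original timeout, capped at 5000ms.
    static long timeoutRetryDelayMs(long timeoutMs) {
        return Math.min(timeoutMs / 2, 5000L);
    }

    public static void main(String[] args) {
        System.out.println(timeoutRetryDelayMs(30000L)); // 5000
        System.out.println(timeoutRetryDelayMs(4000L));  // 2000
    }
}
```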

LLMCapabilityException Factories

Thrown when you request a feature (streaming, vision, structured output) that the selected model or provider does not support. These are never retryable because the model simply lacks the capability.

LLMCapabilityException.streamingNotSupported("phi", "Ollama")
LLMCapabilityException.visionNotSupported("gpt-3.5-turbo", "OpenAI")
LLMCapabilityException.structuredOutputNotSupported("llama-2", "Ollama")

TaskCompleteException (Control Flow)

TaskCompleteException is not an error -- it is a control flow signal used to indicate that a task has been completed and the agent loop should terminate.

// Simple completion
throw new TaskCompleteException("Analysis complete");

// With result data
throw TaskCompleteException.withResult("Task done", Map.of("filesCreated", 5));

// Failed completion
throw TaskCompleteException.failed("Could not complete", "API unavailable");

// With metadata
throw TaskCompleteException.withMetadata("Done", Map.of("duration", "45s"));

Handling in the agent loop:

try {
    agent.run(task);
} catch (TaskCompleteException e) {
    System.out.println("Summary: " + e.getSummary());
    System.out.println("Success: " + e.isSuccess());
    if (e.hasResult()) {
        MyResult result = e.getResultAs(MyResult.class);
    }
}

Code Examples

These examples show common patterns for handling TnsAI exceptions in your application code.

Catching and Classifying Errors

The recommended approach is to catch exceptions from most specific to least specific. This lets you handle rate limits, LLM-specific errors, and generic TnsAI errors each in the most appropriate way.

try {
    String response = agent.chat("Analyze this data");
} catch (RateLimitException e) {
    // Wait for the provider-specified delay (assumes the enclosing
    // method declares or handles InterruptedException)
    Thread.sleep(e.getSuggestedRetryDelayMs());
    // Retry...
} catch (LLMException e) {
    if (e.getErrorType() == LLMException.ErrorType.CONTEXT_TOO_LONG) {
        // Truncate context and retry
    } else if (e.isRetryable()) {
        // Retry with backoff
    } else {
        // Log and fail
        logger.error("LLM error [{}]: {}", e.getErrorCode(), e.getMessage());
    }
} catch (TnsAIException e) {
    if (e.isRetryable()) {
        logger.warn("Retryable error [{}], retrying in {}ms",
            e.getErrorCode(), e.getSuggestedRetryDelayMs());
    } else {
        logger.error("Non-retryable error [{}]: {}", e.getErrorCode(), e.getMessage());
    }
}
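The single-sleep pattern above can be generalized into a loop driven entirely by the exception's own retry metadata (isRetryable(), getSuggestedRetryDelayMs(), getMaxRetryAttempts()). Because the sketch below must be self-contained, it uses a minimal stand-in class mirroring those accessors; with the real framework you would catch TnsAIException instead:

```java
import java.util.function.Supplier;

public class RetryLoopDemo {
    // Minimal stand-in mirroring the TnsAIException retry accessors
    // described earlier; the real framework class provides these methods.
    static class RetryableException extends RuntimeException {
        RetryableException(String msg) { super(msg); }
        boolean isRetryable() { return true; }
        long getSuggestedRetryDelayMs() { return 10L; } // short delay for the demo
        int getMaxRetryAttempts() { return 3; }
    }

    // Retry an operation, honoring the exception's own retry parameters.
    static <T> T withRetry(Supplier<T> op) {
        int attempt = 0;
        while (true) {
            try {
                return op.get();
            } catch (RetryableException e) {
                attempt++;
                if (!e.isRetryable() || attempt > e.getMaxRetryAttempts()) {
                    throw e; // exhausted or not retryable: propagate
                }
                try {
                    Thread.sleep(e.getSuggestedRetryDelayMs());
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt(); // preserve interrupt status
                    throw e;
                }
            }
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetry(() -> {
            if (++calls[0] < 3) throw new RetryableException("transient");
            return "ok";
        });
        System.out.println(result + " after " + calls[0] + " attempts"); // ok after 3 attempts
    }
}
```

A production version would typically add jitter and exponential growth on top of the suggested delay; the loop above applies the flat suggestion only.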

Handling Action Execution Errors

When an action fails, you can use the error category to decide on recovery. Transient errors (network, server) can be retried with backoff, while parameter or validation errors need to be fixed before retrying.

try {
    executor.execute(action, params);
} catch (ActionExecutionException e) {
    logger.error(e.getDetailedMessage());

    switch (e.getCategory()) {
        case NETWORK, SERVER_ERROR -> {
            // Transient -- retry with backoff
            Thread.sleep(e.getSuggestedRetryDelayMs());
        }
        case PARAMETER, VALIDATION -> {
            // Fix parameters and retry
            logger.warn("Fix parameters for action: {}", e.getActionName());
        }
        case LLM -> {
            // Check nested LLMException for details
            if (e.getCause() instanceof LLMException llm) {
                logger.warn("LLM error type: {}", llm.getErrorType());
            }
        }
        default -> throw e;
    }
}

Using Error Codes for Monitoring

Every TnsAIException carries a stable error code (like TNSAI-NETWORK or TNSAI-RATELIMIT) that you can use as a metric tag in your monitoring system. This example shows how to increment a counter on each failure for dashboards and alerting.

try {
    agent.chat("query");
} catch (TnsAIException e) {
    metrics.counter("tnsai.errors",
        "code", e.getErrorCode(),
        "retryable", String.valueOf(e.isRetryable())
    ).increment();
    throw e;
}
