Agent Variants

Agent variants let you trade off between response quality, execution speed, and token cost. A single agent can switch variants at runtime -- per task or per action.

AgentVariant Enum

The four variant tiers represent different quality/speed/cost tradeoffs. Pick the one that matches your task, or use AUTO to let the framework decide at runtime based on task complexity.

The enum is defined in com.tnsai.enums.AgentVariant:

Variant	Quality	Speed	Cost	Best For
`HIGH`	Max (1.0)	Slow (0.3)	High (1.0)	Complex refactoring, security review, architecture
`MEDIUM`	Balanced (0.7)	Normal (0.6)	Medium (0.5)	Regular development, feature implementation
`MINI`	Basic (0.4)	Fast (1.0)	Low (0.2)	Quick fixes, typo corrections, simple queries
`AUTO`	Adaptive	Adaptive	Optimal	Production environments with varied workloads

Helper methods

These convenience methods let you check what a variant prioritizes without comparing enum values directly.

variant.isQualityFocused();  // true for HIGH, MEDIUM
variant.isSpeedFocused();    // true for MINI
variant.isCostOptimized();   // true for MINI, AUTO

If you are not sure which variant to use, forTask() analyzes the task description and suggests one based on keyword matching. This is a simple heuristic -- for smarter auto-selection, use VariantManager with auto mode enabled.

AgentVariant.forTask("Refactor the auth system");  // HIGH
AgentVariant.forTask("Fix a typo in README");       // MINI
AgentVariant.forTask("Implement login page");       // MEDIUM

Keywords that push toward HIGH: refactor, architect, complex, critical, security, review. Keywords that push toward MINI: typo, fix, simple, quick, minor, small.
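The keyword heuristic described above can be sketched in a few lines. This is a minimal illustration, not the framework's implementation: the class name ForTaskSketch and the nested Tier enum are hypothetical stand-ins so the code compiles on its own, and the rule that HIGH keywords are checked before MINI keywords is an assumption (the documentation does not say which side wins when a task contains both).

```java
import java.util.List;
import java.util.Locale;

final class ForTaskSketch {

    // Hypothetical stand-in for com.tnsai.enums.AgentVariant (AUTO omitted).
    enum Tier { HIGH, MEDIUM, MINI }

    // Keyword lists copied from the documentation.
    static final List<String> HIGH_KEYWORDS =
        List.of("refactor", "architect", "complex", "critical", "security", "review");
    static final List<String> MINI_KEYWORDS =
        List.of("typo", "fix", "simple", "quick", "minor", "small");

    // Assumption: HIGH keywords are checked first, so they win on a mixed task.
    static Tier forTask(String task) {
        String text = task.toLowerCase(Locale.ROOT);
        if (HIGH_KEYWORDS.stream().anyMatch(text::contains)) {
            return Tier.HIGH;
        }
        if (MINI_KEYWORDS.stream().anyMatch(text::contains)) {
            return Tier.MINI;
        }
        return Tier.MEDIUM; // no keyword matched
    }

    public static void main(String[] args) {
        System.out.println(forTask("Refactor the auth system")); // HIGH
        System.out.println(forTask("Fix a typo in README"));     // MINI
        System.out.println(forTask("Implement login page"));     // MEDIUM
    }
}
```

Because the sketch uses plain substring matching, a word such as "prefix" would also match "fix". That is one reason to treat this kind of heuristic as a starting point only.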

VariantSpec

Each variant tier has a VariantSpec that defines its concrete settings: which model to use, token limits, available tools, timeout, retry count, and temperature. You can use the built-in specs or build a custom one for your specific models and requirements.

Predefined specs

The built-in specs for each tier ship with sensible defaults for model selection, token limits, and timeouts. These are what you get when you use a variant without customization.

	HIGH	MEDIUM	MINI
Preferred model	`claude-opus-4`	`claude-sonnet-4`	`claude-haiku-3`
Fallback models	`gpt-4`, `gemini-1.5-pro`	`gpt-4o`, `gemini-1.5-flash`	`gpt-4o-mini`, `gemini-1.5-flash-8b`
Max input tokens	128,000	64,000	32,000
Max output tokens	16,384	8,192	4,096
Tool set	ALL	STANDARD	MINIMAL
Timeout	10 min	5 min	2 min
Max retries	3	2	1
Temperature	0.7	0.5	0.3
Streaming	Yes	Yes	No

AUTO defaults to the MEDIUM spec and adjusts dynamically at runtime.

Using predefined specs

Retrieve the built-in spec for a variant tier with VariantSpec.forVariant() and query its settings.

VariantSpec highSpec = VariantSpec.forVariant(AgentVariant.HIGH);
String model = highSpec.getPreferredModel();       // "claude-opus-4"
int inputTokens = highSpec.getMaxInputTokens();    // 128000
Duration timeout = highSpec.getTimeout();           // PT10M

Building a custom spec

When the built-in specs do not match your environment (different models, different limits), build a custom one. Custom specs are immutable -- once built, they cannot be changed.

VariantSpec custom = VariantSpec.builder()
    .variant(AgentVariant.HIGH)
    .preferredModel("claude-opus-4")
    .fallbackModels(List.of("gpt-4", "gemini-1.5-pro"))
    .maxInputTokens(128000)
    .maxOutputTokens(16384)
    .toolSet(VariantSpec.ToolSet.ALL)
    .timeout(Duration.ofMinutes(10))
    .maxRetries(3)
    .temperature(0.7)
    .enableStreaming(true)
    .addSetting("customKey", "value")
    .build();

Model resolution

When the preferred model is unavailable (for example, during an API outage or because it is not provisioned), getEffectiveModel falls back to the next available model from the fallback list.

Set<String> available = Set.of("gpt-4", "claude-haiku-3");
String model = highSpec.getEffectiveModel(available);  // "gpt-4" (first available fallback for HIGH)
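The fallback order can be illustrated with a small standalone sketch. The helper name resolve and the class ModelResolutionSketch are hypothetical, and returning an empty Optional when nothing is available is an assumption; the documentation does not state what getEffectiveModel does in that case.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

final class ModelResolutionSketch {

    // Hypothetical helper: returns the preferred model when it is available,
    // otherwise the first fallback (in list order) that is available.
    static Optional<String> resolve(String preferred,
                                    List<String> fallbacks,
                                    Set<String> available) {
        if (available.contains(preferred)) {
            return Optional.of(preferred);
        }
        return fallbacks.stream().filter(available::contains).findFirst();
    }

    public static void main(String[] args) {
        // Preferred and fallback models of the predefined HIGH spec.
        List<String> fallbacks = List.of("gpt-4", "gemini-1.5-pro");

        // Preferred model missing, first fallback present.
        System.out.println(
            resolve("claude-opus-4", fallbacks, Set.of("gpt-4", "claude-haiku-3")));
        // prints Optional[gpt-4]

        // Neither the preferred model nor any fallback is available.
        System.out.println(
            resolve("claude-opus-4", fallbacks, Set.of("claude-haiku-3")));
        // prints Optional.empty
    }
}
```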

Immutable copies

Since specs are immutable, changing a field returns a new VariantSpec instance. The original is not modified.

VariantSpec modified = spec.withVariant(AgentVariant.MEDIUM);
VariantSpec remodeled = spec.withModel("gpt-4-turbo");
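For readers unfamiliar with the copy-on-change pattern, here is a minimal sketch of how such "with" methods typically work. The Spec class below is a hypothetical two-field stand-in, not the real VariantSpec, and its method names are chosen for illustration only.

```java
final class ImmutableSpecSketch {

    // Hypothetical two-field stand-in for VariantSpec; the real class carries more settings.
    static final class Spec {
        private final String tier;
        private final String model;

        Spec(String tier, String model) {
            this.tier = tier;
            this.model = model;
        }

        String tier() { return tier; }
        String model() { return model; }

        // Each "with" method copies every field and swaps exactly one.
        Spec withTier(String newTier) { return new Spec(newTier, model); }
        Spec withModel(String newModel) { return new Spec(tier, newModel); }
    }

    public static void main(String[] args) {
        Spec original = new Spec("HIGH", "claude-opus-4");
        Spec remodeled = original.withModel("gpt-4-turbo");

        System.out.println(original.model());   // claude-opus-4 (original untouched)
        System.out.println(remodeled.model());  // gpt-4-turbo
        System.out.println(remodeled.tier());   // HIGH (carried over)
    }
}
```

Since no instance is ever mutated, a spec can be shared between threads or cached without defensive copying.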

ToolSet levels

The ToolSet controls which tools are available to the agent in a given variant. Lower tiers restrict tool access to reduce cost and latency.

Level	Description
`ALL`	All available tools
`STANDARD`	Most common tools
`MINIMAL`	Essential tools only
`NONE`	No tools

VariantManager

The VariantManager handles variant switching at runtime. It can operate in manual mode (you choose the variant) or auto mode (it analyzes each task and picks the best tier). It also tracks usage statistics per variant, so you can see how often each tier is used and how well it performs.

Creating a manager

Create a VariantManager with an initial variant. If none is specified, it defaults to MEDIUM.

VariantManager manager = new VariantManager();                        // defaults to MEDIUM
VariantManager highManager = new VariantManager(AgentVariant.HIGH);   // explicit initial variant

Manual switching

Explicitly set the variant when you know what quality level the next task needs.

manager.setVariant(AgentVariant.HIGH);
AgentVariant current = manager.getCurrentVariant();   // HIGH
VariantSpec spec = manager.getCurrentSpec();           // VariantSpec for HIGH

Auto mode

In auto mode, the manager analyzes each task description and automatically switches to the most appropriate variant. This is ideal for production environments where tasks vary in complexity.

manager.setAutoMode(true);

// Analyzes task complexity (0-10 score) and switches automatically
AgentVariant suggested = manager.suggestVariant("Refactor the authentication system");
// suggested = HIGH, manager now using HIGH spec

Complexity scoring starts from a base score of 5 and adjusts it up or down. Task length also counts: more than 200 characters adds 1, fewer than 50 characters subtracts 1. A final score of 7 or higher selects HIGH, 3 or lower selects MINI, and anything in between selects MEDIUM.
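The scoring rule can be made concrete with a small sketch. Only the base score, the length adjustments, the thresholds, and the 0-10 range come from the documentation. Everything else is an assumption made for illustration: the class and method names are hypothetical, the keyword lists are borrowed from forTask(), and the weight of 3 per matched keyword is chosen so that the example task above reaches HIGH.

```java
import java.util.List;
import java.util.Locale;

final class ComplexityScoreSketch {

    // Hypothetical stand-in for the three fixed tiers of AgentVariant.
    enum Tier { HIGH, MEDIUM, MINI }

    // Assumption: the same keywords that forTask() uses.
    static final List<String> RAISE =
        List.of("refactor", "architect", "complex", "critical", "security", "review");
    static final List<String> LOWER =
        List.of("typo", "fix", "simple", "quick", "minor", "small");

    // Assumption: the documentation does not state the per-keyword weight.
    static final int KEYWORD_WEIGHT = 3;

    static int score(String task) {
        String text = task.toLowerCase(Locale.ROOT);
        int score = 5; // documented base score
        for (String keyword : RAISE) {
            if (text.contains(keyword)) score += KEYWORD_WEIGHT;
        }
        for (String keyword : LOWER) {
            if (text.contains(keyword)) score -= KEYWORD_WEIGHT;
        }
        // Documented length adjustment.
        if (task.length() > 200) {
            score += 1;
        } else if (task.length() < 50) {
            score -= 1;
        }
        return Math.max(0, Math.min(10, score)); // keep within 0-10
    }

    static Tier tierFor(int score) {
        if (score >= 7) return Tier.HIGH;
        if (score <= 3) return Tier.MINI;
        return Tier.MEDIUM;
    }

    public static void main(String[] args) {
        String task = "Refactor the authentication system";
        int s = score(task);            // 5 + 3 (refactor) - 1 (short task) = 7
        System.out.println(s + " -> " + tierFor(s)); // 7 -> HIGH
    }
}
```

With these assumed weights, "Implement login page" scores 4 (MEDIUM) and "Fix the typo in line 42" is clamped to 0 (MINI).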

Custom specs per variant

Override the default spec for any variant tier. This is useful when you want to use a different model or different token limits for a specific tier in your environment.

VariantSpec custom = VariantSpec.builder()
    .variant(AgentVariant.HIGH)
    .preferredModel("my-custom-model")
    .maxInputTokens(200000)
    .build();

manager.setVariantSpec(AgentVariant.HIGH, custom);

Change listeners

Register callbacks to be notified whenever the variant changes, whether manually or through auto mode. This is useful for logging, metrics, or adjusting other system behavior based on the active variant.

// Register
VariantManager.Registration reg = manager.onVariantChange(event -> {
    System.out.printf("Variant: %s -> %s (reason: %s)%n",
        event.previous(), event.current(), event.reason());
});

// Unregister
reg.unregister();

The VariantChangeEvent record contains previous, current, and reason (either "manual" or "auto:<task summary>").
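To show how a registration handle of this kind can work, here is a self-contained sketch of the listener pattern. It is not the VariantManager source: the class ChangeListenerSketch, the ChangeEvent type, the use of plain strings for variants, and the choice to skip notification when the variant does not change are all assumptions.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

final class ChangeListenerSketch {

    // Hypothetical event carrying the three documented fields.
    static final class ChangeEvent {
        final String previous;
        final String current;
        final String reason;

        ChangeEvent(String previous, String current, String reason) {
            this.previous = previous;
            this.current = current;
            this.reason = reason;
        }
    }

    // Handle returned to the caller so the listener can be removed later.
    interface Registration {
        void unregister();
    }

    // CopyOnWriteArrayList lets listeners unregister while a notification is in progress.
    private final List<Consumer<ChangeEvent>> listeners = new CopyOnWriteArrayList<>();
    private String current = "MEDIUM";

    Registration onVariantChange(Consumer<ChangeEvent> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener);
    }

    void setVariant(String next, String reason) {
        if (next.equals(current)) {
            return; // assumption: no event when nothing changes
        }
        ChangeEvent event = new ChangeEvent(current, next, reason);
        current = next;
        listeners.forEach(listener -> listener.accept(event));
    }

    public static void main(String[] args) {
        ChangeListenerSketch manager = new ChangeListenerSketch();
        Registration reg = manager.onVariantChange(e ->
            System.out.printf("Variant: %s -> %s (reason: %s)%n",
                e.previous, e.current, e.reason));

        manager.setVariant("HIGH", "manual"); // listener fires
        reg.unregister();
        manager.setVariant("MINI", "manual"); // listener no longer fires
    }
}
```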

Usage statistics

The manager tracks task count, success rate, and timing per variant. Use this data to understand your cost distribution and identify variants that are underperforming.

manager.recordTask(AgentVariant.HIGH, 1500, true);

VariantManager.VariantStats stats = manager.getStats(AgentVariant.HIGH);
stats.getTaskCount();          // total tasks
stats.getSuccessRate();        // 0.0 - 1.0
stats.getAverageDurationMs();  // average task time
stats.getMinDurationMs();
stats.getMaxDurationMs();

// All stats
Map<AgentVariant, VariantManager.VariantStats> all = manager.getAllStats();
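The metrics above are simple running aggregates. The following sketch shows one way they could be accumulated for a single variant. The class VariantStatsSketch is hypothetical, and the return types and the behavior before any task is recorded (all values zero) are assumptions rather than documented behavior.

```java
final class VariantStatsSketch {

    private long taskCount;
    private long successCount;
    private long totalDurationMs;
    private long minDurationMs = Long.MAX_VALUE;
    private long maxDurationMs;

    // Mirrors the shape of recordTask(variant, durationMs, success) for one variant.
    void record(long durationMs, boolean success) {
        taskCount++;
        if (success) {
            successCount++;
        }
        totalDurationMs += durationMs;
        minDurationMs = Math.min(minDurationMs, durationMs);
        maxDurationMs = Math.max(maxDurationMs, durationMs);
    }

    long getTaskCount() { return taskCount; }

    // Fraction of successful tasks, 0.0 - 1.0 (0.0 when nothing was recorded).
    double getSuccessRate() {
        return taskCount == 0 ? 0.0 : (double) successCount / taskCount;
    }

    double getAverageDurationMs() {
        return taskCount == 0 ? 0.0 : (double) totalDurationMs / taskCount;
    }

    long getMinDurationMs() { return taskCount == 0 ? 0 : minDurationMs; }

    long getMaxDurationMs() { return maxDurationMs; }

    public static void main(String[] args) {
        VariantStatsSketch high = new VariantStatsSketch();
        high.record(1500, true);
        high.record(500, false);

        System.out.println(high.getTaskCount());         // 2
        System.out.println(high.getSuccessRate());       // 0.5
        System.out.println(high.getAverageDurationMs()); // 1000.0
        System.out.println(high.getMinDurationMs());     // 500
        System.out.println(high.getMaxDurationMs());     // 1500
    }
}
```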

@Variant Annotation

Some actions always need a specific quality level regardless of the agent's current setting -- a security audit should always use HIGH, while a text formatter can always use MINI. The @Variant annotation locks an action method to a specific variant tier. The framework temporarily switches to that variant for the duration of the action, then restores the previous one.

Attribute	Type	Default	Description
`value`	`AgentVariant`	(required)	Variant to use
`reason`	`String`	`""`	Documentation for variant choice
`recordStats`	`boolean`	`true`	Track usage statistics
`fallback`	`AgentVariant`	`MEDIUM`	Fallback if primary variant's model is unavailable

// Force HIGH for security-critical action
@ActionSpec(type = ActionType.LLM, description = "Security analysis")
@Variant(AgentVariant.HIGH)
public String analyzeSecurityRisks(String code) {
    return "Analyze security: " + code;
}

// Use MINI for a quick utility
@ActionSpec(type = ActionType.LOCAL, description = "Format text")
@Variant(AgentVariant.MINI)
public String formatText(String text) {
    return text.trim();
}

// Let the framework auto-select based on input
@ActionSpec(type = ActionType.LLM, description = "Code review")
@Variant(value = AgentVariant.AUTO, reason = "Complexity varies by input size")
public String reviewCode(String code) {
    return "Review: " + code;
}

Actions without @Variant use the agent's current variant. The annotation only affects the specific method it decorates.
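The "switch temporarily, then restore" behavior is a classic try/finally pattern. The sketch below illustrates the idea only. It does not show how the framework processes the annotation: the class ScopedVariantSketch, the runWith method, and the nested Tier enum are hypothetical names.

```java
import java.util.function.Supplier;

final class ScopedVariantSketch {

    // Hypothetical stand-in for the fixed tiers of AgentVariant.
    enum Tier { HIGH, MEDIUM, MINI }

    private Tier current = Tier.MEDIUM;

    Tier current() {
        return current;
    }

    // Runs the action under the pinned tier, then restores the previous tier.
    <T> T runWith(Tier pinned, Supplier<T> action) {
        Tier previous = current;
        current = pinned;
        try {
            return action.get();
        } finally {
            current = previous; // restored even if the action throws
        }
    }

    public static void main(String[] args) {
        ScopedVariantSketch agent = new ScopedVariantSketch();

        String during = agent.runWith(Tier.HIGH, () -> agent.current().name());

        System.out.println(during);          // HIGH (inside the action)
        System.out.println(agent.current()); // MEDIUM (restored afterward)
    }
}
```

Putting the restore step in a finally block means a failing action cannot leave the agent stuck on the pinned tier.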

Full Example

This end-to-end example shows how to set up an agent with auto variant selection, log all variant switches, run tasks of varying complexity, override the variant at runtime, and check usage statistics afterward.

// Configure agent with variant support
VariantManager variantManager = new VariantManager(AgentVariant.AUTO);
variantManager.setAutoMode(true);

// Log all variant switches
variantManager.onVariantChange(event ->
    log.info("Variant {} -> {} ({})", event.previous(), event.current(), event.reason()));

Agent agent = AgentBuilder.create()
    .withVariant(AgentVariant.AUTO)
    .llm(new OpenAIClient())
    .build();

// Simple task -- framework auto-selects MINI
agent.chat("Fix the typo in line 42");

// Complex task -- framework auto-selects HIGH
agent.chat("Refactor the authentication module for OAuth2 support");

// Override at runtime
agent.setVariant(AgentVariant.HIGH);
agent.chat("Critical security audit of payment processing");

// Check statistics
VariantManager.VariantStats highStats = variantManager.getStats(AgentVariant.HIGH);
System.out.printf("HIGH: %d tasks, %.0f%% success, avg %dms%n",
    highStats.getTaskCount(),
    highStats.getSuccessRate() * 100,
    highStats.getAverageDurationMs());