Agent Variants
Agent variants let you trade off between response quality, execution speed, and token cost. A single agent can switch variants at runtime -- per task or per action.
AgentVariant Enum
The enum is defined in com.tnsai.enums.AgentVariant. Its four tiers represent different quality/speed/cost tradeoffs. Pick the one that matches your task, or use AUTO to let the framework decide at runtime based on task complexity.
| Variant | Quality | Speed | Cost | Best For |
|---|---|---|---|---|
| HIGH | Max (1.0) | Slow (0.3) | High (1.0) | Complex refactoring, security review, architecture |
| MEDIUM | Balanced (0.7) | Normal (0.6) | Medium (0.5) | Regular development, feature implementation |
| MINI | Basic (0.4) | Fast (1.0) | Low (0.2) | Quick fixes, typo corrections, simple queries |
| AUTO | Adaptive | Adaptive | Optimal | Production environments with varied workloads |
Helper methods
These convenience methods let you check what a variant prioritizes without comparing enum values directly.
variant.isQualityFocused(); // true for HIGH, MEDIUM
variant.isSpeedFocused(); // true for MINI
variant.isCostOptimized(); // true for MINI, AUTO
Task-based suggestion
If you are not sure which variant to use, forTask() analyzes the task description and suggests one based on keyword matching. This is a simple heuristic -- for smarter auto-selection, use VariantManager with auto mode enabled.
AgentVariant.forTask("Refactor the auth system"); // HIGH
AgentVariant.forTask("Fix a typo in README"); // MINI
AgentVariant.forTask("Implement login page"); // MEDIUM
Keywords that push toward HIGH: refactor, architect, complex, critical, security, review.
Keywords that push toward MINI: typo, fix, simple, quick, minor, small.
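The keyword heuristic can be pictured with a small standalone sketch. This is not the library's forTask() implementation: the class name ForTaskSketch, the Tier enum, the substring matching, and the rule that HIGH keywords win when both lists match are all assumptions made for illustration.

```java
import java.util.List;
import java.util.Locale;

// Hypothetical stand-in for the heuristic; not the real AgentVariant.forTask().
final class ForTaskSketch {
    enum Tier { HIGH, MEDIUM, MINI }

    private static final List<String> HIGH_KEYWORDS =
            List.of("refactor", "architect", "complex", "critical", "security", "review");
    private static final List<String> MINI_KEYWORDS =
            List.of("typo", "fix", "simple", "quick", "minor", "small");

    // Assumption: HIGH keywords are checked first, so they win when both lists match.
    static Tier suggest(String task) {
        String text = task.toLowerCase(Locale.ROOT);
        if (HIGH_KEYWORDS.stream().anyMatch(text::contains)) {
            return Tier.HIGH;
        }
        if (MINI_KEYWORDS.stream().anyMatch(text::contains)) {
            return Tier.MINI;
        }
        return Tier.MEDIUM; // no keyword matched
    }

    public static void main(String[] args) {
        System.out.println(suggest("Refactor the auth system")); // HIGH
        System.out.println(suggest("Fix a typo in README"));     // MINI
        System.out.println(suggest("Implement login page"));     // MEDIUM
    }
}
```

Plain substring matching is deliberately naive: a word such as "prefix" would match "fix". That kind of false positive is one reason to treat keyword matching as a rough first guess.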
VariantSpec
Each variant tier has a VariantSpec that defines its concrete settings: which LLM model to use, token limits, available tools, timeout, retry count, and temperature. You can use the built-in specs or build a custom one for your specific models and requirements.
Predefined specs
The built-in specs for each tier ship with sensible defaults for model selection, token limits, and timeouts. These are what you get when you use a variant without customization.
| Setting | HIGH | MEDIUM | MINI |
|---|---|---|---|
| Preferred model | claude-opus-4 | claude-sonnet-4 | claude-haiku-3 |
| Fallback models | gpt-4, gemini-1.5-pro | gpt-4o, gemini-1.5-flash | gpt-4o-mini, gemini-1.5-flash-8b |
| Max input tokens | 128,000 | 64,000 | 32,000 |
| Max output tokens | 16,384 | 8,192 | 4,096 |
| Tool set | ALL | STANDARD | MINIMAL |
| Timeout | 10 min | 5 min | 2 min |
| Max retries | 3 | 2 | 1 |
| Temperature | 0.7 | 0.5 | 0.3 |
| Streaming | Yes | Yes | No |
AUTO defaults to the MEDIUM spec and adjusts dynamically at runtime.
Using predefined specs
Retrieve the built-in spec for a variant tier with VariantSpec.forVariant() and query its settings.
VariantSpec highSpec = VariantSpec.forVariant(AgentVariant.HIGH);
String model = highSpec.getPreferredModel(); // "claude-opus-4"
int inputTokens = highSpec.getMaxInputTokens(); // 128000
Duration timeout = highSpec.getTimeout(); // PT10M
Building a custom spec
When the built-in specs do not match your environment (different models, different limits), build a custom one. Custom specs are immutable -- once built, they cannot be changed.
VariantSpec custom = VariantSpec.builder()
    .variant(AgentVariant.HIGH)
    .preferredModel("claude-opus-4")
    .fallbackModels(List.of("gpt-4", "gemini-1.5-pro"))
    .maxInputTokens(128000)
    .maxOutputTokens(16384)
    .toolSet(VariantSpec.ToolSet.ALL)
    .timeout(Duration.ofMinutes(10))
    .maxRetries(3)
    .temperature(0.7)
    .enableStreaming(true)
    .addSetting("customKey", "value")
    .build();
Model resolution
When the preferred model is unavailable (API outage, not provisioned), getEffectiveModel automatically falls back to the next available model from the fallback list.
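The fallback rule can be sketched in plain Java as follows. This is a hypothetical illustration, not the library's code: the class and method names are invented, and returning the preferred model when nothing is available is an assumption, since the text does not say what happens in that case.

```java
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the fallback rule; not VariantSpec.getEffectiveModel() itself.
final class ModelResolutionSketch {
    static String effectiveModel(String preferred, List<String> fallbacks, Set<String> available) {
        if (available.contains(preferred)) {
            return preferred;
        }
        // Fallbacks are tried in list order; the first available one wins.
        for (String candidate : fallbacks) {
            if (available.contains(candidate)) {
                return candidate;
            }
        }
        // Assumption: with nothing available, return the preferred model unchanged.
        return preferred;
    }

    public static void main(String[] args) {
        List<String> fallbacks = List.of("gpt-4", "gemini-1.5-pro");
        Set<String> available = Set.of("gemini-1.5-pro", "gpt-4");
        System.out.println(effectiveModel("claude-opus-4", fallbacks, available)); // gpt-4
    }
}
```

The order of the fallback list matters: it encodes your preference among substitutes, so put the closest match to the preferred model first.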
Set<String> available = Set.of("gpt-4", "claude-haiku-3");
String model = highSpec.getEffectiveModel(available); // "gpt-4" (first available fallback for HIGH)
Immutable copies
Since specs are immutable, changing a field returns a new VariantSpec instance. The original is not modified.
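The copy-on-change pattern behind these methods can be illustrated with a Java record. The Spec record below is a hypothetical three-field stand-in, not the real VariantSpec; only the method names withVariant and withModel mirror the documented API.

```java
// Hypothetical stand-in for an immutable spec; the real VariantSpec has more fields.
final class ImmutableSpecSketch {
    record Spec(String variant, String model, int maxInputTokens) {
        // Each "with" method builds a new instance and leaves this one untouched.
        Spec withVariant(String newVariant) {
            return new Spec(newVariant, model, maxInputTokens);
        }

        Spec withModel(String newModel) {
            return new Spec(variant, newModel, maxInputTokens);
        }
    }

    public static void main(String[] args) {
        Spec original = new Spec("HIGH", "claude-opus-4", 128_000);
        Spec remodeled = original.withModel("gpt-4-turbo");
        System.out.println(original.model());  // claude-opus-4
        System.out.println(remodeled.model()); // gpt-4-turbo
    }
}
```

Because no instance ever changes after construction, a spec can be shared between threads or cached without defensive copying.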
VariantSpec modified = spec.withVariant(AgentVariant.MEDIUM);
VariantSpec remodeled = spec.withModel("gpt-4-turbo");
ToolSet levels
The ToolSet controls which tools are available to the agent in a given variant. Lower tiers restrict tool access to reduce cost and latency.
| Level | Description |
|---|---|
| ALL | All available tools |
| STANDARD | Most common tools |
| MINIMAL | Essential tools only |
| NONE | No tools |
VariantManager
The VariantManager handles variant switching at runtime. It can operate in manual mode (you choose the variant) or auto mode (it analyzes each task and picks the best tier). It also tracks usage statistics per variant, so you can see how often each tier is used and how well it performs.
Creating a manager
Create a VariantManager with a default variant. If not specified, it defaults to MEDIUM.
VariantManager manager = new VariantManager(); // defaults to MEDIUM
VariantManager highManager = new VariantManager(AgentVariant.HIGH); // explicit initial variant
Manual switching
Explicitly set the variant when you know what quality level the next task needs.
manager.setVariant(AgentVariant.HIGH);
AgentVariant current = manager.getCurrentVariant(); // HIGH
VariantSpec spec = manager.getCurrentSpec(); // VariantSpec for HIGH
Auto mode
In auto mode, the manager analyzes each task description and automatically switches to the most appropriate variant. This is ideal for production environments where tasks vary in complexity.
manager.setAutoMode(true);
// Analyzes task complexity (0-10 score) and switches automatically
AgentVariant suggested = manager.suggestVariant("Refactor the authentication system");
// suggested = HIGH, manager now using HIGH spec
Complexity scoring adds to or subtracts from a base score of 5. A score of 7 or higher returns HIGH, a score of 3 or lower returns MINI, and anything in between returns MEDIUM. Task length is also considered: more than 200 characters adds 1, fewer than 50 characters subtracts 1.
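The thresholds and the length rule can be sketched as below. This is a hypothetical illustration rather than the VariantManager source. The per-keyword weights are not documented here, so the sketch takes them as a keywordDelta parameter instead of guessing, and clamping the result to the 0-10 range is also an assumption.

```java
// Hypothetical sketch of the documented scoring rules; not the VariantManager source.
final class ComplexitySketch {
    enum Tier { HIGH, MEDIUM, MINI }

    static final int BASE_SCORE = 5;

    // Length rule from the text: more than 200 chars adds 1, fewer than 50 subtracts 1.
    static int lengthAdjustment(String task) {
        if (task.length() > 200) return 1;
        if (task.length() < 50) return -1;
        return 0;
    }

    // keywordDelta stands in for the keyword adjustments, whose weights are not documented here.
    static int score(String task, int keywordDelta) {
        int raw = BASE_SCORE + keywordDelta + lengthAdjustment(task);
        return Math.max(0, Math.min(10, raw)); // assumption: clamp to the 0-10 range
    }

    static Tier tierFor(int score) {
        if (score >= 7) return Tier.HIGH;
        if (score <= 3) return Tier.MINI;
        return Tier.MEDIUM;
    }

    public static void main(String[] args) {
        // A short task with no keyword adjustment: 5 + 0 - 1 = 4
        System.out.println(tierFor(score("Implement login page", 0))); // MEDIUM
        // The same short length with an assumed keyword delta of -2: 5 - 2 - 1 = 2
        System.out.println(tierFor(score("Fix the typo in line 42", -2))); // MINI
    }
}
```

Note that every task shorter than 50 characters starts one point below the base score, so a short prompt needs a positive keyword adjustment of at least 3 to reach HIGH.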
Custom specs per variant
Override the default spec for any variant tier. This is useful when you want to use a different model or different token limits for a specific tier in your environment.
VariantSpec custom = VariantSpec.builder()
    .variant(AgentVariant.HIGH)
    .preferredModel("my-custom-model")
    .maxInputTokens(200000)
    .build();
manager.setVariantSpec(AgentVariant.HIGH, custom);
Change listeners
Register callbacks to be notified whenever the variant changes, whether manually or through auto mode. This is useful for logging, metrics, or adjusting other system behavior based on the active variant.
// Register
VariantManager.Registration reg = manager.onVariantChange(event -> {
    System.out.printf("Variant: %s -> %s (reason: %s)%n",
        event.previous(), event.current(), event.reason());
});
// Unregister
reg.unregister();
The VariantChangeEvent record contains previous, current, and reason (either "manual" or "auto:<task summary>").
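If you want to see how a registration handle of this kind can work, here is a minimal standalone sketch. Every name in it (ChangeListenerSketch, ChangeEvent, onChange, fire, isAuto) is hypothetical. Only the event's three fields and the "manual" / "auto:" reason format come from the description above.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Consumer;

// Hypothetical listener registry; it mirrors the documented behavior, not the library's code.
final class ChangeListenerSketch {
    record ChangeEvent(String previous, String current, String reason) {
        // The documented reason format is "manual" or "auto:<task summary>".
        boolean isAuto() {
            return reason.startsWith("auto:");
        }
    }

    interface Registration {
        void unregister();
    }

    // Copy-on-write list, so a listener can unregister itself while being notified.
    private final List<Consumer<ChangeEvent>> listeners = new CopyOnWriteArrayList<>();

    Registration onChange(Consumer<ChangeEvent> listener) {
        listeners.add(listener);
        return () -> listeners.remove(listener); // the handle removes exactly this listener
    }

    void fire(ChangeEvent event) {
        listeners.forEach(listener -> listener.accept(event));
    }
}
```

Returning a handle from the registration call means callers do not need to keep a reference to the listener itself in order to remove it later.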
Usage statistics
The manager tracks task count, success rate, and timing per variant. Use this data to understand your cost distribution and identify variants that are underperforming.
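The per-variant bookkeeping can be pictured as a small accumulator. The sketch below is hypothetical and is not the library's VariantStats class. Its method names, its use of long millisecond values, and its zero results for an empty accumulator are all assumptions.

```java
// Hypothetical accumulator for one variant's statistics; not the library's VariantStats.
final class VariantStatsSketch {
    private long taskCount;
    private long successCount;
    private long totalDurationMs;
    private long minDurationMs = Long.MAX_VALUE;
    private long maxDurationMs;

    synchronized void recordTask(long durationMs, boolean success) {
        taskCount++;
        if (success) {
            successCount++;
        }
        totalDurationMs += durationMs;
        minDurationMs = Math.min(minDurationMs, durationMs);
        maxDurationMs = Math.max(maxDurationMs, durationMs);
    }

    synchronized long taskCount() { return taskCount; }

    // Fraction of successful tasks in the range 0.0 - 1.0; 0.0 when nothing was recorded.
    synchronized double successRate() {
        return taskCount == 0 ? 0.0 : (double) successCount / taskCount;
    }

    synchronized long averageDurationMs() {
        return taskCount == 0 ? 0 : totalDurationMs / taskCount;
    }

    synchronized long minDurationMs() { return taskCount == 0 ? 0 : minDurationMs; }

    synchronized long maxDurationMs() { return maxDurationMs; }

    public static void main(String[] args) {
        VariantStatsSketch stats = new VariantStatsSketch();
        stats.recordTask(1500, true);
        stats.recordTask(500, false);
        System.out.println(stats.successRate());       // 0.5
        System.out.println(stats.averageDurationMs()); // 1000
    }
}
```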
manager.recordTask(AgentVariant.HIGH, 1500, true);
VariantManager.VariantStats stats = manager.getStats(AgentVariant.HIGH);
stats.getTaskCount(); // total tasks
stats.getSuccessRate(); // 0.0 - 1.0
stats.getAverageDurationMs(); // average task time
stats.getMinDurationMs();
stats.getMaxDurationMs();
// All stats
Map<AgentVariant, VariantManager.VariantStats> all = manager.getAllStats();
@Variant Annotation
Some actions always need a specific quality level regardless of the agent's current setting -- a security audit should always use HIGH, while a text formatter can always use MINI. The @Variant annotation locks an action method to a specific variant tier. The framework temporarily switches to that variant for the duration of the action, then restores the previous one.
| Attribute | Type | Default | Description |
|---|---|---|---|
| value | AgentVariant | (required) | Variant to use |
| reason | String | "" | Documentation for variant choice |
| recordStats | boolean | true | Track usage statistics |
| fallback | AgentVariant | MEDIUM | Fallback if primary variant's model is unavailable |
// Force HIGH for security-critical action
@ActionSpec(type = ActionType.LLM, description = "Security analysis")
@Variant(AgentVariant.HIGH)
public String analyzeSecurityRisks(String code) {
    return "Analyze security: " + code;
}
// Use MINI for a quick utility
@ActionSpec(type = ActionType.LOCAL, description = "Format text")
@Variant(AgentVariant.MINI)
public String formatText(String text) {
    return text.trim();
}
// Let the framework auto-select based on input
@ActionSpec(type = ActionType.LLM, description = "Code review")
@Variant(value = AgentVariant.AUTO, reason = "Complexity varies by input size")
public String reviewCode(String code) {
    return "Review: " + code;
}
Actions without @Variant use the agent's current variant. The annotation only affects the specific method it decorates.
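The "switch for the duration of the action, then restore" behavior is a classic try/finally pattern. The sketch below shows one way it could work. The ScopedVariantSketch class and its runWith method are hypothetical and say nothing about how the framework actually processes the annotation.

```java
import java.util.function.Supplier;

// Hypothetical illustration of a temporary variant switch; not the framework's annotation handling.
final class ScopedVariantSketch {
    private String current = "MEDIUM";

    String current() {
        return current;
    }

    // Runs the action under the given variant, then restores whatever was active before.
    <T> T runWith(String variant, Supplier<T> action) {
        String previous = current;
        current = variant;
        try {
            return action.get();
        } finally {
            current = previous; // restored even if the action throws
        }
    }

    public static void main(String[] args) {
        ScopedVariantSketch agent = new ScopedVariantSketch();
        String during = agent.runWith("HIGH", agent::current);
        System.out.println(during);          // HIGH
        System.out.println(agent.current()); // MEDIUM
    }
}
```

Restoring in a finally block matters: without it, an action that fails would leave the agent stuck on the annotated tier for every later call.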
Full Example
This end-to-end example shows how to set up an agent with auto variant selection, log all variant switches, run tasks of varying complexity, override the variant at runtime, and check usage statistics afterward.
// Configure agent with variant support
VariantManager variantManager = new VariantManager(AgentVariant.AUTO);
variantManager.setAutoMode(true);
// Log all variant switches
variantManager.onVariantChange(event ->
    log.info("Variant {} -> {} ({})", event.previous(), event.current(), event.reason()));
Agent agent = AgentBuilder.create()
    .withVariant(AgentVariant.AUTO)
    .llm(new OpenAIClient())
    .build();
// Simple task -- framework auto-selects MINI
agent.chat("Fix the typo in line 42");
// Complex task -- framework auto-selects HIGH
agent.chat("Refactor the authentication module for OAuth2 support");
// Override at runtime
agent.setVariant(AgentVariant.HIGH);
agent.chat("Critical security audit of payment processing");
// Check statistics
VariantManager.VariantStats highStats = variantManager.getStats(AgentVariant.HIGH);
System.out.printf("HIGH: %d tasks, %.0f%% success, avg %dms%n",
    highStats.getTaskCount(),
    highStats.getSuccessRate() * 100,
    highStats.getAverageDurationMs());