Cost Tracking
Monitor and control LLM spending across providers with built-in cost tracking, budget management, and model pricing data for 100+ models.
Setup
Wrap any `LLMClient` with `CostAwareLLMClient` to start tracking costs automatically. Every request records its token usage and calculates its cost from the model's pricing.
```java
// Wrap with default cost tracking
LLMClient tracked = CostAwareLLMClient.wrap(client);

// Or use the builder for full control
CostAwareLLMClient trackedWithBudget = CostAwareLLMClient.builder()
    .client(client)
    .costTracker(new InMemoryCostTracker())
    .budgetManager(budget)
    .build();
```

Querying Costs
Access accumulated cost data at any time. You can get the total across all models, break down by individual model, or query a specific time range.
```java
tracker.getTotalCost();                      // Total across all models
tracker.getCostByModel("gpt-4o");            // Per-model breakdown
tracker.getCostInRange(startTime, endTime);  // Time-range query
```

Budget Management
Set spending limits to prevent runaway costs. When the budget is exceeded, the `CostAwareLLMClient` will reject further requests. You can also set an alert threshold to get early warnings before hitting the limit.
```java
BudgetManager budget = BudgetManager.builder()
    .limit(new BigDecimal("50.00"))
    .daily()                 // Duration.ofDays(1)
    .alertThreshold(0.80)
    .build();

// Or monthly budget
BudgetManager monthly = BudgetManager.builder()
    .limit(new BigDecimal("500.00"))
    .monthly()               // Duration.ofDays(30)
    .alertThreshold(0.80)
    .build();
```

Model Pricing
TnsAI ships with built-in pricing data for 100+ models so cost calculations work out of the box. Prices are in USD per 1 million tokens.
| Model | Input | Output |
|---|---|---|
| gpt-4o | $2.50 | $10.00 |
| gpt-4o-mini | $0.15 | $0.60 |
| claude-sonnet-4 | $3.00 | $15.00 |
| claude-opus-4 | $15.00 | $75.00 |
| claude-3.5-haiku | $0.80 | $4.00 |
| gemini-2.5-pro | $1.25 | $10.00 |
| gemini-2.5-flash | $0.15 | $0.60 |
| gemini-2.0-flash | $0.075 | $0.30 |
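Because prices are quoted per 1 million tokens, cost is linear in token count: a gpt-4o request with 1,000 input tokens and 500 output tokens costs 1,000 × $2.50/1M + 500 × $10.00/1M = $0.0075. A self-contained sketch of that arithmetic using plain `BigDecimal` (this is the underlying math, not the library's `ModelPricing` API):

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class CostExample {
    static final BigDecimal MILLION = BigDecimal.valueOf(1_000_000);

    // cost = tokens * (price per 1M tokens) / 1,000,000
    static BigDecimal cost(long tokens, BigDecimal pricePerMillion) {
        return pricePerMillion
            .multiply(BigDecimal.valueOf(tokens))
            .divide(MILLION, 6, RoundingMode.HALF_UP);
    }

    public static void main(String[] args) {
        // gpt-4o: $2.50 input / $10.00 output per 1M tokens
        BigDecimal input = cost(1_000, new BigDecimal("2.50"));  // 0.002500
        BigDecimal output = cost(500, new BigDecimal("10.00"));  // 0.005000
        System.out.println(input.add(output));                   // 0.007500
    }
}
```

Using `BigDecimal` rather than `double` avoids floating-point drift when many small per-request costs are summed against a budget.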
Programmatic Usage
Look up pricing for any model and calculate costs for a given token count.
```java
ModelPricing pricing = ModelPricing.forModel("gpt-4o");
BigDecimal inputCost = pricing.calculateInputCost(1000);   // 1000 tokens
BigDecimal outputCost = pricing.calculateOutputCost(500);
```

Usage Records
If you need to record usage manually (for example, from external API calls), you can create UsageRecord objects and pass them to the tracker directly.
```java
tracker.record(UsageRecord.builder()
    .modelId("gpt-4o")
    .inputTokens(1000)
    .outputTokens(500)
    .build());
```

LLM Caching
Reduce latency and cost with semantic response caching. The cache uses similarity matching so that near-identical prompts return cached responses without hitting the API.
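The core idea behind a semantic cache can be sketched in a few lines: embed each prompt as a vector, and on lookup return the cached response whose embedding is within a similarity threshold of the query. The sketch below is illustrative only, with a toy character-frequency "embedding" standing in for a real embedding model; none of these class or method names come from the library.

```java
import java.util.*;

// Minimal semantic-cache sketch: embeddings + cosine similarity + threshold.
public class SemanticCacheSketch {
    record Entry(double[] embedding, String response) {}
    private final List<Entry> entries = new ArrayList<>();
    private final double threshold;

    SemanticCacheSketch(double threshold) { this.threshold = threshold; }

    // Toy embedding: letter-frequency vector (a real cache uses an embedding model)
    static double[] embed(String text) {
        double[] v = new double[26];
        for (char c : text.toLowerCase().toCharArray())
            if (c >= 'a' && c <= 'z') v[c - 'a']++;
        return v;
    }

    static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i];
        }
        return (na == 0 || nb == 0) ? 0 : dot / (Math.sqrt(na) * Math.sqrt(nb));
    }

    // Return a cached response if any stored prompt is similar enough
    Optional<String> lookup(String prompt) {
        double[] q = embed(prompt);
        return entries.stream()
            .filter(e -> cosine(q, e.embedding()) >= threshold)
            .map(Entry::response)
            .findFirst();
    }

    void put(String prompt, String response) {
        entries.add(new Entry(embed(prompt), response));
    }

    public static void main(String[] args) {
        SemanticCacheSketch cache = new SemanticCacheSketch(0.95);
        cache.put("What is the capital of France?", "Paris");
        // A near-identical prompt hits the cache without an API call
        System.out.println(cache.lookup("what is the capital of france").orElse("MISS"));
    }
}
```

The threshold trades hit rate against correctness: too low and distinct questions collide on one cached answer, too high and trivially rephrased prompts miss.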
LLM Providers
The LLM module provides a unified interface to 14 language model providers. All providers implement the `LLMClient` interface from Core, so you can swap providers without changing your agent code.
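The swap works because agent code is written against the interface rather than a concrete provider. A self-contained sketch of that pattern, using a stand-in interface and stub providers (the `complete` method and the stub class names here are illustrative, not the library's actual signatures):

```java
// Stand-in for Core's LLMClient interface (illustrative, not the real signature)
interface LLMClient {
    String complete(String prompt);
}

// Stub providers; real ones would call the respective APIs
class StubOpenAIClient implements LLMClient {
    public String complete(String prompt) { return "[openai] " + prompt; }
}

class StubAnthropicClient implements LLMClient {
    public String complete(String prompt) { return "[anthropic] " + prompt; }
}

public class ProviderSwapExample {
    // Agent code depends only on the interface, never on a concrete provider
    static String runAgent(LLMClient client) {
        return client.complete("Summarize this document.");
    }

    public static void main(String[] args) {
        // Swapping providers requires no change to runAgent
        System.out.println(runAgent(new StubOpenAIClient()));
        System.out.println(runAgent(new StubAnthropicClient()));
    }
}
```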