Reasoning
Advanced reasoning strategies for complex problem solving. TnsAI provides multiple reasoning executors based on recent AI research, from simple chain-of-thought to graph-based reasoning with merging and refinement.
ThinkingResult
When an LLM uses extended thinking (like Claude's chain-of-thought), TnsAI wraps the output in a ThinkingResult so you can inspect both the reasoning process and the final answer separately. This is useful for debugging, auditing, or displaying the model's step-by-step logic to users.
```java
ThinkingResult result = ThinkingResult.builder()
    .thinkingProcess("Step 1: Analyze the input... Step 2: Consider edge cases...")
    .finalAnswer("The optimal solution is X because...")
    .thinkingTokens(1200)
    .outputTokens(350)
    .thinkingBlocks(List.of("analysis block", "verification block"))
    .build();

System.out.println(result.hasThinking());    // true
System.out.println(result.getTotalTokens()); // 1550
System.out.println(result.getFinalAnswer());
```

| Method | Returns | Description |
|---|---|---|
| getThinkingProcess() | String | Full internal reasoning text |
| getFinalAnswer() | String | Answer produced after thinking |
| getThinkingTokens() | int | Tokens consumed by thinking |
| getOutputTokens() | int | Tokens in the final answer |
| getTotalTokens() | int | Sum of thinking + output tokens |
| getThinkingBlocks() | List<String> | Individual thinking blocks |
| hasThinking() | boolean | True if thinking was performed |
Tree of Thoughts (ToT)
Explores multiple reasoning paths by generating candidate thoughts, evaluating them, and pruning low-quality branches. Based on "Tree of Thoughts: Deliberate Problem Solving with Large Language Models" (Yao et al., 2023).
TreeOfThoughtsExecutor
The executor manages the full exploration lifecycle: generating candidate thoughts at each depth, scoring them, pruning weak branches, and returning the best reasoning path found.
```java
TreeOfThoughtsExecutor tot = TreeOfThoughtsExecutor.builder()
    .llm(client)
    .evaluator(BranchEvaluator.llm(evalClient))
    .pruning(PruningStrategy.BEAM_SEARCH)
    .beamWidth(3)
    .maxDepth(5)
    .branchingFactor(3)
    .pruneThreshold(0.3)
    .timeout(Duration.ofMinutes(10))
    .build();

ToTResult result = tot.explore("Design a REST API for a todo app");
System.out.println(result.getBestPath());
System.out.println(result.getBestScore());
```

Builder Parameters
These settings control how broadly and deeply the tree is explored, and when to stop.
| Parameter | Default | Description |
|---|---|---|
| llm | required | LLMClient for thought generation |
| evaluator | required | BranchEvaluator for scoring nodes |
| pruning | BEAM_SEARCH | Pruning strategy |
| beamWidth | 3 | Branches to keep per level |
| maxDepth | 5 | Maximum tree depth |
| branchingFactor | 3 | Candidate thoughts per node |
| pruneThreshold | 0.3 | Minimum score to survive |
| timeout | 10 min | Exploration time limit |
PruningStrategy
Pruning determines which branches to keep and which to discard during exploration. Choosing the right strategy lets you balance thoroughness against cost -- more aggressive pruning is faster and cheaper, while less pruning explores more possibilities.
| Strategy | Description |
|---|---|
| BEAM_SEARCH | Keep top-k branches at each level (default) |
| BEST_FIRST | Always expand the highest cumulative-score node |
| DEPTH_LIMITED | Explore all branches up to max depth |
| GREEDY | Always pick the single best branch |
| MCTS | Monte Carlo Tree Search -- balance exploration vs exploitation |
| EXHAUSTIVE | No pruning -- explore all branches (expensive) |
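The default strategy can be made concrete with a small sketch. The class below is illustrative only, not TnsAI's internal implementation: BEAM_SEARCH discards every candidate below `pruneThreshold`, then keeps only the top `beamWidth` survivors at each level.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of BEAM_SEARCH pruning -- not TnsAI's internal code.
// A level's candidate thoughts arrive already scored; only the top
// `beamWidth` that also clear `pruneThreshold` survive to the next level.
public class BeamPruneSketch {
    static List<String> pruneBeam(Map<String, Double> scoredThoughts,
                                  int beamWidth, double pruneThreshold) {
        return scoredThoughts.entrySet().stream()
                .filter(e -> e.getValue() >= pruneThreshold)             // drop weak branches
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(beamWidth)                                         // keep top-k
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Double> level = Map.of(
                "use LRU cache", 0.9,
                "recompute every call", 0.2,   // below threshold -> pruned
                "use write-through cache", 0.7,
                "cache in a global variable", 0.4,
                "use TTL-based expiry", 0.8);
        System.out.println(pruneBeam(level, 3, 0.3));
        // [use LRU cache, use TTL-based expiry, use write-through cache]
    }
}
```

With the builder defaults (beamWidth 3, pruneThreshold 0.3) this keeps at most three branches per level, which is what bounds the executor's cost.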
ToTResult
The result captures the full exploration tree, best path, and statistics:
```java
ToTResult result = tot.explore("Optimize this algorithm");
result.getBestPath();        // Combined thought chain of best leaf
result.getBestScore();       // Average score along best path
result.hasSolution();        // true if bestScore > 0.5
result.getBestLeaf();        // Optional<ThoughtNode>
result.getTopPaths(3);       // Top-3 leaf nodes by score
result.getTotalNodes();      // Total nodes explored
result.getPrunedNodes();     // Nodes pruned during exploration
result.getMaxDepthReached(); // Deepest level reached
result.getDuration();        // Total exploration time
```

ThoughtNode
Each node in the tree represents a single reasoning step. Nodes track their score, depth, and links to parent and children, so you can traverse the full reasoning path from root to any leaf.
```java
ThoughtNode root = ThoughtNode.root("Design a caching strategy");
ThoughtNode child = root.addChild("n1", "Use LRU cache with TTL");
child.setScore(0.85);

child.getThoughtChain();    // "Design a caching strategy -> Use LRU cache with TTL"
child.getCumulativeScore(); // Sum of scores along path
child.getAverageScore();    // Average score along path
child.getDepth();           // 1
child.isLeaf();             // true (no children yet)
child.isRoot();             // false
child.isEvaluated();        // true (score was set)
child.isPruned();           // false
child.getPath();            // [root, child]
```

BranchEvaluator
The evaluator scores each reasoning step on a 0.0-1.0 scale, which the pruning strategy uses to decide which branches to keep. You can use an LLM to judge quality, a fast heuristic, or combine both.
```java
// LLM-based: asks an LLM to rate the reasoning step 0-100
BranchEvaluator llmEval = BranchEvaluator.llm(evalClient);

// Heuristic: scores based on thought length and reasoning keywords
BranchEvaluator heuristic = BranchEvaluator.heuristic();

// Combined: averages multiple evaluators
BranchEvaluator combined = BranchEvaluator.combined(llmEval, heuristic);

// Custom evaluator
BranchEvaluator custom = (node, goal) -> {
    return node.getThought().contains("therefore") ? 0.8 : 0.4;
};
```

Graph of Thoughts (GoT)
Extension of ToT that allows merging and refining thought branches. Better for problems where partial solutions can be combined. Based on "Graph of Thoughts" (Besta et al., 2023).
GraphOfThoughtsExecutor
The executor drives the graph exploration, applying generate, aggregate, and refine operations to build a directed graph of thoughts rather than a strict tree.
```java
GraphOfThoughtsExecutor got = GraphOfThoughtsExecutor.builder()
    .llm(client)
    .evaluator(BranchEvaluator.llm(client))
    .operations(List.of(
        GoTOperation.GENERATE,
        GoTOperation.AGGREGATE,
        GoTOperation.REFINE))
    .maxNodes(20)
    .branchingFactor(3)
    .timeout(Duration.ofMinutes(10))
    .build();

GoTResult result = got.explore("Design a database schema for e-commerce");
System.out.println(result.getBestThought());
System.out.println(result.aggregatedInsight());
System.out.println(result.mergeCount()); // How many merge operations occurred
```

GoTOperation
Operations define what the graph executor does at each step. Unlike ToT which only generates and evaluates, GoT can also merge partial solutions together and iteratively refine them.
| Operation | Description |
|---|---|
| GENERATE | Generate new thoughts from existing ones |
| AGGREGATE | Merge multiple thoughts into a unified solution |
| REFINE | Improve an existing thought iteratively |
| SCORE | Evaluate a thought's quality |
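To see how SCORE and REFINE might interleave, here is a hypothetical control loop, a sketch only, not the executor's actual code. The stub `score` and `refine` functions stand in for the LLM calls TnsAI would make, and `exploreSketch` and its parameters are invented for illustration.

```java
import java.util.Comparator;
import java.util.List;
import java.util.function.ToDoubleFunction;
import java.util.function.UnaryOperator;

// Hypothetical GoT-style SCORE/REFINE loop -- illustrative, not TnsAI code.
public class GoTLoopSketch {
    static String exploreSketch(List<String> candidates,
                                ToDoubleFunction<String> score,
                                UnaryOperator<String> refine,
                                double target, int maxRefines) {
        // SCORE: pick the best candidate (GENERATE already produced the list).
        String best = candidates.stream()
                .max(Comparator.comparingDouble(score))
                .orElseThrow();
        // REFINE: improve it iteratively until it clears the bar or the budget runs out.
        for (int i = 0; i < maxRefines && score.applyAsDouble(best) < target; i++) {
            best = refine.apply(best);
        }
        return best;
    }

    public static void main(String[] args) {
        // Toy scoring: longer thoughts score higher; toy refine: add a detail.
        ToDoubleFunction<String> score = s -> Math.min(1.0, s.length() / 40.0);
        UnaryOperator<String> refine = s -> s + " + indexing";
        System.out.println(exploreSketch(
                List.of("normalize tables", "denormalize for reads"),
                score, refine, 0.8, 5));
    }
}
```

In the real executor the candidate list, scores, and refinements all come from LLM calls, and AGGREGATE can merge several branches before refinement begins.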
GoTNode
Unlike tree nodes, a GoTNode can have multiple parents because merge operations combine separate reasoning branches into one. This makes it a true graph structure rather than a tree.
```java
GoTNode root = GoTNode.root("Design a notification system");
GoTNode emailApproach = root.addChild("n1", "Use email queues", GoTOperation.GENERATE);
GoTNode pushApproach = root.addChild("n2", "Use push notifications", GoTOperation.GENERATE);

// Merge two approaches into one
GoTNode merged = GoTNode.merge("m1",
    "Hybrid: email for async, push for real-time",
    List.of(emailApproach, pushApproach));

merged.isMergeNode();       // true
merged.getParents().size(); // 2
```

GoTResult
The result of a graph exploration, including the best thought found, aggregate insights from merging, and statistics about the exploration process.
GoTResult is a record with these fields:
```java
GoTResult result = got.explore("...");
result.getBestThought();    // Content of the highest-scored node
result.getBestScore();      // Score of best node
result.totalNodes();        // Total nodes in graph
result.mergeCount();        // Number of AGGREGATE operations
result.hasSolution();       // true if bestScore > 0.5
result.aggregatedInsight(); // Synthesized insight from best nodes
result.duration();          // Exploration time
```

CausalReasoner
Causal reasoning engine for why, what-if, and intervention queries. Uses an LLM with a causal model description to analyze cause-effect relationships.
```java
CausalReasoner reasoner = CausalReasoner.builder()
    .llm(client)
    .causalModel("Sales depend on marketing spend, season, and competitor pricing")
    .build();

Map<String, Object> context = Map.of(
    "q3_sales", 150000,
    "marketing_budget", 50000,
    "season", "summer"
);

// Why did something happen?
CausalResult why = reasoner.why("Why did sales drop in Q3?", context);
System.out.println(why.answer());
System.out.println(why.reasoning());
System.out.println(why.confidence()); // 0.0-1.0

// What-if counterfactual
CausalResult whatIf = reasoner.whatIf("What if we doubled marketing?", context);

// Intervention prediction
CausalResult intervene = reasoner.intervene("Cut prices by 15%", context);
```

CausalQueryType
Three types of causal queries are supported, each answering a different kind of question about cause and effect.
| Type | Method | Purpose |
|---|---|---|
| WHY | reasoner.why(...) | Explain why something happened |
| WHAT_IF | reasoner.whatIf(...) | Counterfactual reasoning |
| INTERVENTION | reasoner.intervene(...) | Predict effect of an action |
CausalResult
The result of a causal query, containing the explanation or prediction along with a confidence score indicating how certain the model is.
| Field | Type | Description |
|---|---|---|
| queryType | CausalQueryType | Which type of query was made |
| answer | String | The causal explanation or prediction |
| reasoning | String | Step-by-step reasoning chain |
| confidence | double | Confidence score (0.0-1.0) |
SelfConsistencyExecutor
Generates multiple reasoning paths and returns the consensus answer via majority voting or other aggregation. Based on "Self-Consistency Improves Chain of Thought Reasoning" (Wang et al., 2022).
```java
SelfConsistencyExecutor executor = SelfConsistencyExecutor.builder()
    .llm(client)
    .numPaths(5)
    .aggregation(Aggregation.MAJORITY_VOTE)
    .baseTemperature(0.7)
    .temperatureVariance(0.1)
    .parallel(true)
    .maxConcurrency(5)
    .timeout(Duration.ofMinutes(5))
    .systemPrompt("You are a math tutor.")
    .build();

ConsistencyResult result = executor.reason("What is 17 * 23?");
System.out.println(result.getConsensusAnswer()); // "391"
System.out.println(result.getConfidence());      // 0.8 (4/5 paths agreed)
System.out.println(result.isUnanimous());        // false
System.out.println(result.getAnswerCounts());    // {"391"=4, "392"=1}
```

Builder Parameters
Configure how many reasoning paths to generate and how to combine them into a final answer.
| Parameter | Default | Description |
|---|---|---|
| llm | required | LLMClient for reasoning |
| numPaths | 5 | Number of reasoning paths to generate |
| aggregation | MAJORITY_VOTE | How to combine answers |
| baseTemperature | 0.7 | Base LLM temperature |
| temperatureVariance | 0.1 | Variance between path temperatures |
| systemPrompt | none | Optional system prompt |
| parallel | true | Execute paths in parallel |
| maxConcurrency | 5 | Max parallel threads |
| timeout | 5 min | Timeout for parallel execution |
Aggregation
The aggregation strategy determines how multiple reasoning paths are combined into a single consensus answer. Majority vote is the simplest and most common choice.
| Strategy | Description |
|---|---|
| MAJORITY_VOTE | Most common answer wins (default) |
| WEIGHTED_VOTE | Weight by reasoning chain length |
| UNANIMOUS | Require all paths to agree |
| THRESHOLD | First answer appearing in > 50% of paths |
| LLM_SYNTHESIS | Use LLM to synthesize best answer from all paths |
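As an illustration of the default strategy, the following self-contained sketch (not TnsAI's internal code) tallies the final answer from each path and reports the winner plus an agreement ratio, mirroring what getConsensusAnswer() and getConfidence() return.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative MAJORITY_VOTE aggregation -- not TnsAI's internal code.
// Each path's final answer is tallied; the most frequent answer wins,
// and confidence is the fraction of paths that agreed with it.
public class MajorityVoteSketch {
    static Map.Entry<String, Double> consensus(List<String> answers) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String a : answers) counts.merge(a, 1, Integer::sum);
        String winner = counts.entrySet().stream()
                .max(Map.Entry.comparingByValue())
                .orElseThrow()
                .getKey();
        double confidence = (double) counts.get(winner) / answers.size();
        return Map.entry(winner, confidence);
    }

    public static void main(String[] args) {
        var result = consensus(List.of("391", "391", "392", "391", "391"));
        System.out.println(result.getKey());   // 391
        System.out.println(result.getValue()); // 0.8 -- 4 of 5 paths agreed
    }
}
```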
ConsistencyResult
The result tells you which answer won the vote, how confident the consensus is, and lets you inspect each individual reasoning path for debugging.
```java
ConsistencyResult result = executor.reason("...");
result.getConsensusAnswer(); // The winning answer
result.getConfidence();      // Ratio of paths that agreed
result.getAllPaths();        // All ReasoningPath objects
result.getAnswerCounts();    // Map<String, Integer> of answer frequencies
result.getTotalPaths();      // Number of paths generated
result.isUnanimous();        // true if confidence >= 1.0
result.isConfident(0.8);     // true if confidence >= threshold
```

Choosing a Strategy
Pick the strategy that matches your problem type and cost budget. Simple factual questions work well with self-consistency, while complex design problems benefit from tree or graph exploration.
| Strategy | Best For | Cost |
|---|---|---|
| Tree of Thoughts | Step-by-step decomposition problems | High (branching factor x depth LLM calls) |
| Graph of Thoughts | Problems where partial solutions combine | Higher (includes merge/refine operations) |
| Self-Consistency | Factual questions with verifiable answers | Medium (N parallel LLM calls) |
| Causal Reasoning | Diagnosing causes and predicting interventions | Low (single LLM call per query) |
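To put rough numbers on the cost column: assuming one generation call per candidate thought and one evaluation call per candidate (a simplification, since real counts depend on pruning and early termination), a back-of-the-envelope estimate under the ToT builder defaults looks like this. `totCalls` is a hypothetical helper for the arithmetic, not part of the TnsAI API.

```java
// Rough LLM call-count estimate under the ToT builder defaults shown above.
// Back-of-the-envelope arithmetic only -- not a TnsAI API.
public class CostSketch {
    // With beam search, each of the `beamWidth` surviving nodes per level
    // spawns `branchingFactor` candidates, each generated and scored once.
    static int totCalls(int beamWidth, int branchingFactor, int maxDepth) {
        int perLevel = beamWidth * branchingFactor; // candidates per level
        return perLevel * maxDepth * 2;             // generate + evaluate each
    }

    public static void main(String[] args) {
        // Defaults: beamWidth=3, branchingFactor=3, maxDepth=5
        System.out.println(totCalls(3, 3, 5)); // 90
        // versus Self-Consistency with numPaths=5: roughly 5 calls,
        // and Causal Reasoning: a single call per query.
    }
}
```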
RAG Strategy SPI
TnsAI.Intelligence provides a pluggable Retrieval-Augmented Generation (RAG) framework with three built-in strategies and a composable pipeline. Package: `com.tnsai.intelligence.rag`.
Audio & Speech
The `WhisperClient` provides speech-to-text capabilities powered by OpenAI's Whisper model. It supports transcription in multiple languages and translation of non-English audio to English.