RAG Strategy SPI
TnsAI.Intelligence provides a pluggable Retrieval-Augmented Generation (RAG) framework with three built-in strategies and a composable pipeline. Package: `com.tnsai.intelligence.rag`.
RAGStrategy Interface
Every retrieval strategy implements this interface. You call `retrieve(query, topK)` with a user query and the number of results you want, and the strategy returns the most relevant documents it can find.

```java
public interface RAGStrategy {
    List<RetrievalResult> retrieve(String query, int topK);
    String name();
}
```

Each strategy returns ranked `RetrievalResult` objects containing the retrieved text, a relevance score, and source metadata.

```java
public record RetrievalResult(
    String content,
    double score,
    Map<String, Object> metadata
) {}
```

VectorRAGStrategy
Dense vector retrieval using embedding similarity. Best for semantic matching where exact keywords may not appear in the source documents.

```java
RAGStrategy vectorRAG = VectorRAGStrategy.builder()
    .embeddingClient(embeddingClient)
    .vectorStore(vectorStore)
    .similarityThreshold(0.7)
    .build();

List<RetrievalResult> results = vectorRAG.retrieve("How do agents communicate?", 5);
```

| Parameter | Default | Description |
|---|---|---|
| `embeddingClient` | required | Client for generating embeddings |
| `vectorStore` | required | Vector database for similarity search |
| `similarityThreshold` | 0.7 | Minimum cosine similarity to include |
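
The `similarityThreshold` parameter is described as a minimum cosine similarity. As a rough illustration of what such a filter means (this is not TnsAI's implementation), the sketch below computes cosine similarity between a query embedding and a few hand-made document vectors and keeps those at or above 0.7. The class `CosineThresholdSketch`, its methods, and the toy vectors are all hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration; not part of the TnsAI API.
class CosineThresholdSketch {

    // Cosine similarity: dot(a, b) / (|a| * |b|). Assumes equal-length, non-zero vectors.
    static double cosine(double[] a, double[] b) {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    // Returns the indices of documents whose similarity to the query meets the threshold.
    static List<Integer> filter(double[] query, double[][] docs, double threshold) {
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < docs.length; i++) {
            if (cosine(query, docs[i]) >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] query = {1.0, 0.0};
        double[][] docs = {
            {1.0, 0.0},  // same direction as the query
            {1.0, 1.0},  // 45 degrees away, similarity about 0.707
            {0.0, 1.0}   // orthogonal, similarity 0
        };
        System.out.println(filter(query, docs, 0.7)); // prints [0, 1]
    }
}
```

Real embeddings have hundreds of dimensions, and the similarity search typically runs inside the vector store, but the threshold plays the same role: anything below it is dropped before ranking.
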
KeywordRAGStrategy
Sparse retrieval using BM25 scoring. Best for queries with specific technical terms, identifiers, or exact phrases.
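
For readers unfamiliar with BM25, the following is a minimal sketch of the scoring function over a tiny, made-up corpus. It is not TnsAI's index implementation: the class `Bm25Sketch`, the whitespace tokenizer, and the constants `K1 = 1.2` and `B = 0.75` (common defaults in the literature) are assumptions chosen for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of BM25 scoring; not TnsAI's index implementation.
class Bm25Sketch {
    static final double K1 = 1.2;  // term-frequency saturation (common default)
    static final double B = 0.75;  // document-length normalization (common default)

    // Scores one tokenized document against a tokenized query, given the whole corpus.
    static double score(List<String> query, List<String> doc, List<List<String>> corpus) {
        double avgLen = corpus.stream().mapToInt(List::size).average().orElse(1.0);
        double total = 0.0;
        for (String term : query) {
            long tf = doc.stream().filter(term::equals).count();
            if (tf == 0) {
                continue; // terms absent from the document contribute nothing
            }
            long docsWithTerm = corpus.stream().filter(d -> d.contains(term)).count();
            // IDF variant that stays non-negative: ln(1 + (N - n + 0.5) / (n + 0.5)).
            double idf = Math.log(1.0 + (corpus.size() - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
            double norm = tf + K1 * (1.0 - B + B * doc.size() / avgLen);
            total += idf * tf * (K1 + 1.0) / norm;
        }
        return total;
    }

    // Naive tokenizer: lowercase, split on whitespace.
    static List<String> tokens(String text) {
        return Arrays.asList(text.toLowerCase().split("\\s+"));
    }

    public static void main(String[] args) {
        List<List<String>> corpus = List.of(
            tokens("ContextCompactor interface reference and usage notes"),
            tokens("agents exchange messages over channels"),
            tokens("the interface for vector stores"));
        List<String> query = tokens("ContextCompactor interface");
        for (List<String> doc : corpus) {
            System.out.println(score(query, doc, corpus) + "  " + doc);
        }
    }
}
```

The first document matches both query terms, including the rare identifier, so it scores highest; the second matches nothing and scores zero. This is why sparse retrieval suits identifiers and exact phrases. In TnsAI, the strategy is built from an index:
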

```java
RAGStrategy keywordRAG = KeywordRAGStrategy.builder()
    .index(bm25Index)
    .build();

List<RetrievalResult> results = keywordRAG.retrieve("ContextCompactor interface", 5);
```

HybridRAGStrategy
Combines the vector and keyword strategies, merging their ranked result lists with Reciprocal Rank Fusion (RRF). A document that ranks well in either list can surface in the merged results, so the strategy covers both semantic and lexical matching.

```java
RAGStrategy hybridRAG = HybridRAGStrategy.builder()
    .vectorStrategy(vectorRAG)
    .keywordStrategy(keywordRAG)
    .vectorWeight(0.6)
    .keywordWeight(0.4)
    .fusionK(60) // RRF constant
    .build();

List<RetrievalResult> results = hybridRAG.retrieve("agent memory persistence", 10);
```

| Parameter | Default | Description |
|---|---|---|
| `vectorStrategy` | required | Dense retrieval strategy |
| `keywordStrategy` | required | Sparse retrieval strategy |
| `vectorWeight` | 0.6 | Weight for vector results in fusion |
| `keywordWeight` | 0.4 | Weight for keyword results in fusion |
| `fusionK` | 60 | RRF smoothing constant (higher = more equal weighting) |
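
To make the fusion step concrete, here is a minimal sketch of weighted Reciprocal Rank Fusion. Standard RRF gives each document a score of `1 / (k + rank)` per list and sums across lists; the sketch assumes the weights simply scale each list's contribution, which may differ from how TnsAI applies `vectorWeight` and `keywordWeight`. The class `WeightedRrfSketch` and the document ids are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of weighted Reciprocal Rank Fusion; not TnsAI's implementation.
class WeightedRrfSketch {

    // Adds weight / (k + rank) to each document's fused score, with ranks starting at 1.
    static void accumulate(Map<String, Double> scores, List<String> ranked, double weight, int k) {
        for (int i = 0; i < ranked.size(); i++) {
            scores.merge(ranked.get(i), weight / (k + i + 1), Double::sum);
        }
    }

    // Fuses two ranked lists of document ids into one list, best fused score first.
    static List<String> fuse(List<String> vectorRanked, List<String> keywordRanked,
                             double vectorWeight, double keywordWeight, int k) {
        Map<String, Double> scores = new HashMap<>();
        accumulate(scores, vectorRanked, vectorWeight, k);
        accumulate(scores, keywordRanked, keywordWeight, k);
        List<String> fused = new ArrayList<>(scores.keySet());
        fused.sort((a, b) -> Double.compare(scores.get(b), scores.get(a)));
        return fused;
    }

    public static void main(String[] args) {
        List<String> vectorRanked = List.of("docA", "docB", "docC");
        List<String> keywordRanked = List.of("docB", "docD", "docA");
        // prints [docB, docA, docC, docD]
        System.out.println(fuse(vectorRanked, keywordRanked, 0.6, 0.4, 60));
    }
}
```

`docB` edges out `docA` because it sits near the top of both lists, even though `docA` is first in the more heavily weighted vector list. Because only ranks enter the formula, the raw scores of the two strategies never need to be on the same scale.
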
RAGPipeline
A pipeline wraps a retrieval strategy with optional query rewriting (to improve recall) and result reranking (to improve precision). This lets you build a complete retrieval system by composing simple, testable components.

```java
// expandAcronyms and crossEncoderRerank are placeholders for your own application code
RAGPipeline pipeline = RAGPipeline.builder()
    .strategy(hybridRAG)
    .queryRewriter(query -> expandAcronyms(query))
    .reranker((results, query) -> crossEncoderRerank(results, query))
    .maxResults(5)
    .build();

List<RetrievalResult> results = pipeline.execute("How does RRF fusion work?");
```

Pipeline Stages
The pipeline processes a query in up to three stages and then truncates the output to the top-K results. Only the retrieval strategy is required; query rewriting and reranking are optional and can be added for better results.

```text
User Query
    |
    v
Query Rewriter (optional)   -- expand, rephrase, or decompose the query
    |
    v
RAGStrategy.retrieve()      -- fetch candidates from one or more sources
    |
    v
Reranker (optional)         -- re-score and re-order results
    |
    v
Top-K Selection             -- return final results
```

| Stage | Interface | Description |
|---|---|---|
| Query Rewriter | `Function<String, String>` | Transform the query before retrieval |
| Strategy | `RAGStrategy` | Core retrieval (vector, keyword, or hybrid) |
| Reranker | `BiFunction<List<RetrievalResult>, String, List<RetrievalResult>>` | Re-score results using a cross-encoder or other model |
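
The stage order above can be sketched in plain Java using the same functional interfaces as the table. This is an illustration of the flow, not TnsAI's `RAGPipeline`: the class `PipelineSketch`, the toy `SubstringStrategy`, the `candidateCount` parameter, and the choice to hand the rewritten query to the reranker are all assumptions. `RAGStrategy` and `RetrievalResult` are copied from this page so the block compiles on its own.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

// Types mirrored from this page so the sketch is self-contained.
record RetrievalResult(String content, double score, Map<String, Object> metadata) {}

interface RAGStrategy {
    List<RetrievalResult> retrieve(String query, int topK);
    String name();
}

// Toy strategy (hypothetical): returns documents that contain the query text.
class SubstringStrategy implements RAGStrategy {
    private final List<String> docs;

    SubstringStrategy(List<String> docs) {
        this.docs = docs;
    }

    public List<RetrievalResult> retrieve(String query, int topK) {
        return docs.stream()
            .filter(d -> d.toLowerCase().contains(query.toLowerCase()))
            .map(d -> new RetrievalResult(d, 1.0, Map.of("source", "memory")))
            .limit(topK)
            .toList();
    }

    public String name() {
        return "substring";
    }
}

// Hypothetical illustration of the stage order; not TnsAI's RAGPipeline.
class PipelineSketch {

    static List<RetrievalResult> run(
            RAGStrategy strategy,
            Function<String, String> queryRewriter, // null means the stage is skipped
            BiFunction<List<RetrievalResult>, String, List<RetrievalResult>> reranker, // null means skipped
            int candidateCount, // assumption: how many candidates to fetch before reranking
            int maxResults,
            String query) {
        String effectiveQuery = queryRewriter == null ? query : queryRewriter.apply(query);
        List<RetrievalResult> candidates = strategy.retrieve(effectiveQuery, candidateCount);
        List<RetrievalResult> ordered =
            reranker == null ? candidates : reranker.apply(candidates, effectiveQuery);
        return ordered.stream().limit(maxResults).toList(); // top-K selection
    }

    public static void main(String[] args) {
        RAGStrategy strategy = new SubstringStrategy(List.of(
            "Reciprocal Rank Fusion merges ranked lists",
            "BM25 is a sparse scoring function",
            "Reciprocal Rank Fusion uses a smoothing constant"));
        List<RetrievalResult> results = run(
            strategy,
            q -> q.replace("RRF", "Reciprocal Rank Fusion"), // rewriter: expand an acronym
            (rs, q) -> rs.stream()                            // reranker: shorter passages first
                .sorted(Comparator.comparingInt((RetrievalResult r) -> r.content().length()))
                .toList(),
            10, 1, "RRF");
        // prints "Reciprocal Rank Fusion merges ranked lists"
        results.forEach(r -> System.out.println(r.content()));
    }
}
```

Without the rewriter the query "RRF" matches nothing in this toy corpus, which shows how rewriting can improve recall; the reranker then decides which of the two matching passages survives the cut to a single result.
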
Integration with Agents
Once you have a RAG pipeline, you can wire it directly into an agent. The agent will automatically retrieve relevant documents before generating each response, so it can answer questions grounded in your data.
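
How the agent folds retrieved documents into a prompt is not described on this page. Purely as a conceptual sketch of what "grounding" a response usually involves, the block below numbers the retrieved passages and places them ahead of the user's question. The class `GroundedPromptSketch`, the prompt wording, and the placeholder passages are all hypothetical; `RetrievalResult` is copied from this page so the block compiles on its own.

```java
import java.util.List;
import java.util.Map;

// Mirrored from this page so the sketch is self-contained.
record RetrievalResult(String content, double score, Map<String, Object> metadata) {}

// Hypothetical illustration; TnsAI's actual prompt assembly is not documented here.
class GroundedPromptSketch {

    // Numbers each retrieved passage and places the block ahead of the user's question.
    static String buildPrompt(List<RetrievalResult> results, String question) {
        StringBuilder prompt = new StringBuilder("Answer using only the context below.\n\n");
        for (int i = 0; i < results.size(); i++) {
            prompt.append('[').append(i + 1).append("] ")
                  .append(results.get(i).content()).append('\n');
        }
        return prompt.append("\nQuestion: ").append(question).toString();
    }

    public static void main(String[] args) {
        // Placeholder passages standing in for real retrieval output.
        List<RetrievalResult> results = List.of(
            new RetrievalResult("First retrieved passage.", 0.91, Map.of()),
            new RetrievalResult("Second retrieved passage.", 0.84, Map.of()));
        System.out.println(buildPrompt(results, "Explain the memory architecture"));
    }
}
```

With TnsAI none of this is written by hand; the pipeline is attached to the agent through the builder:
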

```java
Agent agent = AgentBuilder.create()
    .model("claude-sonnet-4")
    .ragPipeline(pipeline)
    .build();

// The agent automatically retrieves relevant context before generating responses
String response = agent.chat("Explain the memory architecture");
```

Choosing a Strategy
Pick the strategy that matches your data and query patterns. Hybrid is usually the safest default for production systems because it combines semantic understanding with exact term matching, at the cost of some extra latency.
| Strategy | Strengths | Weaknesses | Best For |
|---|---|---|---|
| Vector | Semantic understanding, handles paraphrasing | Misses exact terms, requires embeddings | Natural language queries |
| Keyword | Fast, exact term matching, no embeddings needed | No semantic understanding | Technical docs, code search |
| Hybrid | Best overall recall, handles both semantic and lexical | Higher latency (two retrievals + fusion) | Production RAG systems |