
RAG Pipeline

The server provides a per-session Retrieval-Augmented Generation pipeline that indexes local codebases, chunks source files by language boundaries, and retrieves relevant context using hybrid BM25 + vector search with Reciprocal Rank Fusion.

Architecture Overview

The RAG pipeline has three stages: indexing (scanning files and splitting them into chunks), storage (keeping chunks in an in-memory knowledge base with BM25 and vector indexes), and retrieval (finding the most relevant chunks for a user's query using hybrid search).

Directory  -->  FileIndexer  -->  CodeChunker  -->  KnowledgeBase (in-memory)
                                                          |
User Query  -->  HybridRetriever  -->  [BM25Stream 60%]  -+  RRF  -->  Results
                                  -->  [VectorStream 40%] -+

Each session gets its own RagService, lazily created by SessionManager.getRag(sessionId). The service is thread-safe: indexing is serialized via a ReentrantLock, while reads (search) run concurrently.
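A minimal sketch of that lazy per-session lookup, assuming a ConcurrentHashMap.computeIfAbsent underneath. SessionManagerSketch and RagServiceStub are illustrative stand-ins, not the real SessionManager and RagService classes:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-in for the real RagService.
class RagServiceStub {
    final String sessionId;
    RagServiceStub(String sessionId) { this.sessionId = sessionId; }
}

class SessionManagerSketch {
    private final Map<String, RagServiceStub> services = new ConcurrentHashMap<>();

    // computeIfAbsent gives atomic lazy creation: each session id maps to
    // exactly one service instance, created on first access.
    RagServiceStub getRag(String sessionId) {
        return services.computeIfAbsent(sessionId, RagServiceStub::new);
    }
}
```

Repeated calls with the same session id return the same instance, so per-session state (indexes, hashes) survives across requests.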

RagService

The central orchestrator for a session's RAG pipeline.

RagService rag = sessionManager.getRag("my-session");

// Index a directory
rag.indexDirectory(Path.of("/project/src"), progress -> {
    System.out.printf("Indexed %d/%d: %s%n",
        progress.indexedFiles(), progress.totalFiles(), progress.currentFile());
});

// Search
List<SearchResult> results = rag.search("authentication middleware", 5);

// Build augmented prompt (auto-prepends context)
String prompt = rag.buildContextPrompt("How does auth work?", 5);

// Document management
String docId = rag.addDocument("Custom knowledge...", Map.of("source", "manual"));
rag.removeDocument(docId);
List<RagService.DocumentInfo> docs = rag.listDocuments();

The hybrid retriever is configured at construction with BM25 at 60% weight and the vector knowledge base at 40%:

this.hybridRetriever = HybridRetriever.builder()
    .stream(bm25Stream, 0.6)
    .stream(new KnowledgeBaseStream(knowledgeBase), 0.4)
    .build();

FileIndexer

The FileIndexer recursively walks a directory, identifies source files by extension, splits them into chunks using CodeChunker, and stores the chunks in the knowledge base. It supports incremental indexing so only changed files are re-processed on subsequent runs.

Supported Extensions (28+)

The indexer recognizes 28+ file extensions covering most popular programming languages and configuration formats.

java, ts, tsx, js, jsx, py, md, json, yml, yaml, xml, html, css, sh, sql, go, rs, rb, kt, scala, c, cpp, h -- plus language aliases (kts, bash, zsh, markdown, htm, cc, cxx, hpp, sc).

Filtering

The indexer automatically skips build artifacts, dependency directories, and files matching your .gitignore patterns to avoid polluting the knowledge base with irrelevant content.

  • Skipped directories: .git, node_modules, build, dist, target, .idea, .vscode, .gradle, __pycache__, vendor, .next, out, coverage, .svn, .hg
  • Ignore files: Reads .gitignore and .tnsignore from the root, converting glob patterns to Java PathMatcher instances
  • Size limit: Files larger than 512 KB or empty files are skipped
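The glob-to-PathMatcher conversion can be sketched with java.nio's built-in glob support. fromGitignoreLine is a hypothetical helper, and real .gitignore semantics (negation, anchoring, trailing slashes) are richer than this:

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;

// Illustrative only: turning a .gitignore-style glob line into a
// java.nio PathMatcher, as the indexer is described to do.
class GlobFilter {
    static PathMatcher fromGitignoreLine(String line) {
        // "glob:" syntax supports ** for crossing directory boundaries
        return FileSystems.getDefault().getPathMatcher("glob:" + line);
    }
}
```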

Incremental Indexing

To avoid re-processing unchanged files, the indexer computes a SHA-256 hash of each file's content and stores it in a ConcurrentHashMap<String, String>. On re-index:

  1. If the hash matches the previous run, the file is skipped
  2. If the file changed, old chunks are removed from both KnowledgeBase and BM25Stream
  3. New chunks are generated and added

Call fileIndexer.clearHashes() to force a full re-index.
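The hash-and-compare step might look like the following sketch; IncrementalCheck and needsReindex are illustrative names, not the real FileIndexer API:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class IncrementalCheck {
    private final Map<String, String> hashes = new ConcurrentHashMap<>();

    static String sha256(String content) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                .digest(content.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);  // SHA-256 is always available
        }
    }

    /** Returns true if the file is new or changed, and records the new hash. */
    boolean needsReindex(String path, String content) {
        String hash = sha256(content);
        return !hash.equals(hashes.put(path, hash));
    }
}
```

Clearing the map (as clearHashes() is described to do) makes every file look new again, forcing a full re-index.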

CodeChunker

The CodeChunker splits source files into semantically meaningful chunks -- for example, by class or function boundaries in Java/TypeScript, or by headings in Markdown. This ensures that search results return coherent, self-contained code blocks rather than arbitrary line ranges.

Chunking Strategies

The chunker picks a strategy based on the file's language. Languages with known structure get smarter splitting; everything else falls back to fixed-size line groups.

Language                Strategy                   Boundary Detection
Java, Kotlin, Scala     Class/method boundaries    Regex: class/interface/enum/record declarations + method signatures
TypeScript, JavaScript  Function/class boundaries  Regex: export/function/class/const arrow declarations
Markdown                Heading boundaries         Regex: #{1,6} heading lines
Everything else         Fixed line groups          Max 100 lines per chunk

Small files (100 lines or fewer) are always kept as a single chunk. Large boundary-detected chunks are sub-split into 100-line groups.
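The fixed-size fallback (and the sub-splitting of oversized boundary-detected chunks) reduces to grouping lines in blocks of at most 100, roughly as sketched here:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the fixed-size fallback: split a file's lines into groups of
// at most 100 lines. Files of 100 lines or fewer yield a single chunk.
class LineChunker {
    static final int MAX_LINES = 100;

    static List<List<String>> chunk(List<String> lines) {
        List<List<String>> chunks = new ArrayList<>();
        for (int start = 0; start < lines.size(); start += MAX_LINES) {
            int end = Math.min(start + MAX_LINES, lines.size());
            chunks.add(lines.subList(start, end));
        }
        return chunks;
    }
}
```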

Each chunk becomes a Document with metadata:

Document.builder()
    .id("src/auth/Middleware.java:15-45")
    .content(chunkContent)
    .metadata("file", "src/auth/Middleware.java")
    .metadata("startLine", 15)
    .metadata("endLine", 45)
    .metadata("language", "java")
    .build();

BM25Stream

The BM25Stream provides keyword-based search using the Okapi BM25 algorithm, which is the same ranking function used by search engines like Elasticsearch. It scores documents based on how well their terms match the query, accounting for term frequency and document length.

Parameters

These BM25 parameters control how the scoring behaves. The defaults work well for code search and rarely need tuning.

Parameter  Value  Description
K1         1.2    Term frequency saturation
B          0.75   Document length normalization
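With those defaults, the per-term Okapi BM25 score can be written as a small function. This is a textbook formulation under assumed names (tf, df, n, dl, avgdl); the actual implementation may differ in details such as IDF smoothing:

```java
// Per-term Okapi BM25 score with k1 = 1.2 and b = 0.75.
// tf: term frequency in the document; df: documents containing the term;
// n: corpus size; dl: document length; avgdl: average document length.
class Bm25 {
    static final double K1 = 1.2, B = 0.75;

    static double termScore(double tf, double df, double n, double dl, double avgdl) {
        double idf = Math.log(1 + (n - df + 0.5) / (df + 0.5));
        double norm = tf * (K1 + 1) / (tf + K1 * (1 - B + B * dl / avgdl));
        return idf * norm;
    }
}
```

Higher term frequency raises the score with diminishing returns (K1 controls the saturation point), while B penalizes long documents toward the corpus average.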

Text Processing Pipeline

Before scoring, queries and documents go through a text processing pipeline that normalizes, tokenizes, and stems terms. This improves recall by matching different forms of the same word.

  1. Tokenization: Lowercase, strip non-alphanumeric (except _), split on whitespace, drop tokens with 1 character or fewer
  2. Stop word removal: 50 common English stop words
  3. Stemming: Suffix-stripping rules for 15 suffixes (-ies, -ing, -tion, -sion, -ment, -ness, -able, -ous, -ful, -less, -ly, -ed, -er, -es, -s)
  4. Synonym expansion (query-time only): 20 coding-domain synonym pairs
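The tokenization step can be sketched as a short stream pipeline (illustrative, not the exact implementation): lowercase, strip non-alphanumerics except underscores, split on whitespace, drop short tokens.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class Tokenizer {
    static List<String> tokenize(String text) {
        return Arrays.stream(text.toLowerCase()
                .replaceAll("[^a-z0-9_\\s]", " ")  // keep letters, digits, _
                .split("\\s+"))
            .filter(t -> t.length() > 1)           // drop 1-char tokens
            .collect(Collectors.toList());
    }
}
```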

Synonym Pairs

At query time, common coding abbreviations are expanded to their full forms (and vice versa) so that searching for "auth" also finds documents containing "authentication".

Term    Synonyms
db      database
auth    authentication, authorization
config  configuration
perf    performance
impl    implementation
req     request
res     response
err     error
msg     message
fn      function
param   parameter
repo    repository
env     environment
async   asynchronous
sync    synchronous

HybridRetriever

The HybridRetriever combines results from multiple search strategies (like BM25 keyword search and vector similarity search) into a single ranked list. This hybrid approach gives better results than either method alone because keyword search finds exact term matches while vector search captures semantic similarity.

Fusion Algorithm

The retriever merges results using Reciprocal Rank Fusion (RRF), which combines rankings without needing normalized scores. For each document appearing in any stream's results:

score(doc) = SUM over streams: weight(stream) / (K + rank(doc, stream) + 1)

Where K = 60 (the RRF constant). Documents are then sorted by fused score.
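A sketch of weighted RRF as defined above, using zero-based ranks so each stream contributes weight / (K + rank + 1) per document. WeightedRanking is an illustrative type, not the real stream interface:

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Rrf {
    static final int K = 60;  // the RRF constant

    record WeightedRanking(List<String> docIds, double weight) {}

    static Map<String, Double> fuse(List<WeightedRanking> streams) {
        Map<String, Double> scores = new HashMap<>();
        for (WeightedRanking s : streams) {
            for (int rank = 0; rank < s.docIds().size(); rank++) {
                // Each stream's contribution decays with the document's rank.
                scores.merge(s.docIds().get(rank), s.weight() / (K + rank + 1), Double::sum);
            }
        }
        return scores;
    }
}
```

Because only ranks (not raw scores) enter the formula, BM25 and vector scores never need to be normalized onto a common scale.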

Diversification

To prevent a single large file from dominating search results, the retriever limits output to a maximum of 3 chunks per source file. This ensures the agent sees context from multiple relevant files.
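The diversification pass is a single ordered walk over the fused ranking, keeping at most 3 chunks per file; ScoredChunk is an illustrative type standing in for the real result class:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class Diversifier {
    record ScoredChunk(String file, String chunkId) {}

    static final int MAX_PER_FILE = 3;

    static List<ScoredChunk> diversify(List<ScoredChunk> ranked) {
        Map<String, Integer> perFile = new HashMap<>();
        List<ScoredChunk> out = new ArrayList<>();
        for (ScoredChunk c : ranked) {
            // Count occurrences per source file; drop everything past the cap.
            int seen = perFile.merge(c.file(), 1, Integer::sum);
            if (seen <= MAX_PER_FILE) out.add(c);
        }
        return out;
    }
}
```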

HybridRetriever retriever = HybridRetriever.builder()
    .stream(bm25Stream, 0.6)       // 60% weight
    .stream(vectorStream, 0.4)      // 40% weight
    .build();

List<SearchResult> results = retriever.retrieve("authentication flow", 10);

Context Prompt Format

When the agent asks a question, RagService.buildContextPrompt searches for relevant code and prepends it to the user's query. This gives the LLM the codebase context it needs to answer accurately.

[Relevant code context]
--- file: src/auth/Middleware.java (lines 15-45) ---
public class AuthMiddleware {
    private final TokenValidator validator;
    ...
}

--- file: src/auth/TokenValidator.java (lines 1-30) ---
public class TokenValidator {
    ...
}

[User question]
How does the authentication middleware work?

If no context is found (empty knowledge base or no matches), the original query is returned unchanged.
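Assembling that format is plain string building; ContextChunk and build are assumptions standing in for the real RagService internals, not its API:

```java
import java.util.List;

class PromptBuilder {
    record ContextChunk(String file, int startLine, int endLine, String content) {}

    static String build(String query, List<ContextChunk> chunks) {
        if (chunks.isEmpty()) return query;  // no context: query passes through unchanged
        StringBuilder sb = new StringBuilder("[Relevant code context]\n");
        for (ContextChunk c : chunks) {
            sb.append("--- file: ").append(c.file())
              .append(" (lines ").append(c.startLine()).append('-').append(c.endLine())
              .append(") ---\n").append(c.content()).append("\n\n");
        }
        return sb.append("[User question]\n").append(query).toString();
    }
}
```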

Document Management API

Beyond automatic directory indexing, you can manually add, list, and remove documents in the knowledge base. This is useful for injecting custom knowledge (like deployment procedures or domain-specific documentation) that is not part of the codebase.

// Add a document with metadata
String docId = rag.addDocument("Custom knowledge content",
    Map.of("source", "user", "topic", "deployment"));

// List documents (returns preview, length, metadata)
List<RagService.DocumentInfo> docs = rag.listDocuments();
// DocumentInfo(id, preview (first 100 chars), contentLength, metadata)

// Get a specific document
Optional<Document> doc = rag.getDocument(docId);

// Remove
boolean removed = rag.removeDocument(docId);

// Clear everything
rag.clear();

Documents added via addDocument are tracked separately and appear in listDocuments(). Both manually added documents and file-indexed chunks are searchable through the same hybrid retriever.
