## Streaming
TnsAI supports three streaming modes for real-time token delivery from LLM providers.
### Token Streaming
Returns text tokens as they are generated — simplest mode:
```java
Stream<String> tokens = agent.streamChat("Explain relativity");
tokens.forEach(System.out::print);
```

### ChatChunk Streaming
Returns typed chunks with metadata (token counts, finish reason, tool calls):
```java
llmClient.streamChatWithSpec(request).forEach(chunk -> {
    switch (chunk.getType()) {
        case START -> System.out.println("Stream started: " + chunk.getModel());
        case CONTENT -> System.out.print(chunk.getContent());
        case TOOL_CALL -> handleToolCall(chunk.getToolCall().orElseThrow());
        case DONE -> System.out.println("\nTokens: " + chunk.getTokenCount());
        case ERROR -> System.err.println("Error: " + chunk.getContent());
    }
});
```

### Chunk Types
Each ChatChunk has a type that tells you what kind of data it carries. Your code should handle each type to respond appropriately as the stream progresses.
| Type | Description |
|---|---|
| START | Stream initialization with model info |
| CONTENT | Text content delta |
| TOOL_CALL | Tool/function invocation request |
| DONE | Stream complete with finish reason |
| ERROR | Error occurred during streaming |
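The per-type dispatch can be exercised without a live provider. The sketch below is a standalone simulation; the enum, record, and `route` helper are illustrative stand-ins for TnsAI's `ChatChunk` types, not the real API:

```java
import java.util.List;

public class ChunkDemo {
    // Illustrative stand-ins for TnsAI's ChatChunk types (names assumed).
    enum ChunkType { START, CONTENT, TOOL_CALL, DONE, ERROR }
    record Chunk(ChunkType type, String payload) {}

    // Routes a chunk to a short description, mirroring the switch in the docs.
    static String route(Chunk c) {
        return switch (c.type()) {
            case START -> "started: " + c.payload();
            case CONTENT -> c.payload();
            case TOOL_CALL -> "tool requested: " + c.payload();
            case DONE -> "finished after " + c.payload() + " tokens";
            case ERROR -> "error: " + c.payload();
        };
    }

    public static void main(String[] args) {
        // A simulated stream: start, one content delta, then done.
        List<Chunk> stream = List.of(
            new Chunk(ChunkType.START, "some-model"),
            new Chunk(ChunkType.CONTENT, "Hello"),
            new Chunk(ChunkType.DONE, "12"));
        stream.forEach(c -> System.out.println(route(c)));
    }
}
```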
### Finish Reasons
When a stream ends, the DONE chunk includes a finish reason that explains why the LLM stopped generating. This helps you decide what to do next; for example, if the reason is TOOL_CALLS, you need to execute the requested tool and feed the result back.
| Reason | Description |
|---|---|
| STOP | Natural completion |
| LENGTH | Max tokens reached |
| TOOL_CALLS | LLM wants to call tools |
| CONTENT_FILTER | Content was filtered |
| ERROR | Error during generation |
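The execute-and-feed-back loop described above can be sketched end to end. Everything below is a self-contained simulation (the tool table, the fake LLM, and the protocol strings are all assumptions, not TnsAI code); only the loop shape matters:

```java
import java.util.Map;
import java.util.function.Function;

public class ToolLoopDemo {
    enum FinishReason { STOP, LENGTH, TOOL_CALLS, CONTENT_FILTER, ERROR }
    record Turn(FinishReason reason, String content) {}

    // Hypothetical tool table: maps a tool name to its implementation.
    static final Map<String, Function<String, String>> TOOLS =
        Map.of("weather", city -> "22C in " + city);

    // Simulated "LLM": requests the weather tool once, then stops.
    static Turn fakeLlm(String input) {
        return input.startsWith("TOOL_RESULT")
            ? new Turn(FinishReason.STOP, "It is " + input.substring(12))
            : new Turn(FinishReason.TOOL_CALLS, "weather:Paris");
    }

    // The loop the prose describes: run tools and feed results back until STOP.
    static String chat(String prompt) {
        Turn turn = fakeLlm(prompt);
        while (turn.reason() == FinishReason.TOOL_CALLS) {
            String[] call = turn.content().split(":", 2);
            String result = TOOLS.get(call[0]).apply(call[1]);
            turn = fakeLlm("TOOL_RESULT " + result);
        }
        return turn.content();
    }

    public static void main(String[] args) {
        System.out.println(chat("What's the weather in Paris?"));
        // prints: It is 22C in Paris
    }
}
```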
### Handler-Based Streaming
Callback pattern with full tool-call loop — ideal for UI integration:
```java
llmClient.streamChatWithHandler(request, chunk -> {
    if (chunk.isContent()) {
        System.out.print(chunk.getContent());
    } else if (chunk.isToolCall()) {
        // Framework handles tool execution automatically
    } else if (chunk.isDone()) {
        System.out.println("\nFinish reason: " + chunk.getFinishReason());
    }
});
```

### Convenience Methods
ChatChunk provides static factory methods so you can create chunks without calling constructors directly. These are useful when you build custom streaming pipelines or write tests that simulate LLM output.
```java
// ChatChunk factory methods
ChatChunk.start(model, requestId);
ChatChunk.content("Hello", tokenCount, index);
ChatChunk.content("Hello");
ChatChunk.toolCall(toolCallObject);
ChatChunk.done(FinishReason.STOP, totalTokens);
ChatChunk.error("Something went wrong");
```

### Which Mode to Use?
TnsAI offers three streaming modes at different levels of abstraction. Pick the simplest one that meets your needs.
| Mode | Use When |
|---|---|
| Token Stream | Simple text display, CLI output |
| ChatChunk Stream | Need metadata (tokens, model), manual tool handling |
| Handler-Based | UI integration, automatic tool execution loop |
## Async Execution
The AsyncAgent interface (com.tnsai.agents.async.AsyncAgent) provides non-blocking chat operations with multiple consumption patterns.
### Methods
AsyncAgent exposes several ways to consume responses. Choose based on whether you need simple text, typed events, or reactive backpressure control.
| Method | Return Type | Description |
|---|---|---|
| chatAsync(message) | CompletableFuture<String> | Async chat, completes with full response |
| chatAsync(message, options) | CompletableFuture<String> | Async chat with ChatOptions |
| chatStream(message) | Stream<String> | Streaming tokens as a Java Stream |
| chatEventStream(message) | Stream<ChatEvent> | Typed event stream (tokens, tool calls, etc.) |
| chatPublisher(message) | Flow.Publisher<ChatEvent> | Reactive Streams publisher for backpressure-aware consumers |
| cancel() | void | Cancels any ongoing async operation |
| isProcessing() | boolean | True if an async operation is in progress |
| getProgress() | double | Execution progress (0.0 - 1.0) |
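The `getProgress()` accessor is not demonstrated in the sections that follow, so here is a minimal standalone sketch of how progress reporting of this shape can be wired up. The token counter and expected total are assumptions for illustration, not part of the AsyncAgent interface:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicLong;

public class ProgressDemo {
    // Stand-in for AsyncAgent's progress state (this wiring is assumed).
    static final AtomicLong generated = new AtomicLong();
    static final long expectedTokens = 100;

    // Progress as a fraction of tokens generated so far, clamped to 1.0.
    static double getProgress() {
        return Math.min(1.0, generated.get() / (double) expectedTokens);
    }

    public static void main(String[] args) {
        // Simulated generation work bumps the counter as "tokens" arrive.
        CompletableFuture<Void> work = CompletableFuture.runAsync(() -> {
            for (int i = 0; i < expectedTokens; i++) generated.incrementAndGet();
        });
        work.join();
        System.out.printf("progress: %.1f%n", getProgress()); // 1.0 when done
    }
}
```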
### CompletableFuture
The simplest async pattern. chatAsync returns a CompletableFuture that completes with the full response string once the LLM finishes generating. Use this when you do not need to show partial results to the user.
```java
AsyncAgent agent = new MyAsyncAgent();
agent.chatAsync("Tell me about Java")
    .thenAccept(System.out::println)
    .exceptionally(e -> { e.printStackTrace(); return null; });
```

### Token Stream
Returns a Stream<String> that emits each text token as it arrives. This lets you print tokens to the console (or a UI) incrementally instead of waiting for the full response.
```java
agent.chatStream("Tell me a story")
    .forEach(System.out::print);
```

### Typed Event Stream
ChatEvent subtypes distinguish tokens from tool calls and other events:
```java
agent.chatEventStream("Complex task")
    .forEach(event -> {
        if (event instanceof ChatEvent.Token t) {
            System.out.print(t.content());
        } else if (event instanceof ChatEvent.ToolCall tc) {
            System.out.println("Calling tool: " + tc.toolName());
        }
    });
```

### Reactive Publisher
For backpressure-aware consumers using java.util.concurrent.Flow:
```java
agent.chatPublisher("Generate a report")
    .subscribe(new Flow.Subscriber<>() {
        private Flow.Subscription subscription;

        @Override
        public void onSubscribe(Flow.Subscription s) {
            this.subscription = s;
            s.request(1); // request the first event
        }

        @Override
        public void onNext(ChatEvent event) {
            process(event);
            subscription.request(1); // pull the next event when ready
        }

        @Override
        public void onError(Throwable t) { t.printStackTrace(); }

        @Override
        public void onComplete() { System.out.println("Done"); }
    });
```

### Cancellation
You can cancel a running async operation at any time. This is useful for timeout handling or when the user navigates away from a page before the response finishes.
```java
CompletableFuture<String> future = agent.chatAsync("Long running task");
// Cancel if still running
if (agent.isProcessing()) {
    agent.cancel();
}
```

## Security & Approvals
TnsAI provides a layered security model for agent actions:

- Approval tokens for human-in-the-loop gating
- AG-UI interrupts that block agent execution until a user responds
- Input/output guardrails for validation and sanitization
- A declarative `@Security` annotation that combines audit, access control, and encryption policies in one place
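As a rough illustration of the approval-token idea (all names below are hypothetical; this is not the TnsAI API), a gate can hold an action back until a human has approved its token:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class ApprovalGateDemo {
    // Illustrative approval-token gate; names and semantics are assumed.
    static final Set<String> approvedTokens = ConcurrentHashMap.newKeySet();

    // A human reviewer grants a one-time approval for a pending action.
    static void approve(String token) { approvedTokens.add(token); }

    // The action runs only if its token was approved; approval is consumed.
    static String runGated(String token, Runnable action) {
        if (!approvedTokens.remove(token)) return "blocked: awaiting approval";
        action.run();
        return "executed";
    }

    public static void main(String[] args) {
        System.out.println(runGated("tok-1", () -> {})); // blocked: awaiting approval
        approve("tok-1");
        System.out.println(runGated("tok-1", () -> {})); // executed
    }
}
```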
## Tool Integration
A `Tool` is an external capability that an agent can invoke during a conversation. Tools are registered via `AgentBuilder.tool()` and exposed to the LLM as function definitions. The LLM decides when and how to call them.
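A minimal sketch of the registration idea, assuming nothing from TnsAI: a builder-style registry maps tool names to functions, which is roughly the role `AgentBuilder.tool()` plays before the definitions are handed to the LLM. The class and method names here are illustrative only:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ToolRegistryDemo {
    // Registered tools by name; the LLM would pick one by its function name.
    final Map<String, Function<String, String>> tools = new LinkedHashMap<>();

    // Builder-style registration, echoing the chained AgentBuilder.tool() style.
    ToolRegistryDemo tool(String name, Function<String, String> fn) {
        tools.put(name, fn);
        return this;
    }

    // Invokes a tool by name; real frameworks would route the LLM's
    // function-call request here and feed the result back into the chat.
    String invoke(String name, String arg) {
        return tools.getOrDefault(name, a -> "unknown tool").apply(arg);
    }

    public static void main(String[] args) {
        ToolRegistryDemo agent = new ToolRegistryDemo()
            .tool("upper", String::toUpperCase)
            .tool("reverse", s -> new StringBuilder(s).reverse().toString());
        System.out.println(agent.invoke("upper", "hello")); // HELLO
    }
}
```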