Learning and Refinement
Feedback-driven learning, normative constraint enforcement, iterative refinement loops, prompt optimization, and structured output validation. These components enable agents to improve over time and produce higher-quality outputs.
Feedback
Represents user or system feedback on agent output. Four factory methods for common types:
Feedback positive = Feedback.thumbsUp("Great explanation!");
Feedback negative = Feedback.thumbsDown("Too verbose, needs to be concise");
Feedback fix = Feedback.correction("Use formal tone, not casual");
Feedback pref = Feedback.preference("Always include code examples");
// Attach to a session
Feedback withSession = positive.withSessionId("session-123");
feedback.type(); // FeedbackType.POSITIVE
feedback.content(); // "Great explanation!"
feedback.timestamp(); // Instant
feedback.id(); // Auto-generated UUID
feedback.metadata(); // Map<String, Object>
FeedbackType
Four feedback types cover the most common ways users and systems respond to agent output.
| Type | Factory | Description |
|---|---|---|
| POSITIVE | thumbsUp(comment) | Good output -- reinforce this behavior |
| NEGATIVE | thumbsDown(comment) | Bad output -- avoid this in future |
| CORRECTION | correction(text) | Specific fix to apply |
| PREFERENCE | preference(text) | User style/tone preference |
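The table above suggests a natural routing: each feedback type feeds a different learning bucket. A minimal sketch of that routing, using a local stand-in enum (not the library's `FeedbackType`) so it runs on its own:

```java
// Local stand-in mirroring the four FeedbackType values, for illustration only.
enum FeedbackType { POSITIVE, NEGATIVE, CORRECTION, PREFERENCE }

public class FeedbackRouter {
    // Decide which learning bucket a feedback item feeds into.
    static String bucketFor(FeedbackType type) {
        return switch (type) {
            case POSITIVE -> "examples";                 // keep as few-shot example
            case NEGATIVE, CORRECTION -> "prompt-rules"; // fold into system prompt rules
            case PREFERENCE -> "user-profile";           // record in the preference profile
        };
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(FeedbackType.CORRECTION)); // prompt-rules
    }
}
```

This mirrors how the learning strategies below consume feedback, but the bucket names are illustrative.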
FeedbackLearner
Analyzes collected feedback and produces actionable learnings: prompt adjustments, user preferences, and good examples for few-shot prompting.
FeedbackLearner learner = FeedbackLearner.builder()
.llm(client)
.feedbackStore(FeedbackStore.inMemory())
.strategies(List.of(
LearningStrategy.PROMPT_ADJUSTMENT,
LearningStrategy.PREFERENCE_LEARNING,
LearningStrategy.EXAMPLE_COLLECTION))
.minFeedbackForLearning(3)
.build();
learner.recordFeedback(Feedback.correction("Use formal tone"));
learner.recordFeedback(Feedback.thumbsDown("Response was too long"));
learner.recordFeedback(Feedback.preference("Include code examples"));
learner.recordFeedback(Feedback.thumbsUp("Perfect level of detail"));
FeedbackLearner.LearningResult result = learner.learn();
result.promptAdjustments(); // ["Use formal tone", "Keep responses concise"]
result.preferences(); // ["User prefers code examples"]
result.goodExamples(); // ["Perfect level of detail"]
result.feedbackAnalyzed(); // 4
result.hasLearnings(); // true
LearningStrategy
Each strategy focuses on a different type of feedback and produces a different kind of actionable output.
| Strategy | Analyzes | Produces |
|---|---|---|
| PROMPT_ADJUSTMENT | Negative + correction feedback | Rules for system prompt |
| PREFERENCE_LEARNING | Preference + correction feedback | User preference profile |
| EXAMPLE_COLLECTION | Positive feedback | Good examples for few-shot |
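Once `learn()` has run, the learnings have to be applied somewhere; a common pattern is to fold prompt adjustments and preferences back into the next system prompt. The formatting below is an assumption (one plausible layout), not a library API:

```java
import java.util.List;

// Sketch: folding learned adjustments and preferences into the next system
// prompt. The list contents mirror LearningResult.promptAdjustments() and
// LearningResult.preferences(); the section layout is illustrative.
public class PromptAssembler {
    static String apply(String basePrompt, List<String> adjustments, List<String> preferences) {
        StringBuilder sb = new StringBuilder(basePrompt);
        if (!adjustments.isEmpty()) {
            sb.append("\n\nRules:");
            adjustments.forEach(a -> sb.append("\n- ").append(a));
        }
        if (!preferences.isEmpty()) {
            sb.append("\n\nUser preferences:");
            preferences.forEach(p -> sb.append("\n- ").append(p));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(apply("You are a helpful assistant.",
                List.of("Use formal tone", "Keep responses concise"),
                List.of("User prefers code examples")));
    }
}
```

Good examples from `result.goodExamples()` would be injected separately as few-shot examples.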
FeedbackStore
A simple store for collecting feedback items. The in-memory implementation is suitable for testing; for production, persist feedback to your preferred storage backend.
FeedbackStore store = FeedbackStore.inMemory();
store.save(feedback);
List<Feedback> all = store.getAll();
List<Feedback> corrections = store.getByType(FeedbackType.CORRECTION);
NormEngine
Runtime enforcement of normative constraints extracted from @Norm and @Norms annotations. Checks actions against obligations, prohibitions, and permissions.
Annotation-Driven Setup
The easiest way to define norms is with @Norms and @Norm annotations on your agent class. The engine reads these at construction time and enforces them at runtime.
@Norms({
@Norm(
type = NormType.PROHIBITION,
action = "sharePersonalData",
condition = "hasConsent == false",
description = "Cannot share personal data without consent",
priority = 10
),
@Norm(
type = NormType.OBLIGATION,
action = "logAccess",
description = "Must log all data access events",
priority = 5
),
@Norm(
type = NormType.PERMISSION,
action = "readPublicData",
description = "Can read public data at any time"
)
})
public class DataAgent { /* ... */ }
Using the Engine
Create a NormEngine from annotations or explicit entries, then call checkAction() before performing any action to verify it does not violate any active norms. You can also check whether all obligations have been fulfilled.
// Create from annotations
NormEngine engine = NormEngine.fromAnnotations(DataAgent.class);
// Or from explicit entries
NormEngine engine = NormEngine.of(
new NormEntry(NormType.PROHIBITION, "hasConsent == false",
"sharePersonalData", "No sharing without consent", 10),
new NormEntry(NormType.OBLIGATION, "",
"logAccess", "Must log access", 5)
);
// Check if an action is allowed
Predicate<String> conditionEval = condition ->
    ConditionEvaluator.evaluate(condition, currentState); // currentState: your agent's current variable bindings
NormEngine.CheckResult result = engine.checkAction("sharePersonalData", conditionEval);
if (result.isViolation()) {
for (NormViolation v : result.violations()) {
System.out.println("VIOLATION: " + v.description());
}
}
// Check obligation fulfillment
Set<String> fulfilled = Set.of("readData"); // logAccess NOT fulfilled
NormEngine.CheckResult obligations = engine.checkObligations(fulfilled, conditionEval);
if (obligations.isViolation()) {
System.out.println("Unfulfilled obligations found");
}
// Query active norms
List<NormEntry> activeObligations = engine.getActiveObligations(conditionEval);
List<NormEntry> activeProhibitions = engine.getActiveProhibitions(conditionEval);
List<NormEntry> activePermissions = engine.getActivePermissions(conditionEval);
// Add norms dynamically at runtime
engine.addNorm(new NormEntry(NormType.PROHIBITION, "",
"deleteProduction", "Never delete production data", 100));
NormEntry
Each norm entry specifies a type (obligation, prohibition, or permission), the action it applies to, an optional condition for when it is active, and a priority for conflict resolution.
public record NormEntry(
NormType type, // OBLIGATION, PROHIBITION, PERMISSION
String condition, // When this norm is active (empty = always)
String action, // The action this norm applies to
String description, // Human-readable explanation
int priority // Higher = more important
) { }
NormViolation
When an action violates a norm, the engine returns one or more NormViolation records explaining what went wrong.
public record NormViolation(
NormEntry norm, // The violated norm
String action, // The action that violated it
String description // Explanation of the violation
) { }
RefinementLoop
Iterative refinement that repeatedly processes outputs until they meet predefined quality standards. Runs checks after each iteration and re-prompts the LLM with specific failure details.
RefinementLoop loop = RefinementLoop.builder()
.task("Convert Python to TypeScript")
.completionCriteria(CompletionCriteria.builder()
.compilerCheck("tsc --noEmit")
.testCommand("npm test")
.mustNotContain("def ", "import ")
.mustContain("interface", "export")
.validJson()
.build())
.maxIterations(10)
.timeout(Duration.ofMinutes(30))
.onIteration(iter -> log.info("Iteration {} score: {}",
iter.iterationNumber(), iter.evaluation().overallScore()))
.build();
// Execute with an Agent
RefinementResult result = loop.execute(agent, pythonCode);
// Or with an LLMClient directly
RefinementResult result = loop.execute(llmClient, pythonCode);
result.getFinalOutput(); // The refined output
result.getIterations(); // Total iterations run
result.getStatus(); // SUCCESS, MAX_ITERATIONS, TIMEOUT, ERROR, STOPPED
result.getDuration(); // Total time
result.getHistory(); // List of IterationResult, one per iteration
RefinementStatus
The final status tells you why the loop stopped, so you can handle each case appropriately.
| Status | Description |
|---|---|
| SUCCESS | All criteria met |
| MAX_ITERATIONS | Hit iteration limit |
| TIMEOUT | Hit time limit |
| ERROR | LLM call failed |
| STOPPED | StopHook triggered early |
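Because every terminal status is a legitimate outcome, callers usually branch on it rather than treat non-success as an exception. A minimal sketch, using a local stand-in enum (mirroring the statuses above, not the library's type) so it runs standalone:

```java
// Local stand-in mirroring RefinementStatus, for illustration only.
enum RefinementStatus { SUCCESS, MAX_ITERATIONS, TIMEOUT, ERROR, STOPPED }

public class StatusHandler {
    // One plausible policy per terminal status; the actions are illustrative.
    static String handle(RefinementStatus status) {
        return switch (status) {
            case SUCCESS -> "accept output";
            case MAX_ITERATIONS, TIMEOUT -> "keep best-so-far, flag for review";
            case ERROR -> "retry or surface the failure";
            case STOPPED -> "honor the stop hook";
        };
    }

    public static void main(String[] args) {
        System.out.println(handle(RefinementStatus.TIMEOUT)); // keep best-so-far, flag for review
    }
}
```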
CompletionCriteria
Defines the quality checks that must all pass for refinement to stop. You can combine compiler checks, content assertions, structural validations, custom predicates, and even LLM-based quality judgments.
CompletionCriteria criteria = CompletionCriteria.builder()
// Shell commands (compiler, test runner, linter)
.compilerCheck("javac -d out *.java")
.testCommand("mvn test -q")
.lintCheck("eslint --quiet .")
// Content presence/absence
.mustContain("public class", "@Override")
.mustNotContain("System.out.println", "TODO")
// Structure checks
.validJson()
.minLines(10)
.wordCount(50, 500)
.matchesPattern("class\\s+\\w+\\s+implements\\s+\\w+")
// Custom predicate
.customCheck("no-any-type", code -> !code.contains(": any"),
"Output must not contain TypeScript 'any' type")
// LLM-based quality check (requires withLLM first)
.withLLM(evalClient)
.llmCheck("Is this idiomatic TypeScript?", 0.8)
.build();
// Evaluate against output
EvaluationResult eval = criteria.evaluate(output);
eval.allCriteriaMet(); // true if all required checks passed
eval.overallScore(); // ratio of passed checks (0.0 to 1.0)
eval.passed(); // List<CheckResult>
eval.failed(); // List<CheckResult>
eval.getFailureReasons(); // List<String> of failure messages
Built-in Checks
These are the available check types you can add to CompletionCriteria. Mix and match them to define exactly what "done" means for your use case.
| Method | Description |
|---|---|
| compilerCheck(cmd) | Shell command must exit 0 |
| testCommand(cmd) | Test suite must pass |
| lintCheck(cmd) | Linter must pass |
| mustContain(strings...) | All strings must appear in output |
| mustNotContain(strings...) | None may appear in output |
| validJson() | Output must parse as JSON |
| matchesPattern(regex) | Pattern must match |
| wordCount(min, max) | Word count in range |
| minLines(min) | Minimum line count |
| customCheck(name, predicate) | Any Predicate<String> |
| llmCheck(question, threshold) | LLM rates quality 0-100, must exceed threshold |
PromptOptimizer
Automated prompt tuning through iterative refinement, strategy selection, and A/B testing against test cases.
PromptOptimizer optimizer = PromptOptimizer.builder()
.llmClient(client)
.maxIterations(5)
.targetScore(0.9f)
.candidateStrategies(List.of(
PromptStrategy.CHAIN_OF_THOUGHT,
PromptStrategy.CHAIN_OF_VERIFICATION,
PromptStrategy.CONFIDENCE_WEIGHTED))
.build();
List<TestCase> testCases = List.of(
new TestCase("What is 2+2?", "4"),
new TestCase("Capital of France?", "Paris")
);
OptimizationResult result = optimizer.optimize("Answer the question:", testCases);
System.out.println("Best prompt: " + result.getBestPrompt());
System.out.println("Score: " + result.getScore());
System.out.println("Iterations: " + result.getIterations());
The optimizer tries each candidate strategy, evaluates against test cases, then uses LLM-based refinement to suggest further improvements. It stops when targetScore is reached or maxIterations is exhausted.
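The evaluation step amounts to scoring each candidate prompt against the test cases. A minimal sketch of one plausible scoring function (fraction of cases where the answer contains the expected string; the `answer` function stands in for an LLM call, and this is not the optimizer's internal metric):

```java
import java.util.List;
import java.util.function.Function;

// Sketch: score a prompt by the fraction of test cases whose answer
// contains the expected string. "answer" is a stand-in for an LLM call.
public class PromptScorer {
    record TestCase(String input, String expected) {}

    static double score(List<TestCase> cases, Function<String, String> answer) {
        long passed = cases.stream()
                .filter(tc -> answer.apply(tc.input()).contains(tc.expected()))
                .count();
        return cases.isEmpty() ? 0.0 : (double) passed / cases.size();
    }

    public static void main(String[] args) {
        var cases = List.of(new TestCase("What is 2+2?", "4"),
                            new TestCase("Capital of France?", "Paris"));
        // Stub "LLM" that only gets the arithmetic case right.
        double s = score(cases, q -> q.contains("2+2") ? "The answer is 4." : "I am not sure.");
        System.out.println(s); // 0.5
    }
}
```

Substring matching is crude; real scoring might normalize whitespace or use an LLM judge, as `llmCheck` does for refinement.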
Strategy Suggestions
If you are not sure which prompt strategy to use, the optimizer can analyze your task description and suggest strategies that are likely to work well.
List<PromptStrategy> suggested = optimizer.suggestStrategies(
"analyze and solve math problems");
// -> [CHAIN_OF_THOUGHT, STRUCTURED_THINKING]
StructuredOutputExecutor
Ensures LLM outputs conform to a target type with automatic validation and retry on failure.
StructuredOutputExecutor<OrderSummary> executor =
StructuredOutputExecutor.<OrderSummary>builder()
.llm(client)
.targetType(OrderSummary.class)
.outputFormat(OutputFormat.JSON)
.rules(List.of(
ValidationRule.notNull("orderId"),
ValidationRule.range("total", 0, 100000),
ValidationRule.pattern("email", ".*@.*\\..*")))
.maxRetries(3)
.systemPrompt("You are an order processing assistant.")
.build();
StructuredOutputResult<OrderSummary> result = executor.generate(
"Summarize this order: customer=John, items=3 widgets at $25 each"
);
if (result.success()) {
OrderSummary order = result.value();
} else {
System.out.println("Failed after " + result.attempts() + " attempts");
System.out.println("Errors: " + result.errors());
}
Retry Flow
The executor handles the common problem of LLMs producing malformed or invalid structured output by automatically retrying with targeted error feedback.
- Generate output with format instructions appended to prompt
- Deserialize response to target type
- Run validation rules against deserialized data
- On parse or validation failure: build correction prompt with specific errors
- Repeat up to maxRetries times
The correction prompt includes the specific errors so the LLM can fix them directly rather than guessing what went wrong.
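The flow above can be sketched as a generate/validate/retry loop. The `generate` and `validate` functions are stand-ins (the real executor deserializes to the target type and runs its ValidationRules), and the correction-prompt wording is an assumption:

```java
import java.util.List;
import java.util.function.Function;

// Sketch of the retry flow: generate, validate, and on failure re-prompt
// with the specific errors. Not the library's implementation.
public class RetryFlowSketch {
    static String generateWithRetry(String prompt,
                                    Function<String, String> generate,
                                    Function<String, List<String>> validate,
                                    int maxRetries) {
        String currentPrompt = prompt;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            String output = generate.apply(currentPrompt);
            List<String> errors = validate.apply(output);
            if (errors.isEmpty()) {
                return output; // parsed and validated: done
            }
            // Build a correction prompt carrying the specific errors.
            currentPrompt = prompt + "\nPrevious attempt had errors: "
                    + String.join("; ", errors) + "\nPlease fix them.";
        }
        return null; // retries exhausted
    }

    public static void main(String[] args) {
        // Stub model: returns invalid output first, valid once errors are fed back.
        String result = generateWithRetry("Return the order as JSON",
                p -> p.contains("errors") ? "{\"orderId\":1}" : "not json",
                out -> out.startsWith("{") ? List.<String>of() : List.of("not valid JSON"),
                3);
        System.out.println(result); // {"orderId":1}
    }
}
```

Feeding the concrete errors back is what distinguishes this from blind retries: the model corrects the named problems instead of regenerating from scratch.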
Finite State Machine
Deterministic state machine for bounded agent autonomy. Provides guard-based transitions, entry/exit actions, automatic transitions, event payloads, listeners, and visualization to Mermaid and Graphviz DOT.
Planning
Goal-oriented planning for AI agents. TnsAI provides three planner implementations: annotation-driven backward chaining, utility-based scoring, and LLM-powered dynamic planning with human-in-the-loop approval and adaptive replanning.