Learning and Refinement
Feedback-driven learning, normative constraint enforcement, iterative refinement loops, prompt optimization, and structured output validation. These components enable agents to improve over time and produce higher-quality outputs.
Feedback
Represents user or system feedback on agent output. Four factory methods for common types:
Feedback positive = Feedback.thumbsUp("Great explanation!");
Feedback negative = Feedback.thumbsDown("Too verbose, needs to be concise");
Feedback fix = Feedback.correction("Use formal tone, not casual");
Feedback pref = Feedback.preference("Always include code examples");
// Attach to a session
Feedback withSession = positive.withSessionId("session-123");
feedback.type(); // FeedbackType.POSITIVE
feedback.content(); // "Great explanation!"
feedback.timestamp(); // Instant
feedback.id(); // Auto-generated UUID
feedback.metadata(); // Map<String, Object>
FeedbackType
Four feedback types cover the most common ways users and systems respond to agent output.
| Type | Factory | Description |
|---|---|---|
| POSITIVE | thumbsUp(comment) | Good output -- reinforce this behavior |
| NEGATIVE | thumbsDown(comment) | Bad output -- avoid this in future |
| CORRECTION | correction(text) | Specific fix to apply |
| PREFERENCE | preference(text) | User style/tone preference |
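The table above suggests a natural routing: each feedback type feeds a different learning bucket. A minimal sketch of that routing, using a local stand-in enum (not the library's `FeedbackType`) so it runs on its own:

```java
// Local stand-in mirroring the four FeedbackType values, for illustration only.
enum FeedbackType { POSITIVE, NEGATIVE, CORRECTION, PREFERENCE }

public class FeedbackRouter {
    // Decide which learning bucket a feedback item feeds into.
    static String bucketFor(FeedbackType type) {
        return switch (type) {
            case POSITIVE -> "examples";                 // keep as few-shot example
            case NEGATIVE, CORRECTION -> "prompt-rules"; // fold into system prompt rules
            case PREFERENCE -> "user-profile";           // record in the preference profile
        };
    }

    public static void main(String[] args) {
        System.out.println(bucketFor(FeedbackType.CORRECTION)); // prompt-rules
    }
}
```

This mirrors how the learning strategies below consume feedback, but the bucket names are illustrative.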
FeedbackLearner
Analyzes collected feedback and produces actionable learnings: prompt adjustments, user preferences, and good examples for few-shot prompting.
FeedbackLearner learner = FeedbackLearner.builder()
.llm(client)
.feedbackStore(FeedbackStore.inMemory())
.strategies(List.of(
LearningStrategy.PROMPT_ADJUSTMENT,
LearningStrategy.PREFERENCE_LEARNING,
LearningStrategy.EXAMPLE_COLLECTION))
.minFeedbackForLearning(3)
.build();
learner.recordFeedback(Feedback.correction("Use formal tone"));
learner.recordFeedback(Feedback.thumbsDown("Response was too long"));
learner.recordFeedback(Feedback.preference("Include code examples"));
learner.recordFeedback(Feedback.thumbsUp("Perfect level of detail"));
FeedbackLearner.LearningResult result = learner.learn();
result.promptAdjustments(); // ["Use formal tone", "Keep responses concise"]
result.preferences(); // ["User prefers code examples"]
result.goodExamples(); // ["Perfect level of detail"]
result.feedbackAnalyzed(); // 4
result.hasLearnings(); // true
LearningStrategy
Each strategy focuses on a different type of feedback and produces a different kind of actionable output.
| Strategy | Analyzes | Produces |
|---|---|---|
| PROMPT_ADJUSTMENT | Negative + correction feedback | Rules for system prompt |
| PREFERENCE_LEARNING | Preference + correction feedback | User preference profile |
| EXAMPLE_COLLECTION | Positive feedback | Good examples for few-shot |
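Once `learn()` has run, the learnings have to be applied somewhere; a common pattern is to fold prompt adjustments and preferences back into the next system prompt. The formatting below is an assumption (one plausible layout), not a library API:

```java
import java.util.List;

// Sketch: folding learned adjustments and preferences into the next system
// prompt. The list contents mirror LearningResult.promptAdjustments() and
// LearningResult.preferences(); the section layout is illustrative.
public class PromptAssembler {
    static String apply(String basePrompt, List<String> adjustments, List<String> preferences) {
        StringBuilder sb = new StringBuilder(basePrompt);
        if (!adjustments.isEmpty()) {
            sb.append("\n\nRules:");
            adjustments.forEach(a -> sb.append("\n- ").append(a));
        }
        if (!preferences.isEmpty()) {
            sb.append("\n\nUser preferences:");
            preferences.forEach(p -> sb.append("\n- ").append(p));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(apply("You are a helpful assistant.",
                List.of("Use formal tone", "Keep responses concise"),
                List.of("User prefers code examples")));
    }
}
```

Good examples from `result.goodExamples()` would be injected separately as few-shot examples.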
FeedbackStore
A simple store for collecting feedback items. The in-memory implementation is suitable for testing; for production, persist feedback to your preferred storage backend.
FeedbackStore store = FeedbackStore.inMemory();
store.save(feedback);
List<Feedback> all = store.getAll();
List<Feedback> corrections = store.getByType(FeedbackType.CORRECTION);
NormEngine
Runtime enforcement of normative constraints extracted from @Norm and @Norms annotations. Checks actions against obligations, prohibitions, and permissions.
Annotation-Driven Setup
The easiest way to define norms is with @Norms and @Norm annotations on your agent class. The engine reads these at construction time and enforces them at runtime.
@Norms({
@Norm(
type = NormType.PROHIBITION,
action = "sharePersonalData",
condition = "hasConsent == false",
description = "Cannot share personal data without consent",
priority = 10
),
@Norm(
type = NormType.OBLIGATION,
action = "logAccess",
description = "Must log all data access events",
priority = 5
),
@Norm(
type = NormType.PERMISSION,
action = "readPublicData",
description = "Can read public data at any time"
)
})
public class DataAgent { /* ... */ }
Using the Engine
Create a NormEngine from annotations or explicit entries, then call checkAction() before performing any action to verify it does not violate any active norms. You can also check whether all obligations have been fulfilled.
// Create from annotations
NormEngine engine = NormEngine.fromAnnotations(DataAgent.class);
// Or from explicit entries
NormEngine engine = NormEngine.of(
new NormEntry(NormType.PROHIBITION, "hasConsent == false",
"sharePersonalData", "No sharing without consent", 10),
new NormEntry(NormType.OBLIGATION, "",
"logAccess", "Must log access", 5)
);
// Check if an action is allowed
Predicate<String> conditionEval = condition ->
    ConditionEvaluator.evaluate(condition, currentState); // currentState: your agent's current variable bindings
NormEngine.CheckResult result = engine.checkAction("sharePersonalData", conditionEval);
if (result.isViolation()) {
for (NormViolation v : result.violations()) {
System.out.println("VIOLATION: " + v.description());
}
}
// Check obligation fulfillment
Set<String> fulfilled = Set.of("readData"); // logAccess NOT fulfilled
NormEngine.CheckResult obligations = engine.checkObligations(fulfilled, conditionEval);
if (obligations.isViolation()) {
System.out.println("Unfulfilled obligations found");
}
// Query active norms
List<NormEntry> activeObligations = engine.getActiveObligations(conditionEval);
List<NormEntry> activeProhibitions = engine.getActiveProhibitions(conditionEval);
List<NormEntry> activePermissions = engine.getActivePermissions(conditionEval);
// Add norms dynamically at runtime
engine.addNorm(new NormEntry(NormType.PROHIBITION, "",
"deleteProduction", "Never delete production data", 100));
NormEntry
Each norm entry specifies a type (obligation, prohibition, or permission), the action it applies to, an optional condition for when it is active, and a priority for conflict resolution.
public record NormEntry(
NormType type, // OBLIGATION, PROHIBITION, PERMISSION
String condition, // When this norm is active (empty = always)
String action, // The action this norm applies to
String description, // Human-readable explanation
int priority // Higher = more important
) { }
NormViolation
When an action violates a norm, the engine returns one or more NormViolation records explaining what went wrong.
public record NormViolation(
NormEntry norm, // The violated norm
String action, // The action that violated it
String description // Explanation of the violation
) { }
RefinementLoop
Iterative refinement that repeatedly processes outputs until they meet predefined quality standards. Runs checks after each iteration and re-prompts the LLM with specific failure details.
RefinementLoop loop = RefinementLoop.builder()
.task("Convert Python to TypeScript")
.completionCriteria(CompletionCriteria.builder()
.compilerCheck("tsc --noEmit")
.testCommand("npm test")
.mustNotContain("def ", "import ")
.mustContain("interface", "export")
.validJson()
.build())
.maxIterations(10)
.timeout(Duration.ofMinutes(30))
.onIteration(iter -> log.info("Iteration {} score: {}",
iter.iterationNumber(), iter.evaluation().overallScore()))
.build();
// Execute with an Agent
RefinementResult result = loop.execute(agent, pythonCode);
// Or with an LLMClient directly
RefinementResult result = loop.execute(llmClient, pythonCode);
result.getFinalOutput(); // The refined output
result.getIterations(); // Total iterations run
result.getStatus(); // SUCCESS, MAX_ITERATIONS, TIMEOUT, ERROR, STOPPED
result.getDuration(); // Total time
result.getHistory(); // List of IterationResult, one per iteration
RefinementStatus
The final status tells you why the loop stopped, so you can handle each case appropriately.
| Status | Description |
|---|---|
| SUCCESS | All criteria met |
| MAX_ITERATIONS | Hit iteration limit |
| TIMEOUT | Hit time limit |
| ERROR | LLM call failed |
| STOPPED | StopHook triggered early |
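Because every terminal status is a legitimate outcome, callers usually branch on it rather than treat non-success as an exception. A minimal sketch, using a local stand-in enum (mirroring the statuses above, not the library's type) so it runs standalone:

```java
// Local stand-in mirroring RefinementStatus, for illustration only.
enum RefinementStatus { SUCCESS, MAX_ITERATIONS, TIMEOUT, ERROR, STOPPED }

public class StatusHandler {
    // One plausible policy per terminal status; the actions are illustrative.
    static String handle(RefinementStatus status) {
        return switch (status) {
            case SUCCESS -> "accept output";
            case MAX_ITERATIONS, TIMEOUT -> "keep best-so-far, flag for review";
            case ERROR -> "retry or surface the failure";
            case STOPPED -> "honor the stop hook";
        };
    }

    public static void main(String[] args) {
        System.out.println(handle(RefinementStatus.TIMEOUT)); // keep best-so-far, flag for review
    }
}
```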
CompletionCriteria
Defines the quality checks that must all pass for refinement to stop. You can combine compiler checks, content assertions, structural validations, custom predicates, and even LLM-based quality judgments.
CompletionCriteria criteria = CompletionCriteria.builder()
// Shell commands (compiler, test runner, linter)
.compilerCheck("javac -d out *.java")
.testCommand("mvn test -q")
.lintCheck("eslint --quiet .")
// Content presence/absence
.mustContain("public class", "@Override")
.mustNotContain("System.out.println", "TODO")
// Structure checks
.validJson()
.minLines(10)
.wordCount(50, 500)
.matchesPattern("class\\s+\\w+\\s+implements\\s+\\w+")
// Custom predicate
.customCheck("no-any-type", code -> !code.contains(": any"),
"Output must not contain TypeScript 'any' type")
// LLM-based quality check (requires withLLM first)
.withLLM(evalClient)
.llmCheck("Is this idiomatic TypeScript?", 0.8)
.build();
// Evaluate against output
EvaluationResult eval = criteria.evaluate(output);
eval.allCriteriaMet(); // true if all required checks passed
eval.overallScore(); // ratio of passed checks (0.0 to 1.0)
eval.passed(); // List<CheckResult>
eval.failed(); // List<CheckResult>
eval.getFailureReasons(); // List<String> of failure messages
Built-in Checks
These are the available check types you can add to CompletionCriteria. Mix and match them to define exactly what "done" means for your use case.
| Method | Description |
|---|---|
| compilerCheck(cmd) | Shell command must exit 0 |
| testCommand(cmd) | Test suite must pass |
| lintCheck(cmd) | Linter must pass |
| mustContain(strings...) | All strings must appear in output |
| mustNotContain(strings...) | None may appear in output |
| validJson() | Output must parse as JSON |
| matchesPattern(regex) | Pattern must match |
| wordCount(min, max) | Word count in range |
| minLines(min) | Minimum line count |
| customCheck(name, predicate) | Any Predicate<String> |
| llmCheck(question, threshold) | LLM rates quality 0-100, must exceed threshold |
PromptOptimizer
Automated prompt tuning through iterative refinement, strategy selection, and A/B testing against test cases.
PromptOptimizer optimizer = PromptOptimizer.builder()
.llmClient(client)
.maxIterations(5)
.targetScore(0.9f)
.candidateStrategies(List.of(
PromptStrategy.CHAIN_OF_THOUGHT,
PromptStrategy.CHAIN_OF_VERIFICATION,
PromptStrategy.CONFIDENCE_WEIGHTED))
.build();
List<TestCase> testCases = List.of(
new TestCase("What is 2+2?", "4"),
new TestCase("Capital of France?", "Paris")
);
OptimizationResult result = optimizer.optimize("Answer the question:", testCases);
System.out.println("Best prompt: " + result.getBestPrompt());
System.out.println("Score: " + result.getScore());
System.out.println("Iterations: " + result.getIterations());
The optimizer tries each candidate strategy, evaluates against test cases, then uses LLM-based refinement to suggest further improvements. It stops when targetScore is reached or maxIterations is exhausted.
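The evaluation step amounts to scoring each candidate prompt against the test cases. A minimal sketch of one plausible scoring function (fraction of cases where the answer contains the expected string; the `answer` function stands in for an LLM call, and this is not the optimizer's internal metric):

```java
import java.util.List;
import java.util.function.Function;

// Sketch: score a prompt by the fraction of test cases whose answer
// contains the expected string. "answer" is a stand-in for an LLM call.
public class PromptScorer {
    record TestCase(String input, String expected) {}

    static double score(List<TestCase> cases, Function<String, String> answer) {
        long passed = cases.stream()
                .filter(tc -> answer.apply(tc.input()).contains(tc.expected()))
                .count();
        return cases.isEmpty() ? 0.0 : (double) passed / cases.size();
    }

    public static void main(String[] args) {
        var cases = List.of(new TestCase("What is 2+2?", "4"),
                            new TestCase("Capital of France?", "Paris"));
        // Stub "LLM" that only gets the arithmetic case right.
        double s = score(cases, q -> q.contains("2+2") ? "The answer is 4." : "I am not sure.");
        System.out.println(s); // 0.5
    }
}
```

Substring matching is crude; real scoring might normalize whitespace or use an LLM judge, as `llmCheck` does for refinement.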
Strategy Suggestions
If you are not sure which prompt strategy to use, the optimizer can analyze your task description and suggest strategies that are likely to work well.
List<PromptStrategy> suggested = optimizer.suggestStrategies(
"analyze and solve math problems");
// -> [CHAIN_OF_THOUGHT, STRUCTURED_THINKING]
StructuredOutputExecutor
Ensures LLM outputs conform to a target type with automatic validation and retry on failure.
StructuredOutputExecutor<OrderSummary> executor =
StructuredOutputExecutor.<OrderSummary>builder()
.llm(client)
.targetType(OrderSummary.class)
.outputFormat(OutputFormat.JSON)
.rules(List.of(
ValidationRule.notNull("orderId"),
ValidationRule.range("total", 0, 100000),
ValidationRule.pattern("email", ".*@.*\\..*")))
.maxRetries(3)
.systemPrompt("You are an order processing assistant.")
.build();
StructuredOutputResult<OrderSummary> result = executor.generate(
"Summarize this order: customer=John, items=3 widgets at $25 each"
);
if (result.success()) {
OrderSummary order = result.value();
} else {
System.out.println("Failed after " + result.attempts() + " attempts");
System.out.println("Errors: " + result.errors());
}
Retry Flow
The executor handles the common problem of LLMs producing malformed or invalid structured output by automatically retrying with targeted error feedback.
- Generate output with format instructions appended to prompt
- Deserialize response to target type
- Run validation rules against deserialized data
- On parse or validation failure: build correction prompt with specific errors
- Repeat up to maxRetries times
The correction prompt includes the specific errors so the LLM can fix them directly rather than guessing what went wrong.
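The flow above can be sketched as a generate/validate/retry loop. The `generate` and `validate` functions are stand-ins (the real executor deserializes to the target type and runs its ValidationRules), and the correction-prompt wording is an assumption:

```java
import java.util.List;
import java.util.function.Function;

// Sketch of the retry flow: generate, validate, and on failure re-prompt
// with the specific errors. Not the library's implementation.
public class RetryFlowSketch {
    static String generateWithRetry(String prompt,
                                    Function<String, String> generate,
                                    Function<String, List<String>> validate,
                                    int maxRetries) {
        String currentPrompt = prompt;
        for (int attempt = 1; attempt <= maxRetries; attempt++) {
            String output = generate.apply(currentPrompt);
            List<String> errors = validate.apply(output);
            if (errors.isEmpty()) {
                return output; // parsed and validated: done
            }
            // Build a correction prompt carrying the specific errors.
            currentPrompt = prompt + "\nPrevious attempt had errors: "
                    + String.join("; ", errors) + "\nPlease fix them.";
        }
        return null; // retries exhausted
    }

    public static void main(String[] args) {
        // Stub model: returns invalid output first, valid once errors are fed back.
        String result = generateWithRetry("Return the order as JSON",
                p -> p.contains("errors") ? "{\"orderId\":1}" : "not json",
                out -> out.startsWith("{") ? List.<String>of() : List.of("not valid JSON"),
                3);
        System.out.println(result); // {"orderId":1}
    }
}
```

Feeding the concrete errors back is what distinguishes this from blind retries: the model corrects the named problems instead of regenerating from scratch.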
Finite State Machine
Deterministic state machine for bounded agent autonomy. Provides guard-based transitions, entry/exit actions, automatic transitions, event payloads, listeners, and visualization to Mermaid and Graphviz DOT.
Planning
Goal-oriented planning for AI agents. TnsAI provides three planner implementations: annotation-driven backward chaining, utility-based scoring, and LLM-powered dynamic planning with human-in-the-loop approval and adaptive replanning.