Security
The Quality module provides a layered security framework: annotation-driven access control and encryption (`SecurityEnforcer`), content moderation (`PatternBasedModerator`), prompt injection detection (`PromptInjectionDetector`), sandboxed execution, audit logging, and input validation (`ValidationService`).
SecurityEnforcer
Runtime enforcement of policies defined via the @Security annotation. Handles access control, parameter/result encryption (AES-256-GCM), security timeouts, field masking, and audit logging.
```java
// Auto-generated encryption key
SecurityEnforcer enforcer = new SecurityEnforcer();

// Custom encryption config
EncryptionConfig config = EncryptionConfig.fromBase64Key(myKey);
SecurityEnforcer enforcer = new SecurityEnforcer(config);

// Access control
enforcer.enforceAccessControl(action, role, callerAgentId, callerRoles);

// Parameter encryption (if @Security(encryptParams=true))
Map<String, Object> secureParams = enforcer.processParameters(action, role, params);

// Execute with timeout (if @Security(securityTimeout=5000))
Object result = enforcer.executeWithTimeout(action, role, () -> doExecute());

// Result encryption (if @Security(encryptResult=true))
Object secureResult = enforcer.processResult(action, role, result);

// Mask fields for logging (if @Security(maskFields={"password","token"}))
Map<String, Object> safeParams = enforcer.maskForLogging(action, role, params);
```

Access Control Checks
When an action is invoked, the enforcer reads the @Security annotation from the action method (or falls back to the role class) and checks three layers in order. If any layer rejects the caller, a SecurityException is thrown.
- allowedCallers: Whitelist of agent IDs permitted to invoke the action
- allowedRoles: Set of role names -- caller must have at least one
- requiredPermissions: Delegated to a PermissionProvider SPI implementation (discovered via ServiceLoader)
All three throw SecurityException(ViolationType.UNAUTHORIZED) on failure.
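The first two layers can be sketched in plain Java. All names below are hypothetical; the real enforcer reads these sets from the `@Security` annotation and delegates the third layer (requiredPermissions) to the PermissionProvider SPI.

```java
import java.util.Set;

// Hypothetical sketch of the whitelist and role checks, in the order
// described above. An empty set means the layer imposes no restriction.
public class AccessCheckSketch {
    public static void check(String callerId, Set<String> callerRoles,
                             Set<String> allowedCallers, Set<String> allowedRoles) {
        if (!allowedCallers.isEmpty() && !allowedCallers.contains(callerId)) {
            throw new SecurityException("UNAUTHORIZED: caller not whitelisted");
        }
        if (!allowedRoles.isEmpty()
                && callerRoles.stream().noneMatch(allowedRoles::contains)) {
            throw new SecurityException("UNAUTHORIZED: no matching role");
        }
        // third layer (requiredPermissions) would delegate to a
        // PermissionProvider implementation here
    }
}
```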
EncryptionConfig
The EncryptionConfig manages the AES-256-GCM encryption keys used by SecurityEnforcer to encrypt and decrypt action parameters and results. You can generate a random key or import an existing one from Base64.
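For context, the wire format the enforcer produces (a 12-byte IV, then the ciphertext, then a 16-byte GCM auth tag, all Base64-encoded together) can be reproduced with the standard `javax.crypto` API. This is an illustrative standalone sketch, not the library's implementation:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

public class GcmSketch {
    private static final int IV_LEN = 12;     // bytes
    private static final int TAG_BITS = 128;  // 16-byte auth tag

    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256); // AES-256
        return kg.generateKey();
    }

    public static String encrypt(SecretKey key, String plaintext) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv); // fresh random IV per message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        // doFinal appends the 16-byte auth tag to the ciphertext
        byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        byte[] wire = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        return Base64.getEncoder().encodeToString(wire);
    }

    public static String decrypt(SecretKey key, String token) throws Exception {
        byte[] wire = Base64.getDecoder().decode(token);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, wire, 0, IV_LEN));
        byte[] pt = cipher.doFinal(wire, IV_LEN, wire.length - IV_LEN);
        return new String(pt, StandardCharsets.UTF_8);
    }
}
```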
```java
// Generate a random key
EncryptionConfig config = EncryptionConfig.generateKey();

// From a Base64-encoded key
EncryptionConfig config = EncryptionConfig.fromBase64Key(base64String);

// Encrypt/decrypt via SecurityEnforcer
String encrypted = enforcer.encrypt("sensitive data");
String decrypted = enforcer.decrypt(encrypted);

// Format: Base64(IV[12 bytes] + ciphertext + authTag[16 bytes])
```

PermissionProvider SPI
If your application uses a custom permission backend (like an RBAC database or an external authorization service), you can integrate it with TnsAI through the PermissionProvider SPI. The enforcer discovers your implementation via ServiceLoader and delegates permission checks to it.
```java
public interface PermissionProvider {
    boolean hasPermission(String agentId, String permission);

    static PermissionProvider discover() { /* ServiceLoader lookup */ }
}
```

If no implementation is on the classpath, permission checks are skipped with a debug log.
ContentModerator / PatternBasedModerator
The content moderator scans text for harmful or unwanted content using regex patterns. It checks user input, model output, and tool results against configurable categories (violence, hate speech, PII, spam, etc.) and returns a decision: allow, warn, block, or flag for review.
Moderation Categories (11)
The moderator organizes harmful content into 11 categories. Each category has its own set of detection patterns and can be individually enabled or disabled.
| Category | Description |
|---|---|
| VIOLENCE | Violent threats, bomb/terrorism references |
| SELF_HARM | Self-harm methods, suicide references |
| HATE | Hate speech indicators |
| HARASSMENT | Personal attacks, stalking, doxxing |
| SPAM | Commercial spam patterns |
| ILLEGAL | Hacking, counterfeiting, piracy |
| PII | Email, phone, SSN, credit card patterns |
| SEXUAL | Sexual content |
| TOXIC | General toxicity |
| MISINFORMATION | Factual inaccuracies |
| OTHER | Uncategorized |
ModerationResult Actions
After checking content, the moderator returns one of four actions based on how the content scored against your configured thresholds.
| Action | Meaning |
|---|---|
| ALLOW | Content passes all checks |
| WARN | Score exceeds the warn threshold but is below the block threshold |
| BLOCK | Score exceeds the block threshold |
| REVIEW | Flagged for manual review |
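The threshold logic behind these actions can be sketched as follows. The names are illustrative, and the REVIEW path, which the library drives from its own flagging rules, is omitted here:

```java
// Minimal sketch of mapping a moderation score to an action, with the
// block threshold taking precedence over the warn threshold.
public class ModerationActionSketch {
    public enum Action { ALLOW, WARN, BLOCK }

    public static Action decide(double score, double warnThreshold, double blockThreshold) {
        if (score >= blockThreshold) return Action.BLOCK;
        if (score >= warnThreshold) return Action.WARN;
        return Action.ALLOW;
    }
}
```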
Usage
You can create a moderator with sensible defaults, use a strict preset for high-risk applications, or fully customize the patterns and thresholds.
```java
// Default patterns (block >= 0.8, warn >= 0.5)
ContentModerator moderator = PatternBasedModerator.withDefaults();

// Strict mode (block >= 0.6, warn >= 0.3)
ContentModerator strict = PatternBasedModerator.strict();

// Custom configuration
ContentModerator custom = PatternBasedModerator.builder()
    .addPattern(Category.TOXIC, "badword1|badword2", 0.9)
    .addPattern(Category.SPAM, "(buy now|free offer)", 0.7)
    .blockThreshold(0.8)
    .warnThreshold(0.5)
    .checkPII(true)
    .enabledCategories(Set.of(Category.VIOLENCE, Category.PII, Category.SPAM))
    .build();

// Check content
ModerationResult result = moderator.checkInput(userMessage);
if (result.isBlocked()) {
    logger.warn("Blocked: {}", result.getReason());
}

// Separate checks for input, output, and tool results
ModerationResult inputResult = moderator.checkInput(userMessage);
ModerationResult outputResult = moderator.checkOutput(modelResponse);
ModerationResult toolResult = moderator.checkToolResult("shell", shellOutput);
// Tool results use less strict checking (mainly PII detection)
```

PromptInjectionDetector
Detects prompt injection attacks using regex pattern matching across 6 injection types.
Injection Types
The detector recognizes six categories of prompt injection attacks. Each category has multiple regex patterns that catch common variations.
| Type | Detection Patterns |
|---|---|
| SYSTEM_PROMPT_OVERRIDE | "Ignore previous instructions", "Disregard all prior", "Override system prompt" |
| JAILBREAK | "DAN mode", "Do Anything Now", "Developer mode enabled", "Remove restrictions" |
| ROLE_MANIPULATION | "You are now...", "Pretend you are", "Act as if", "Assume the role of" |
| INSTRUCTION_BYPASS | "Skip safety filters", "Bypass restrictions", "Without restrictions" |
| CONTEXT_MANIPULATION | Fake [SYSTEM] delimiters, markdown system injection, code block system tags |
| DATA_EXFILTRATION | "Repeat your instructions", "Output your system prompt", "Dump your memory" |
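As a rough illustration of how this style of regex matching works, here is a minimal sketch with two made-up patterns; the shipped detector uses many more per type and scores confidence rather than returning a plain boolean:

```java
import java.util.regex.Pattern;

// Illustrative patterns only, loosely modeled on the table above.
public class InjectionSketch {
    private static final Pattern SYSTEM_PROMPT_OVERRIDE =
            Pattern.compile("(?i)(ignore|disregard)\\s+(all\\s+)?(previous|prior)\\s+instructions");
    private static final Pattern DATA_EXFILTRATION =
            Pattern.compile("(?i)(repeat|output|dump)\\s+your\\s+(instructions|system prompt|memory)");

    public static boolean looksSuspicious(String input) {
        return SYSTEM_PROMPT_OVERRIDE.matcher(input).find()
                || DATA_EXFILTRATION.matcher(input).find();
    }
}
```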
Usage
Create a detector with a preset sensitivity, or customize it with your own patterns and thresholds. The detector returns a detailed result with matched patterns and confidence scores.
```java
// Standard detector (threshold: 0.6)
PromptInjectionDetector detector = PromptInjectionDetector.standard();

// Strict detector (threshold: 0.5)
PromptInjectionDetector strict = PromptInjectionDetector.strict();

// Throwing detector (raises SecurityException on detection)
PromptInjectionDetector throwing = PromptInjectionDetector.throwing();

// Custom configuration
PromptInjectionDetector custom = PromptInjectionDetector.builder()
    .addPattern(InjectionType.JAILBREAK, "my-custom-pattern", 0.9)
    .confidenceThreshold(0.5)
    .throwOnDetection(true)
    .build();

// Check input
InjectionDetectionResult result = detector.detect(userInput);
if (result.isInjectionDetected()) {
    System.out.println("Types: " + result.getDetectedTypes());
    System.out.println("Patterns: " + result.getMatchedPatterns());
    if (result.shouldBlock()) {
        // Block at the detector's configured threshold
    }
}

// Convenience methods
boolean safe = detector.isSafe(input);        // No injection detected
boolean block = detector.shouldBlock(input);  // Exceeds threshold
```

SandboxedExecutor
Executes untrusted code in a restricted environment using SandboxSpec constraints.
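The timeout constraint alone can be approximated with a plain `ExecutorService`. This hypothetical sketch does not enforce the memory or I/O limits a real sandbox applies:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Runs a task on a worker thread and abandons it if it exceeds the deadline.
public class TimeoutSketch {
    public static <T> T runWithTimeout(Callable<T> task, long millis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // get(...) throws TimeoutException when the deadline passes
            return pool.submit(task).get(millis, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // interrupt a still-running task
        }
    }
}
```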
```java
SandboxSpec spec = SandboxSpec.builder()
    .timeout(Duration.ofSeconds(30))
    .maxMemory(256_000_000) // 256 MB
    .allowNetwork(false)
    .allowFileRead(true)
    .allowFileWrite(false)
    .build();

SandboxedExecutor executor = new SandboxedExecutor(spec);
Object result = executor.execute(() -> runUntrustedCode());
```

Audit Infrastructure
AuditEvent and AuditStore
The audit system records security-relevant events (access decisions, encryption operations, moderation results) for compliance and debugging. You can choose between in-memory storage for development or file-based storage for production.
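A bounded in-memory store like the development default can be sketched with a deque. All names here are hypothetical; the library defines its own `AuditEvent` and `AuditStore` types:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Keeps at most `capacity` events, evicting the oldest when full.
public class BoundedAuditStoreSketch {
    public record Event(long timestamp, String type, String detail) {}

    private final Deque<Event> events = new ArrayDeque<>();
    private final int capacity;

    public BoundedAuditStoreSketch(int capacity) { this.capacity = capacity; }

    public synchronized void record(Event e) {
        if (events.size() == capacity) events.removeFirst(); // drop oldest
        events.addLast(e);
    }

    public synchronized List<Event> all() { return List.copyOf(events); }
}
```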
```java
// In-memory (bounded, for development/testing)
AuditStore store = new InMemoryAuditStore();

// File-based (persistent)
AuditStore store = new FileAuditStore(Path.of("/var/log/tnsai/audit.json"));
```

ValidationService
Comprehensive input validation with configurable limits and dangerous pattern detection.
Detected Dangerous Patterns
The validation service checks input strings against common attack patterns. When blockDangerousPatterns is enabled (the default), inputs containing these patterns are rejected.
| Pattern Type | Examples |
|---|---|
| SQL Injection | UNION SELECT, DROP TABLE, --, INSERT INTO |
| Command Injection | $(...), backticks, ; rm, ; curl, ; sudo |
| Path Traversal | ../, ..\\, %2e%2e%2f |
| XSS | <script>, javascript:, onclick=, <iframe> |
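A minimal sketch of this style of pattern screening, using an illustrative subset of the classes in the table (not the library's actual pattern set):

```java
import java.util.regex.Pattern;

// Hypothetical sketch: a handful of regexes standing in for the real lists.
public class DangerSketch {
    private static final Pattern SQL_INJECTION =
            Pattern.compile("(?i)union\\s+select|drop\\s+table|insert\\s+into");
    private static final Pattern PATH_TRAVERSAL =
            Pattern.compile("(?i)\\.\\./|\\.\\.\\\\|%2e%2e%2f");
    private static final Pattern XSS =
            Pattern.compile("(?i)<script|javascript:|<iframe");

    public static boolean containsDangerousPatterns(String input) {
        return SQL_INJECTION.matcher(input).find()
                || PATH_TRAVERSAL.matcher(input).find()
                || XSS.matcher(input).find();
    }
}
```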
Usage
The validation service provides multiple presets: a permissive default, a strict mode for high-security applications, and a fully customizable builder.
```java
// Default settings
ValidationService validator = ValidationService.withDefaults();

// Strict settings
ValidationService strict = ValidationService.strict();
// maxStringLength: 10,000, maxParameterCount: 20
// maxNestingDepth: 5, maxArraySize: 100

// Custom configuration
ValidationService custom = ValidationService.builder()
    .maxStringLength(10_000)
    .maxParameterCount(50)
    .maxNestingDepth(5)
    .blockDangerousPatterns(true)
    .allowedFields(Set.of("name", "query", "path"))
    .addPattern("email", Pattern.compile("^[^@]+@[^@]+$"))
    .build();

// Validate string input
ValidationResult result = validator.validateString(userInput, "message");
if (!result.isValid()) {
    result.getErrors().forEach(System.err::println);
}

// Validate parameters (recursive, checks nesting depth)
ValidationResult paramResult = validator.validateParameters(params);

// Validate-and-throw
validator.validateOrThrow(input, "userInput"); // throws SecurityException

// Sanitization (escapes HTML, removes null bytes, truncates)
String safe = validator.sanitize(rawInput);

// Quick danger check
boolean dangerous = validator.containsDangerousPatterns(input);
```

Default Limits
These are the default and strict configuration limits. The strict preset is recommended for applications that accept untrusted user input.
| Setting | Default | Strict |
|---|---|---|
| maxStringLength | 100,000 | 10,000 |
| maxParameterCount | 100 | 20 |
| maxNestingDepth | 10 | 5 |
| maxArraySize | 1,000 | 100 |
| blockDangerousPatterns | true | true |
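The recursive nesting-depth check mentioned above (validateParameters) can be sketched like this; the exact depth-counting semantics are an assumption of the sketch:

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical sketch: each map or collection consumes one level of depth.
public class DepthSketch {
    public static boolean tooDeep(Object value, int maxDepth) {
        if (maxDepth < 0) return true; // budget exhausted
        if (value instanceof Map<?, ?> m) {
            for (Object v : m.values()) if (tooDeep(v, maxDepth - 1)) return true;
        } else if (value instanceof Collection<?> c) {
            for (Object v : c) if (tooDeep(v, maxDepth - 1)) return true;
        }
        return false; // scalars never exceed the budget
    }
}
```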
SecretFilter (Pre-LLM Scrubbing)
Automatically detects and redacts secrets from content before it reaches the LLM. Uses 59 regex patterns covering API keys, tokens, passwords, connection strings, and other credentials. Package: com.tnsai.quality.security.
```java
SecretFilter filter = new SecretFilter();
String sanitized = filter.scrub(userInput);
// "My key is sk-abc123xyz" -> "My key is [REDACTED:API_KEY]"

// Check without redacting
boolean hasSecrets = filter.containsSecrets(userInput);

// Get detailed findings
List<SecretFinding> findings = filter.scan(userInput);
for (SecretFinding finding : findings) {
    System.out.println(finding.type());     // e.g., "AWS_ACCESS_KEY"
    System.out.println(finding.location()); // character offset
}
```

Detected secret types include:
| Category | Examples |
|---|---|
| Cloud | AWS access keys, GCP service accounts, Azure connection strings |
| API Keys | OpenAI, Anthropic, Stripe, Twilio, SendGrid, Slack tokens |
| Auth | JWT tokens, OAuth secrets, Bearer tokens, Basic auth credentials |
| Database | JDBC URLs with passwords, MongoDB URIs, Redis URLs |
| Crypto | Private keys (RSA, EC), SSH keys, PGP keys |
| Infrastructure | Docker registry credentials, Kubernetes secrets, Terraform state |
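A redaction pass of this kind reduces to regex find-and-replace. This sketch uses two illustrative patterns (a generic `sk-` key and an AWS access key ID); the shipped filter's 59 patterns and its redaction labels may differ:

```java
import java.util.regex.Pattern;

// Hypothetical sketch of scrub(): replace each match with a typed marker.
public class RedactSketch {
    private static final Pattern GENERIC_SK_KEY = Pattern.compile("sk-[A-Za-z0-9]{6,}");
    private static final Pattern AWS_ACCESS_KEY = Pattern.compile("AKIA[0-9A-Z]{16}");

    public static String scrub(String input) {
        String out = GENERIC_SK_KEY.matcher(input).replaceAll("[REDACTED:API_KEY]");
        return AWS_ACCESS_KEY.matcher(out).replaceAll("[REDACTED:AWS_ACCESS_KEY]");
    }
}
```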
Wire into an agent to automatically scrub all inputs:
```java
Agent agent = AgentBuilder.create()
    .model("claude-sonnet-4")
    .secretFilter(new SecretFilter())
    .build();
// All user inputs and tool results are scrubbed before reaching the LLM
```

PermissionLevel (3-Tier Permissions)
TnsAI uses a simple three-tier permission model to control which tools an agent can execute. This is the foundation for the tool approval system -- each tool is assigned a permission level that determines whether it runs automatically, requires user confirmation, or is blocked entirely.
```java
public enum PermissionLevel {
    ALWAYS, // Tool executes without any approval
    ASK,    // Requires user confirmation before execution
    DENY    // Tool is blocked entirely
}
```

Configure permissions per tool:
```java
Map<String, PermissionLevel> permissions = Map.of(
    "file_read", PermissionLevel.ALWAYS,
    "file_write", PermissionLevel.ASK,
    "shell", PermissionLevel.ASK,
    "git_write", PermissionLevel.DENY
);
ToolFilter toolFilter = ToolFilter.fromPermissions(permissions);
agent.setToolCallFilter(toolFilter);
```

ToolFilter (Glob Allow/Deny)
The ToolFilter lets you define allow and deny rules using glob patterns, so you can control tool access with wildcard matching instead of listing every tool individually. This is useful when you have many tools and want broad category-level permissions.
```java
ToolFilter filter = ToolFilter.builder()
    .allow("file_*")     // allow all file tools
    .allow("search_*")   // allow all search tools
    .deny("file_write")  // except file_write
    .deny("shell")       // block shell access
    .defaultPermission(PermissionLevel.ASK) // ask for unlisted tools
    .build();
agent.setToolCallFilter(filter);
```

Rules are evaluated in order: the first matching rule wins. The defaultPermission applies when no glob pattern matches.
```java
// Check a tool explicitly
PermissionLevel fileLevel = filter.getPermission("file_read");   // ALWAYS
PermissionLevel shellLevel = filter.getPermission("shell");      // DENY
PermissionLevel otherLevel = filter.getPermission("unknown");    // ASK (default)
```
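The glob evaluation can be sketched by translating each glob to a regex and scanning the rules in insertion order. Note that under strict first-match semantics, a narrow deny (e.g. `file_write`) must be registered before the broad allow (`file_*`) it carves out. All names in this sketch are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: ordered glob rules, first match wins, with a fallback.
public class GlobFilterSketch {
    public enum Level { ALWAYS, ASK, DENY }

    private final Map<Pattern, Level> rules = new LinkedHashMap<>();
    private final Level fallback;

    public GlobFilterSketch(Level fallback) { this.fallback = fallback; }

    // Translates a simple glob ('*' wildcard only; other regex metacharacters
    // are not escaped in this sketch) into an anchored regex.
    public GlobFilterSketch rule(String glob, Level level) {
        rules.put(Pattern.compile("^" + glob.replace("*", ".*") + "$"), level);
        return this;
    }

    public Level permission(String tool) {
        for (Map.Entry<Pattern, Level> e : rules.entrySet()) {
            if (e.getKey().matcher(tool).matches()) return e.getValue();
        }
        return fallback; // no rule matched
    }
}
```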