Security
The Quality module provides a layered security framework: annotation-driven access control and encryption (`SecurityEnforcer`), content moderation (`PatternBasedModerator`), prompt injection detection (`PromptInjectionDetector`), sandboxed execution, audit logging, and input validation (`ValidationService`).
SecurityEnforcer
Runtime enforcement of policies defined via the @Security annotation. Handles access control, parameter/result encryption (AES-256-GCM), security timeouts, field masking, and audit logging.
```java
// Auto-generated encryption key
SecurityEnforcer enforcer = new SecurityEnforcer();

// Custom encryption config
EncryptionConfig config = EncryptionConfig.fromBase64Key(myKey);
SecurityEnforcer enforcer = new SecurityEnforcer(config);

// Access control
enforcer.enforceAccessControl(action, role, callerAgentId, callerRoles);

// Parameter encryption (if @Security(encryptParams=true))
Map<String, Object> secureParams = enforcer.processParameters(action, role, params);

// Execute with timeout (if @Security(securityTimeout=5000))
Object result = enforcer.executeWithTimeout(action, role, () -> doExecute());

// Result encryption (if @Security(encryptResult=true))
Object secureResult = enforcer.processResult(action, role, result);

// Mask fields for logging (if @Security(maskFields={"password","token"}))
Map<String, Object> safeParams = enforcer.maskForLogging(action, role, params);
```

Access Control Checks
When an action is invoked, the enforcer reads the @Security annotation from the action method (or falls back to the role class) and checks three layers in order. If any layer rejects the caller, a SecurityException is thrown.
- allowedCallers: Whitelist of agent IDs permitted to invoke the action
- allowedRoles: Set of role names -- caller must have at least one
- requiredPermissions: Delegated to a PermissionProvider SPI implementation (discovered via ServiceLoader)
All three throw SecurityException(ViolationType.UNAUTHORIZED) on failure.
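The first two layers can be sketched in plain Java. All names below are hypothetical; the real enforcer reads these sets from the `@Security` annotation and delegates the third layer (requiredPermissions) to the PermissionProvider SPI.

```java
import java.util.Set;

// Hypothetical sketch of the whitelist and role checks, in the order
// described above. An empty set means the layer imposes no restriction.
public class AccessCheckSketch {
    public static void check(String callerId, Set<String> callerRoles,
                             Set<String> allowedCallers, Set<String> allowedRoles) {
        if (!allowedCallers.isEmpty() && !allowedCallers.contains(callerId)) {
            throw new SecurityException("UNAUTHORIZED: caller not whitelisted");
        }
        if (!allowedRoles.isEmpty()
                && callerRoles.stream().noneMatch(allowedRoles::contains)) {
            throw new SecurityException("UNAUTHORIZED: no matching role");
        }
        // third layer (requiredPermissions) would delegate to a
        // PermissionProvider implementation here
    }
}
```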
EncryptionConfig
The EncryptionConfig manages the AES-256-GCM encryption keys used by SecurityEnforcer to encrypt and decrypt action parameters and results. You can generate a random key or import an existing one from Base64.
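For context, the wire format the enforcer produces (a 12-byte IV, then the ciphertext, then a 16-byte GCM auth tag, all Base64-encoded together) can be reproduced with the standard `javax.crypto` API. This is an illustrative standalone sketch, not the library's implementation:

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.SecureRandom;
import java.util.Base64;

public class GcmSketch {
    private static final int IV_LEN = 12;     // bytes
    private static final int TAG_BITS = 128;  // 16-byte auth tag

    public static SecretKey newKey() throws Exception {
        KeyGenerator kg = KeyGenerator.getInstance("AES");
        kg.init(256); // AES-256
        return kg.generateKey();
    }

    public static String encrypt(SecretKey key, String plaintext) throws Exception {
        byte[] iv = new byte[IV_LEN];
        new SecureRandom().nextBytes(iv); // fresh random IV per message
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, iv));
        // doFinal appends the 16-byte auth tag to the ciphertext
        byte[] ct = cipher.doFinal(plaintext.getBytes(StandardCharsets.UTF_8));
        byte[] wire = ByteBuffer.allocate(iv.length + ct.length).put(iv).put(ct).array();
        return Base64.getEncoder().encodeToString(wire);
    }

    public static String decrypt(SecretKey key, String token) throws Exception {
        byte[] wire = Base64.getDecoder().decode(token);
        Cipher cipher = Cipher.getInstance("AES/GCM/NoPadding");
        cipher.init(Cipher.DECRYPT_MODE, key, new GCMParameterSpec(TAG_BITS, wire, 0, IV_LEN));
        byte[] pt = cipher.doFinal(wire, IV_LEN, wire.length - IV_LEN);
        return new String(pt, StandardCharsets.UTF_8);
    }
}
```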
```java
// Generate a random key
EncryptionConfig config = EncryptionConfig.generateKey();

// From a Base64-encoded key
EncryptionConfig config = EncryptionConfig.fromBase64Key(base64String);

// Encrypt/decrypt via SecurityEnforcer
String encrypted = enforcer.encrypt("sensitive data");
String decrypted = enforcer.decrypt(encrypted);

// Format: Base64(IV[12 bytes] + ciphertext + authTag[16 bytes])
```

PermissionProvider SPI
If your application uses a custom permission backend (like an RBAC database or an external authorization service), you can integrate it with TnsAI through the PermissionProvider SPI. The enforcer discovers your implementation via ServiceLoader and delegates permission checks to it.
```java
public interface PermissionProvider {
    boolean hasPermission(String agentId, String permission);

    static PermissionProvider discover() { /* ServiceLoader lookup */ }
}
```

If no implementation is on the classpath, permission checks are skipped with a debug log.
ContentModerator / PatternBasedModerator
The content moderator scans text for harmful or unwanted content using regex patterns. It checks user input, model output, and tool results against configurable categories (violence, hate speech, PII, spam, etc.) and returns a decision: allow, warn, block, or flag for review.
Moderation Categories (11)
The moderator organizes harmful content into 11 categories. Each category has its own set of detection patterns and can be individually enabled or disabled.
| Category | Description |
|---|---|
| VIOLENCE | Violent threats, bomb/terrorism references |
| SELF_HARM | Self-harm methods, suicide references |
| HATE | Hate speech indicators |
| HARASSMENT | Personal attacks, stalking, doxxing |
| SPAM | Commercial spam patterns |
| ILLEGAL | Hacking, counterfeiting, piracy |
| PII | Email, phone, SSN, credit card patterns |
| SEXUAL | Sexual content |
| TOXIC | General toxicity |
| MISINFORMATION | Factual inaccuracies |
| OTHER | Uncategorized |
ModerationResult Actions
After checking content, the moderator returns one of four actions based on how the content scored against your configured thresholds.
| Action | Meaning |
|---|---|
| ALLOW | Content passes all checks |
| WARN | Score exceeds the warn threshold but is below the block threshold |
| BLOCK | Score exceeds the block threshold |
| REVIEW | Flagged for manual review |
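The threshold logic behind these actions can be sketched as follows. The names are illustrative, and the REVIEW path, which the library drives from its own flagging rules, is omitted here:

```java
// Minimal sketch of mapping a moderation score to an action, with the
// block threshold taking precedence over the warn threshold.
public class ModerationActionSketch {
    public enum Action { ALLOW, WARN, BLOCK }

    public static Action decide(double score, double warnThreshold, double blockThreshold) {
        if (score >= blockThreshold) return Action.BLOCK;
        if (score >= warnThreshold) return Action.WARN;
        return Action.ALLOW;
    }
}
```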
Usage
You can create a moderator with sensible defaults, use a strict preset for high-risk applications, or fully customize the patterns and thresholds.
```java
// Default patterns (block >= 0.8, warn >= 0.5)
ContentModerator moderator = PatternBasedModerator.withDefaults();

// Strict mode (block >= 0.6, warn >= 0.3)
ContentModerator strict = PatternBasedModerator.strict();

// Custom configuration
ContentModerator custom = PatternBasedModerator.builder()
    .addPattern(Category.TOXIC, "badword1|badword2", 0.9)
    .addPattern(Category.SPAM, "(buy now|free offer)", 0.7)
    .blockThreshold(0.8)
    .warnThreshold(0.5)
    .checkPII(true)
    .enabledCategories(Set.of(Category.VIOLENCE, Category.PII, Category.SPAM))
    .build();

// Check content
ModerationResult result = moderator.checkInput(userMessage);
if (result.isBlocked()) {
    logger.warn("Blocked: {}", result.getReason());
}

// Separate checks for input, output, and tool results
ModerationResult inputResult = moderator.checkInput(userMessage);
ModerationResult outputResult = moderator.checkOutput(modelResponse);
ModerationResult toolResult = moderator.checkToolResult("shell", shellOutput);
// Tool results use less strict checking (mainly PII detection)
```

PromptInjectionDetector
Detects prompt injection attacks using regex pattern matching across 6 injection types.
Injection Types
The detector recognizes six categories of prompt injection attacks. Each category has multiple regex patterns that catch common variations.
| Type | Detection Patterns |
|---|---|
| SYSTEM_PROMPT_OVERRIDE | "Ignore previous instructions", "Disregard all prior", "Override system prompt" |
| JAILBREAK | "DAN mode", "Do Anything Now", "Developer mode enabled", "Remove restrictions" |
| ROLE_MANIPULATION | "You are now...", "Pretend you are", "Act as if", "Assume the role of" |
| INSTRUCTION_BYPASS | "Skip safety filters", "Bypass restrictions", "Without restrictions" |
| CONTEXT_MANIPULATION | Fake [SYSTEM] delimiters, markdown system injection, code block system tags |
| DATA_EXFILTRATION | "Repeat your instructions", "Output your system prompt", "Dump your memory" |
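As a rough illustration of how this style of regex matching works, here is a minimal sketch with two made-up patterns; the shipped detector uses many more per type and scores confidence rather than returning a plain boolean:

```java
import java.util.regex.Pattern;

// Illustrative patterns only, loosely modeled on the table above.
public class InjectionSketch {
    private static final Pattern SYSTEM_PROMPT_OVERRIDE =
            Pattern.compile("(?i)(ignore|disregard)\\s+(all\\s+)?(previous|prior)\\s+instructions");
    private static final Pattern DATA_EXFILTRATION =
            Pattern.compile("(?i)(repeat|output|dump)\\s+your\\s+(instructions|system prompt|memory)");

    public static boolean looksSuspicious(String input) {
        return SYSTEM_PROMPT_OVERRIDE.matcher(input).find()
                || DATA_EXFILTRATION.matcher(input).find();
    }
}
```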
Usage
Create a detector with a preset sensitivity, or customize it with your own patterns and thresholds. The detector returns a detailed result with matched patterns and confidence scores.
```java
// Standard detector (threshold: 0.6)
PromptInjectionDetector detector = PromptInjectionDetector.standard();

// Strict detector (threshold: 0.5)
PromptInjectionDetector strict = PromptInjectionDetector.strict();

// Throwing detector (raises SecurityException on detection)
PromptInjectionDetector throwing = PromptInjectionDetector.throwing();

// Custom configuration
PromptInjectionDetector custom = PromptInjectionDetector.builder()
    .addPattern(InjectionType.JAILBREAK, "my-custom-pattern", 0.9)
    .confidenceThreshold(0.5)
    .throwOnDetection(true)
    .build();

// Check input
InjectionDetectionResult result = detector.detect(userInput);
if (result.isInjectionDetected()) {
    System.out.println("Types: " + result.getDetectedTypes());
    System.out.println("Patterns: " + result.getMatchedPatterns());
    if (result.shouldBlock()) {
        // Block at the detector's configured threshold
    }
}

// Convenience methods
boolean safe = detector.isSafe(input);        // No injection detected
boolean block = detector.shouldBlock(input);  // Exceeds threshold
```

SandboxedExecutor
Executes untrusted code in a restricted environment using SandboxSpec constraints.
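The timeout constraint alone can be approximated with a plain `ExecutorService`. This hypothetical sketch does not enforce the memory or I/O limits a real sandbox applies:

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Runs a task on a worker thread and abandons it if it exceeds the deadline.
public class TimeoutSketch {
    public static <T> T runWithTimeout(Callable<T> task, long millis) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            // get(...) throws TimeoutException when the deadline passes
            return pool.submit(task).get(millis, TimeUnit.MILLISECONDS);
        } finally {
            pool.shutdownNow(); // interrupt a still-running task
        }
    }
}
```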
```java
SandboxSpec spec = SandboxSpec.builder()
    .timeout(Duration.ofSeconds(30))
    .maxMemory(256_000_000) // 256 MB
    .allowNetwork(false)
    .allowFileRead(true)
    .allowFileWrite(false)
    .build();

SandboxedExecutor executor = new SandboxedExecutor(spec);
Object result = executor.execute(() -> runUntrustedCode());
```

Audit Infrastructure
AuditEvent and AuditStore
The audit system records security-relevant events (access decisions, encryption operations, moderation results) for compliance and debugging. You can choose between in-memory storage for development or file-based storage for production.
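A bounded in-memory store like the development default can be sketched with a deque. All names here are hypothetical; the library defines its own `AuditEvent` and `AuditStore` types:

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Keeps at most `capacity` events, evicting the oldest when full.
public class BoundedAuditStoreSketch {
    public record Event(long timestamp, String type, String detail) {}

    private final Deque<Event> events = new ArrayDeque<>();
    private final int capacity;

    public BoundedAuditStoreSketch(int capacity) { this.capacity = capacity; }

    public synchronized void record(Event e) {
        if (events.size() == capacity) events.removeFirst(); // drop oldest
        events.addLast(e);
    }

    public synchronized List<Event> all() { return List.copyOf(events); }
}
```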
```java
// In-memory (bounded, for development/testing)
AuditStore store = new InMemoryAuditStore();

// File-based (persistent)
AuditStore store = new FileAuditStore(Path.of("/var/log/tnsai/audit.json"));
```

ValidationService
Comprehensive input validation with configurable limits and dangerous pattern detection.
Detected Dangerous Patterns
The validation service checks input strings against common attack patterns. When blockDangerousPatterns is enabled (the default), inputs containing these patterns are rejected.
| Pattern Type | Examples |
|---|---|
| SQL Injection | UNION SELECT, DROP TABLE, --, INSERT INTO |
| Command Injection | $(...), backticks, ; rm, ; curl, ; sudo |
| Path Traversal | ../, ..\\, %2e%2e%2f |
| XSS | <script>, javascript:, onclick=, <iframe> |
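A minimal sketch of this style of pattern screening, using an illustrative subset of the classes in the table (not the library's actual pattern set):

```java
import java.util.regex.Pattern;

// Hypothetical sketch: a handful of regexes standing in for the real lists.
public class DangerSketch {
    private static final Pattern SQL_INJECTION =
            Pattern.compile("(?i)union\\s+select|drop\\s+table|insert\\s+into");
    private static final Pattern PATH_TRAVERSAL =
            Pattern.compile("(?i)\\.\\./|\\.\\.\\\\|%2e%2e%2f");
    private static final Pattern XSS =
            Pattern.compile("(?i)<script|javascript:|<iframe");

    public static boolean containsDangerousPatterns(String input) {
        return SQL_INJECTION.matcher(input).find()
                || PATH_TRAVERSAL.matcher(input).find()
                || XSS.matcher(input).find();
    }
}
```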
Usage
The validation service provides multiple presets: a permissive default, a strict mode for high-security applications, and a fully customizable builder.
```java
// Default settings
ValidationService validator = ValidationService.withDefaults();

// Strict settings
ValidationService strict = ValidationService.strict();
// maxStringLength: 10,000, maxParameterCount: 20
// maxNestingDepth: 5, maxArraySize: 100

// Custom configuration
ValidationService custom = ValidationService.builder()
    .maxStringLength(10_000)
    .maxParameterCount(50)
    .maxNestingDepth(5)
    .blockDangerousPatterns(true)
    .allowedFields(Set.of("name", "query", "path"))
    .addPattern("email", Pattern.compile("^[^@]+@[^@]+$"))
    .build();

// Validate string input
ValidationResult result = validator.validateString(userInput, "message");
if (!result.isValid()) {
    result.getErrors().forEach(System.err::println);
}

// Validate parameters (recursive, checks nesting depth)
ValidationResult paramResult = validator.validateParameters(params);

// Validate-and-throw
validator.validateOrThrow(input, "userInput"); // throws SecurityException

// Sanitization (escapes HTML, removes null bytes, truncates)
String safe = validator.sanitize(rawInput);

// Quick danger check
boolean dangerous = validator.containsDangerousPatterns(input);
```

Default Limits
These are the default and strict configuration limits. The strict preset is recommended for applications that accept untrusted user input.
| Setting | Default | Strict |
|---|---|---|
| maxStringLength | 100,000 | 10,000 |
| maxParameterCount | 100 | 20 |
| maxNestingDepth | 10 | 5 |
| maxArraySize | 1,000 | 100 |
| blockDangerousPatterns | true | true |
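The recursive nesting-depth check mentioned above (validateParameters) can be sketched like this; the exact depth-counting semantics are an assumption of the sketch:

```java
import java.util.Collection;
import java.util.Map;

// Hypothetical sketch: each map or collection consumes one level of depth.
public class DepthSketch {
    public static boolean tooDeep(Object value, int maxDepth) {
        if (maxDepth < 0) return true; // budget exhausted
        if (value instanceof Map<?, ?> m) {
            for (Object v : m.values()) if (tooDeep(v, maxDepth - 1)) return true;
        } else if (value instanceof Collection<?> c) {
            for (Object v : c) if (tooDeep(v, maxDepth - 1)) return true;
        }
        return false; // scalars never exceed the budget
    }
}
```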
SecretFilter (Pre-LLM Scrubbing)
Automatically detects and redacts secrets from content before it reaches the LLM. Uses 59 regex patterns covering API keys, tokens, passwords, connection strings, and other credentials. Package: com.tnsai.quality.security.
```java
SecretFilter filter = new SecretFilter();
String sanitized = filter.scrub(userInput);
// "My key is sk-abc123xyz" -> "My key is [REDACTED:API_KEY]"

// Check without redacting
boolean hasSecrets = filter.containsSecrets(userInput);

// Get detailed findings
List<SecretFinding> findings = filter.scan(userInput);
for (SecretFinding finding : findings) {
    System.out.println(finding.type());     // e.g., "AWS_ACCESS_KEY"
    System.out.println(finding.location()); // character offset
}
```

Detected secret types include:
| Category | Examples |
|---|---|
| Cloud | AWS access keys, GCP service accounts, Azure connection strings |
| API Keys | OpenAI, Anthropic, Stripe, Twilio, SendGrid, Slack tokens |
| Auth | JWT tokens, OAuth secrets, Bearer tokens, Basic auth credentials |
| Database | JDBC URLs with passwords, MongoDB URIs, Redis URLs |
| Crypto | Private keys (RSA, EC), SSH keys, PGP keys |
| Infrastructure | Docker registry credentials, Kubernetes secrets, Terraform state |
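A redaction pass of this kind reduces to regex find-and-replace. This sketch uses two illustrative patterns (a generic `sk-` key and an AWS access key ID); the shipped filter's 59 patterns and its redaction labels may differ:

```java
import java.util.regex.Pattern;

// Hypothetical sketch of scrub(): replace each match with a typed marker.
public class RedactSketch {
    private static final Pattern GENERIC_SK_KEY = Pattern.compile("sk-[A-Za-z0-9]{6,}");
    private static final Pattern AWS_ACCESS_KEY = Pattern.compile("AKIA[0-9A-Z]{16}");

    public static String scrub(String input) {
        String out = GENERIC_SK_KEY.matcher(input).replaceAll("[REDACTED:API_KEY]");
        return AWS_ACCESS_KEY.matcher(out).replaceAll("[REDACTED:AWS_ACCESS_KEY]");
    }
}
```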
Wire into an agent to automatically scrub all inputs:
```java
Agent agent = AgentBuilder.create()
    .model("claude-sonnet-4")
    .secretFilter(new SecretFilter())
    .build();
// All user inputs and tool results are scrubbed before reaching the LLM
```

PermissionLevel (3-Tier Permissions)
TnsAI uses a simple three-tier permission model to control which tools an agent can execute. This is the foundation for the tool approval system -- each tool is assigned a permission level that determines whether it runs automatically, requires user confirmation, or is blocked entirely.
```java
public enum PermissionLevel {
    ALWAYS, // Tool executes without any approval
    ASK,    // Requires user confirmation before execution
    DENY    // Tool is blocked entirely
}
```

Configure permissions per tool:
```java
Map<String, PermissionLevel> permissions = Map.of(
    "file_read", PermissionLevel.ALWAYS,
    "file_write", PermissionLevel.ASK,
    "shell", PermissionLevel.ASK,
    "git_write", PermissionLevel.DENY
);
ToolFilter toolFilter = ToolFilter.fromPermissions(permissions);
agent.setToolCallFilter(toolFilter);
```

ToolFilter (Glob Allow/Deny)
The ToolFilter lets you define allow and deny rules using glob patterns, so you can control tool access with wildcard matching instead of listing every tool individually. This is useful when you have many tools and want broad category-level permissions.
```java
ToolFilter filter = ToolFilter.builder()
    .allow("file_*")     // allow all file tools
    .allow("search_*")   // allow all search tools
    .deny("file_write")  // except file_write
    .deny("shell")       // block shell access
    .defaultPermission(PermissionLevel.ASK) // ask for unlisted tools
    .build();
agent.setToolCallFilter(filter);
```

Rules are evaluated in order: the first matching rule wins. The defaultPermission applies when no glob pattern matches.
```java
// Check a tool explicitly
PermissionLevel fileLevel = filter.getPermission("file_read");   // ALWAYS
PermissionLevel shellLevel = filter.getPermission("shell");      // DENY
PermissionLevel otherLevel = filter.getPermission("unknown");    // ASK (default)
```
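The glob evaluation can be sketched by translating each glob to a regex and scanning the rules in insertion order. Note that under strict first-match semantics, a narrow deny (e.g. `file_write`) must be registered before the broad allow (`file_*`) it carves out. All names in this sketch are hypothetical:

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical sketch: ordered glob rules, first match wins, with a fallback.
public class GlobFilterSketch {
    public enum Level { ALWAYS, ASK, DENY }

    private final Map<Pattern, Level> rules = new LinkedHashMap<>();
    private final Level fallback;

    public GlobFilterSketch(Level fallback) { this.fallback = fallback; }

    // Translates a simple glob ('*' wildcard only; other regex metacharacters
    // are not escaped in this sketch) into an anchored regex.
    public GlobFilterSketch rule(String glob, Level level) {
        rules.put(Pattern.compile("^" + glob.replace("*", ".*") + "$"), level);
        return this;
    }

    public Level permission(String tool) {
        for (Map.Entry<Pattern, Level> e : rules.entrySet()) {
            if (e.getKey().matcher(tool).matches()) return e.getValue();
        }
        return fallback; // no rule matched
    }
}
```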