ELIDA Policy Rules Reference

This document contains the complete set of default policy rules available in ELIDA's presets. These rules are built into the code and activated via the preset configuration option.

Available Presets

Preset	Description	Use Case
`minimal`	Basic rate limiting only	Development/testing
`standard`	OWASP LLM Top 10 basics + rate limits	Production default
`strict`	Full OWASP + NIST + PII detection	High-security environments

Configuration

policy:
  enabled: true
  preset: standard  # Use built-in rules

  # Add custom rules (appended to preset):
  rules:
    - name: "my_custom_rule"
      type: "content_match"
      target: "request"
      patterns: ["my-pattern"]
      severity: "warning"
      action: "flag"
      description: "My custom rule"

Rule Target Options

Target	Description	Use Case
`request`	Only scan request bodies	Prompt injection, malicious input
`response`	Only scan response bodies	XSS, dangerous code in output
`both`	Scan both (default if omitted)	PII, credentials
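
As a rough illustration of the table above, the following Python sketch picks the rules that apply to a given direction. The `rule_applies` helper and the dict-based rule shape are assumptions made for this example, not ELIDA's internal API; the only behavior taken from this document is that an omitted `target` is treated as `both`.

```python
def rule_applies(rule: dict, direction: str) -> bool:
    """Return True if a rule should scan the given direction.

    direction is "request" or "response". A missing target is treated
    as "both", matching the default described in the table above.
    """
    if direction not in ("request", "response"):
        raise ValueError(f"unknown direction: {direction!r}")
    target = rule.get("target", "both")
    return target == "both" or target == direction


rules = [
    {"name": "prompt_injection_ignore", "target": "request"},
    {"name": "output_script_injection", "target": "response"},
    {"name": "encoding_evasion"},  # no target: defaults to "both"
]

request_rules = [r["name"] for r in rules if rule_applies(r, "request")]
response_rules = [r["name"] for r in rules if rule_applies(r, "response")]
```

In this sketch, `encoding_evasion` ends up in both lists because it declares no target.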

Streaming Response Behavior

For response-side rules with streaming responses:

- flag action: Response streams normally, async scan after delivery (no latency impact)
- block action: Response buffered before delivery (adds latency but prevents delivery)

Recommendation: Use flag for monitoring, block only for critical patterns where you accept the latency cost.

Rate Limiting / Firewall Rules

These rules protect against resource exhaustion and runaway agents.

- name: "rate_limit_high"
  type: "requests_per_minute"
  threshold: 60
  severity: "critical"
  action: "block"
  description: "FIREWALL: Request rate exceeds 60/min - potential DoS or runaway agent"

- name: "rate_limit_warning"
  type: "requests_per_minute"
  threshold: 30
  severity: "warning"
  action: "flag"
  description: "FIREWALL: Elevated request rate (30/min)"

- name: "high_request_count"
  type: "request_count"
  threshold: 100
  severity: "warning"
  action: "flag"
  description: "FIREWALL: Session exceeded 100 requests"

- name: "very_high_request_count"
  type: "request_count"
  threshold: 500
  severity: "critical"
  action: "block"
  description: "FIREWALL: Session exceeded 500 requests - potential abuse"

- name: "long_running_session"
  type: "duration"
  threshold: 1800  # 30 minutes
  severity: "warning"
  action: "flag"
  description: "FIREWALL: Session running longer than 30 minutes"

- name: "excessive_session_duration"
  type: "duration"
  threshold: 3600  # 1 hour
  severity: "critical"
  action: "block"
  description: "FIREWALL: Session exceeded 1 hour - likely runaway agent"

- name: "large_response"
  type: "bytes_out"
  threshold: 10485760  # 10MB
  severity: "warning"
  action: "flag"
  description: "FIREWALL: Large data transfer (>10MB response)"

- name: "excessive_data_transfer"
  type: "bytes_total"
  threshold: 52428800  # 50MB
  severity: "critical"
  action: "block"
  description: "FIREWALL: Excessive data transfer (>50MB total)"
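
To make the threshold semantics concrete, here is a minimal Python sketch that checks a session's counters against a subset of the rules above. The `SessionStats` class and the `violations` function are hypothetical names for this example. It also assumes that "exceeds" means strictly greater than the threshold, which this document does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class SessionStats:
    requests_per_minute: float
    request_count: int
    duration: float      # seconds
    bytes_out: int
    bytes_total: int


# A subset of the default rules above, as (name, type, threshold, action).
RULES = [
    ("rate_limit_warning", "requests_per_minute", 30, "flag"),
    ("rate_limit_high", "requests_per_minute", 60, "block"),
    ("high_request_count", "request_count", 100, "flag"),
    ("excessive_session_duration", "duration", 3600, "block"),
    ("large_response", "bytes_out", 10 * 1024 * 1024, "flag"),
]


def violations(stats: SessionStats) -> list[tuple[str, str]]:
    """Return (rule name, action) for every rule whose threshold is exceeded.

    Assumption: "exceeds" means strictly greater than the threshold.
    """
    hits = []
    for name, rule_type, threshold, action in RULES:
        # The rule type doubles as the name of the counter to compare.
        if getattr(stats, rule_type) > threshold:
            hits.append((name, action))
    return hits


stats = SessionStats(
    requests_per_minute=45, request_count=120,
    duration=900, bytes_out=2_000_000, bytes_total=5_000_000,
)
result = violations(stats)
```

With these example counters, only the two `flag` rules fire: the session is above the warning thresholds but below every `block` threshold.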

Statistical Anomaly Detection

These rules use statistical methods to detect abnormal session behavior that evades static thresholds. Designed for agentic traffic where legitimate bursts are common.

Rate Anomaly (Poisson)

End-of-session retrospective check. It splits the request timestamps into a baseline window and a test window, then uses the Poisson survival function to decide whether the test-window rate is a statistically abnormal spike relative to the baseline.

- name: "rate_anomaly"
  type: "rate_anomaly"
  threshold_float: 0.01   # p-value threshold (lower = stronger evidence required, fewer alerts)
  min_samples: 10          # Minimum requests before evaluating
  severity: "warning"
  action: "flag"
  description: "ANOMALY: Request rate statistically abnormal (p<0.01)"

Presets: standard (p<0.01, warning), strict (p<0.001, critical)
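
The check described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than ELIDA's implementation: the session is split at its temporal midpoint (the document does not say how the windows are chosen), and `poisson_tail` and `rate_anomaly_p_value` are hypothetical names.

```python
import math


def poisson_tail(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed as 1 - P(X <= k - 1)."""
    if k <= 0:
        return 1.0
    if lam <= 0:
        return 0.0
    # Each pmf term is evaluated in log space so large rates do not underflow.
    cdf = sum(
        math.exp(-lam + i * math.log(lam) - math.lgamma(i + 1))
        for i in range(k)
    )
    return max(0.0, 1.0 - cdf)


def rate_anomaly_p_value(timestamps: list[float], min_samples: int = 10) -> float:
    """p-value for the request count observed in the test window.

    Assumption: the first half of the session (by time) is the baseline
    window and the second half is the test window.
    """
    if len(timestamps) < min_samples:
        return 1.0  # not enough data to evaluate
    ts = sorted(timestamps)
    start, end = ts[0], ts[-1]
    half = (end - start) / 2.0
    if half <= 0:
        return 1.0
    mid = start + half
    baseline = sum(1 for t in ts if t < mid)
    observed = len(ts) - baseline
    expected = (baseline / half) * half  # baseline rate * test window length
    return poisson_tail(observed, expected)


steady = [float(i) for i in range(20)]                      # 1 request/s
spike = [float(i) for i in range(10)] + [10 + 0.25 * i for i in range(40)]
```

A rule with `threshold_float: 0.01` would flag the session when the returned p-value is below 0.01. In this sketch the `spike` session (1 request/s followed by 4 requests/s) falls far below that, while the `steady` session does not.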

Compound Anomaly (Adaptive CUSUM + Entropy)

Real-time, agent-first detection. Uses adaptive CUSUM for rate tracking and Shannon entropy for content analysis. Only alarms when both rate and entropy are elevated simultaneously -- eliminating false positives from normal agent execution bursts (high rate, low entropy).

- name: "compound_anomaly"
  type: "compound_anomaly"
  threshold_float: 0.15    # Compound score threshold (rate_score * entropy_score)
  min_samples: 5           # Warmup requests per burst before evaluating
  severity: "warning"
  action: "flag"
  description: "ANOMALY: Sustained high-rate + high-entropy burst"

How it works:

- EMA rate tracking (half-life 5s) adapts to phase transitions (plan/execute/review)
- CUSUM accumulation (slack 3.0, threshold 10.0) builds evidence of sustained deviation
- Incremental Shannon entropy tracks content character (structured JSON ~4.0 vs base64 ~6.0)
- Multiplicative scoring: score = rate_score * entropy_score -- any normal signal zeroes the score

Presets: standard (threshold 0.15, flag), strict (threshold 0.10, block)
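
The interaction of the pieces listed under "How it works" can be sketched in Python. The half-life, slack, and CUSUM threshold come from the text above; everything else is an assumption made for illustration, in particular how the two signals are normalized to the range 0 to 1 (`rate_score` as CUSUM divided by its threshold, `entropy_score` as a linear ramp from 4.0 to 6.0 bits). `CompoundDetector` is a hypothetical name and this is not ELIDA's implementation.

```python
HALF_LIFE_S = 5.0    # EMA half-life (from the text)
CUSUM_SLACK = 3.0    # CUSUM slack (from the text)
CUSUM_LIMIT = 10.0   # CUSUM threshold (from the text)
ENTROPY_LOW = 4.0    # assumed floor: structured JSON / English
ENTROPY_HIGH = 6.0   # assumed ceiling: base64


def clamp01(x: float) -> float:
    return max(0.0, min(1.0, x))


class CompoundDetector:
    def __init__(self) -> None:
        self.last_ts = None
        self.ema_rate = None
        self.cusum = 0.0

    def observe(self, ts: float, entropy_bits: float) -> float:
        """Feed one request; return the compound score in [0, 1]."""
        if self.last_ts is None:
            self.last_ts = ts
            return 0.0
        dt = max(ts - self.last_ts, 1e-3)
        self.last_ts = ts
        rate = 1.0 / dt  # instantaneous requests per second
        if self.ema_rate is None:
            self.ema_rate = rate
            return 0.0
        # CUSUM: accumulate only deviation above baseline + slack.
        self.cusum = max(0.0, self.cusum + rate - self.ema_rate - CUSUM_SLACK)
        # Update the EMA baseline afterwards so it lags the burst.
        alpha = 1.0 - 0.5 ** (dt / HALF_LIFE_S)
        self.ema_rate += alpha * (rate - self.ema_rate)
        rate_score = clamp01(self.cusum / CUSUM_LIMIT)
        entropy_score = clamp01(
            (entropy_bits - ENTROPY_LOW) / (ENTROPY_HIGH - ENTROPY_LOW)
        )
        return rate_score * entropy_score  # either signal at 0 zeroes the score


def score_after_burst(burst_entropy: float) -> float:
    """Five requests at 1/s, then one request 50 ms later."""
    det = CompoundDetector()
    for t in range(5):
        det.observe(float(t), 4.0)
    return det.observe(4.05, burst_entropy)
```

Under these assumptions, a sudden burst carrying base64-like content (`score_after_burst(6.0)`) exceeds the 0.15 threshold, while the same burst carrying JSON-like content (`score_after_burst(4.0)`) scores zero, which mirrors the "high rate, low entropy" case described above.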

Content Entropy (Shannon)

Standalone entropy check for detecting obfuscated/encoded content. Strict preset only -- code content can naturally reach 5.0-5.5 and cause false positives.

- name: "content_entropy_high"
  type: "content_entropy"
  target: "request"
  threshold_float: 5.5     # Bits per byte (base64 ~6.0, English ~4.0)
  min_samples: 50           # Minimum content bytes before evaluating
  severity: "warning"
  action: "flag"
  description: "ANOMALY: High entropy content detected"

Entropy reference values:

Content Type	Entropy (bits/byte)	Detected?
Structured JSON	4.0-4.5	No
Natural English	4.0-4.5	No
Hex-encoded	3.9-4.0	No (use `encoding_evasion` regex)
URL-encoded	4.5-5.2	Borderline
Base64-encoded	5.8-6.0	Yes
Compressed/encrypted	7.5-8.0	Yes

Presets: strict only
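
For reference, byte-level Shannon entropy can be computed as in the short Python sketch below. The function names are hypothetical, and the strict greater-than comparison against `threshold_float` is an assumption; the defaults mirror the `content_entropy_high` rule above.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    # Sum of p * log2(1 / p) over the observed byte values.
    return sum((c / n) * math.log2(n / c) for c in Counter(data).values())


def is_high_entropy(data: bytes, threshold: float = 5.5, min_bytes: int = 50) -> bool:
    """Evaluate the threshold only once at least min_bytes are available."""
    if len(data) < min_bytes:
        return False
    return shannon_entropy(data) > threshold
```

A buffer made of a single repeated byte has entropy 0, and a buffer in which all 256 byte values occur equally often reaches the maximum of 8 bits per byte.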

OWASP LLM01 - Prompt Injection (REQUEST-SIDE)

Detects attempts to override system prompts or manipulate model behavior.

- name: "prompt_injection_ignore"
  type: "content_match"
  target: "request"
  patterns:
    - "ignore\\s+(all\\s+)?(previous|prior|above)\\s+(instructions|prompts|rules)"
    - "disregard\\s+(all\\s+)?(previous|prior|system)\\s+(instructions|prompts)"
    - "forget\\s+(all\\s+)?(previous|prior|your)\\s+(instructions|training|rules)"
    - "override\\s+(system|previous|all)\\s+(prompt|instructions)"
  severity: "critical"
  action: "block"
  description: "LLM01: Prompt injection attempt - instruction override"

- name: "prompt_injection_jailbreak"
  type: "content_match"
  target: "request"
  patterns:
    - "you\\s+are\\s+now\\s+(DAN|a\\s+new|an?\\s+unrestricted)"
    - "pretend\\s+(you\\s+are|to\\s+be)\\s+(DAN|unrestricted|unfiltered)"
    - "act\\s+as\\s+(DAN|an?\\s+unrestricted|an?\\s+unfiltered)"
    - "enable\\s+(DAN|developer|jailbreak)\\s+mode"
    - "DAN\\s+mode\\s+(enabled|activated|on)"
    - "jailbreak(ed)?\\s+(mode|prompt|enabled)"
  severity: "critical"
  action: "terminate"
  description: "LLM01: Prompt injection attempt - jailbreak/DAN pattern"

- name: "prompt_injection_system"
  type: "content_match"
  target: "request"
  patterns:
    - "\\[system\\].*\\[/system\\]"
    - "<system>.*</system>"
    - "system\\s*:\\s*you\\s+are"
    - "new\\s+system\\s+prompt"
    - "inject.*system.*prompt"
  severity: "critical"
  action: "block"
  description: "LLM01: Prompt injection attempt - system prompt manipulation"

- name: "prompt_injection_delimiter"
  type: "content_match"
  target: "request"
  patterns:
    - "```\\s*(system|admin|root)"
    - "---+\\s*(new\\s+instructions|ignore\\s+above)"
    - "\\*\\*\\*\\s*(admin|system|override)"
  severity: "warning"
  action: "flag"
  description: "LLM01: Suspicious delimiter pattern in prompt"
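
The `content_match` patterns above can be tried out with a small Python sketch. Two caveats: the patterns are presumably evaluated by ELIDA's own regex engine, so Python's `re` module is only a stand-in, and case-insensitive matching is an assumption here (this document states it only for `tool_argument_pattern`). `first_match` is a hypothetical helper.

```python
import re

# Two patterns from prompt_injection_ignore. In the YAML they are written with
# doubled backslashes ("\\s") because of double-quoted string escaping; the
# regex engine sees a single backslash, written here as Python raw strings.
PATTERNS = [
    r"ignore\s+(all\s+)?(previous|prior|above)\s+(instructions|prompts|rules)",
    r"disregard\s+(all\s+)?(previous|prior|system)\s+(instructions|prompts)",
]


def first_match(text: str, patterns: list[str], ignore_case: bool = True):
    """Return the first pattern found anywhere in text, or None.

    Assumption: case-insensitive by default; pass ignore_case=False to
    see how the rule behaves if matching is case-sensitive.
    """
    flags = re.IGNORECASE if ignore_case else 0
    for pattern in patterns:
        if re.search(pattern, text, flags):
            return pattern
    return None


hit = first_match("Please IGNORE all previous instructions.", PATTERNS)
miss = first_match("Summarize the previous instructions.", PATTERNS)
```

Running candidate prompts through a helper like this is a quick way to check a custom pattern for false positives before adding it to the `rules` section.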

OWASP LLM02 - Insecure Output Handling (RESPONSE-SIDE)

Detects dangerous content in LLM responses that could be executed by clients.

- name: "output_script_injection"
  type: "content_match"
  target: "response"
  patterns:
    - "<script[^>]*>"
    - "javascript:"
    - "on(click|load|error|mouseover)\\s*="
    - "eval\\s*\\("
    - "document\\.(write|cookie|location)"
  severity: "warning"
  action: "flag"
  description: "LLM02: Response contains potential XSS/script injection"

- name: "output_sql_content"
  type: "content_match"
  target: "response"
  patterns:
    - "(?i)(insert|update|delete|drop|alter|create)\\s+(into|from|table|database)"
    - "(?i)select\\s+.+\\s+from\\s+.+\\s+where"
    - "(?i);\\s*(drop|delete|truncate)\\s+"
  severity: "warning"
  action: "flag"
  description: "LLM02: Response contains SQL statements"

- name: "output_shell_commands"
  type: "content_match"
  target: "response"
  patterns:
    - "\\$\\s*\\(\\s*(curl|wget|bash|sh)\\s+"
    - "&&\\s*(rm|chmod|chown|sudo)\\s+"
    - "\\|\\s*(bash|sh|python|perl|ruby)\\s*$"
  severity: "warning"
  action: "flag"
  description: "LLM02: Response contains shell command patterns"

- name: "output_dangerous_code"
  type: "content_match"
  target: "response"
  patterns:
    - "pickle\\.loads"
    - "yaml\\.unsafe_load"
    - "eval\\s*\\(.*input"
    - "exec\\s*\\(.*input"
    - "__import__\\s*\\("
  severity: "critical"
  action: "flag"
  description: "LLM02: Response contains unsafe deserialization patterns"

OWASP LLM06 - Sensitive Information Disclosure (BOTH)

Detects PII and other sensitive data in requests and responses, as well as requests that try to extract credentials or bulk records.

- name: "pii_ssn_request"
  type: "content_match"
  target: "both"
  patterns:
    - "social\\s+security\\s+(number|#)"
    - "\\bssn\\b"
    - "\\d{3}-\\d{2}-\\d{4}"
  severity: "warning"
  action: "flag"
  description: "LLM06: SSN pattern detected"

- name: "pii_credit_card"
  type: "content_match"
  target: "both"
  patterns:
    - "credit\\s+card\\s+(number|#|info)"
    - "\\b(?:4[0-9]{12}(?:[0-9]{3})?|5[1-5][0-9]{14}|3[47][0-9]{13})\\b"
    - "\\bcvv\\b"
    - "\\bcvc\\b"
  severity: "warning"
  action: "flag"
  description: "LLM06: Credit card pattern detected"

- name: "credentials_request"
  type: "content_match"
  target: "request"
  patterns:
    - "(show|give|list|extract)\\s+(me\\s+)?(the\\s+)?api[_\\s]?key"
    - "(show|give|list|extract)\\s+(me\\s+)?(the\\s+)?password"
    - "(read|show|cat|display)\\s+(the\\s+)?\\.env\\s+file"
    - "(list|show|dump)\\s+(all\\s+)?credentials"
  severity: "warning"
  action: "flag"
  description: "LLM06: Credentials request"

- name: "pii_bulk_extraction"
  type: "content_match"
  target: "request"
  patterns:
    - "(list|show|give|extract)\\s+(all\\s+)?(user|customer|employee)\\s+(data|info|records)"
    - "dump\\s+(the\\s+)?(database|user\\s+table|customer\\s+data)"
  severity: "warning"
  action: "flag"
  description: "LLM06: Bulk data extraction request"

OWASP LLM07 - Insecure Plugin Design (REQUEST-SIDE)

Monitors tool/function calling patterns for security issues.

- name: "tool_code_execution"
  type: "content_match"
  target: "request"
  patterns:
    - "\"function\"\\s*:\\s*\"(run|execute|eval)_code\""
    - "\"name\"\\s*:\\s*\"(code_interpreter|execute_python|run_script)\""
    - "\"type\"\\s*:\\s*\"code_interpreter\""
  severity: "critical"
  action: "flag"
  description: "LLM07: Tool requests code execution"

- name: "tool_credential_access"
  type: "content_match"
  target: "request"
  patterns:
    - "\"function\"\\s*:\\s*\"(get|read|fetch)_(secret|credential|password|key)\""
    - "\"name\"\\s*:\\s*\"(vault_read|secret_manager|get_api_key)\""
  severity: "critical"
  action: "block"
  description: "LLM07: Tool requests credential access"

- name: "tool_file_access"
  type: "content_match"
  target: "request"
  patterns:
    - "\"function\"\\s*:\\s*\"(read|write|delete|create)_file\""
    - "\"name\"\\s*:\\s*\"file_(read|write|delete|access)\""
  severity: "warning"
  action: "flag"
  description: "LLM07: Tool requests file system access"

- name: "tool_network_access"
  type: "content_match"
  target: "request"
  patterns:
    - "\"function\"\\s*:\\s*\"(http_request|fetch|curl|wget)\""
    - "\"name\"\\s*:\\s*\"(web_request|api_call|http_get|http_post)\""
  severity: "warning"
  action: "flag"
  description: "LLM07: Tool requests network access"

- name: "tool_database_access"
  type: "content_match"
  target: "request"
  patterns:
    - "\"function\"\\s*:\\s*\"(query|sql|database)_\""
    - "\"name\"\\s*:\\s*\"(run_sql|db_query|execute_query)\""
  severity: "warning"
  action: "flag"
  description: "LLM07: Tool requests database access"

OWASP LLM08 - Excessive Agency (REQUEST-SIDE)

Detects requests for dangerous system access.

- name: "shell_execution"
  type: "content_match"
  target: "request"
  patterns:
    - "(run|execute)\\s+(a\\s+)?(bash|shell|terminal)\\s+(command|script)"
    - "bash\\s+-c\\s+"
    - "/bin/(ba)?sh\\s+"
  severity: "critical"
  action: "block"
  description: "LLM08: Shell execution request"

- name: "destructive_file_ops"
  type: "content_match"
  target: "request"
  patterns:
    - "rm\\s+(-rf?|--recursive)\\s+/"
    - "rm\\s+-rf\\s+\\*"
    - "(delete|remove|wipe)\\s+all\\s+(files|data|everything)"
  severity: "critical"
  action: "terminate"
  description: "LLM08: Destructive file operation"

- name: "privilege_escalation"
  type: "content_match"
  target: "request"
  patterns:
    - "sudo\\s+"
    - "(run|execute)\\s+(as|with)\\s+root"
    - "privilege\\s+(escalation|elevation)"
  severity: "critical"
  action: "block"
  description: "LLM08: Privilege escalation attempt"

- name: "network_exfiltration"
  type: "content_match"
  target: "request"
  patterns:
    - "curl.*\\|\\s*(ba)?sh"
    - "wget.*\\|\\s*(ba)?sh"
    - "reverse\\s+shell"
  severity: "critical"
  action: "terminate"
  description: "LLM08: Data exfiltration attempt"

- name: "sql_injection"
  type: "content_match"
  target: "request"
  patterns:
    - "drop\\s+(table|database)\\s+"
    - ";\\s*(drop|delete|truncate|update)\\s+"
    - "union\\s+select"
    - "'\\s*or\\s+'?1'?\\s*=\\s*'?1"
  severity: "critical"
  action: "terminate"
  description: "LLM08: SQL injection attempt"

- name: "network_scanning"
  type: "content_match"
  target: "request"
  patterns:
    - "nmap\\s+"
    - "port\\s+scan"
    - "(scan|enumerate)\\s+(the\\s+)?(network|ports|hosts)"
  severity: "warning"
  action: "flag"
  description: "LLM08: Network reconnaissance"

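OWASP LLM09 - Overreliance Mitigation

Supports decision audit logging and confidence tracking by flagging high-stakes domains (medical, legal, financial) and hedged, low-confidence responses.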

- name: "high_stakes_medical"
  type: "content_match"
  target: "both"
  patterns:
    - "(diagnose|diagnosis|treatment|prognosis)\\s+(for|of)"
    - "should\\s+I\\s+(take|stop|start)\\s+.*(medication|medicine|drug)"
    - "(medical|health)\\s+(advice|recommendation|decision)"
  severity: "warning"
  action: "flag"
  description: "LLM09: High-stakes medical domain - requires human verification"

- name: "high_stakes_legal"
  type: "content_match"
  target: "both"
  patterns:
    - "(legal|law)\\s+(advice|recommendation|opinion)"
    - "(sue|lawsuit|litigation|liability)\\s+"
    - "is\\s+(this|it)\\s+(legal|illegal|lawful|unlawful)"
  severity: "warning"
  action: "flag"
  description: "LLM09: High-stakes legal domain - requires human verification"

- name: "high_stakes_financial"
  type: "content_match"
  target: "both"
  patterns:
    - "(invest|investment|portfolio)\\s+(advice|recommendation|decision)"
    - "should\\s+I\\s+(buy|sell|hold|invest)\\s+"
    - "(financial|money|investment)\\s+(advice|decision|recommendation)"
  severity: "warning"
  action: "flag"
  description: "LLM09: High-stakes financial domain - requires human verification"

- name: "low_confidence_hedging"
  type: "content_match"
  target: "response"
  patterns:
    - "I('m|\\s+am)\\s+not\\s+(sure|certain|confident)"
    - "I\\s+(think|believe|guess|suppose)\\s+(that\\s+)?(it|this)\\s+(might|may|could)"
    - "(please|you\\s+should)\\s+(verify|confirm|check|consult)"
  severity: "info"
  action: "flag"
  description: "LLM09: Low-confidence response detected - verify before acting"

OWASP LLM10 - Model Theft (REQUEST-SIDE)

Detects attempts to extract model weights, architecture, or training data.

- name: "model_extraction"
  type: "content_match"
  target: "request"
  patterns:
    - "(extract|dump|export)\\s+(the\\s+)?(model|weights|parameters)"
    - "(what|describe)\\s+(is|are)\\s+your\\s+(weights|parameters|architecture)"
  severity: "warning"
  action: "flag"
  description: "LLM10: Model extraction attempt"

- name: "training_data_extraction"
  type: "content_match"
  target: "request"
  patterns:
    - "(what|which)\\s+(data|dataset|examples)\\s+(were|was)\\s+(you|the\\s+model)\\s+trained\\s+on"
    - "(show|give|list)\\s+me\\s+(examples|samples)\\s+(from|of)\\s+(your|the)\\s+training"
    - "repeat\\s+(exactly|verbatim|word\\s+for\\s+word)"
  severity: "warning"
  action: "flag"
  description: "LLM10: Training data extraction attempt"

- name: "model_replication"
  type: "content_match"
  target: "request"
  patterns:
    - "(create|build|train|replicate)\\s+(a\\s+)?(copy|clone|replica)\\s+of\\s+(you|this\\s+model)"
    - "(distill|compress|extract)\\s+(your|the\\s+model's?)\\s+(knowledge|capabilities)"
    - "knowledge\\s+distillation"
  severity: "warning"
  action: "flag"
  description: "LLM10: Model replication/distillation attempt"

NIST AI RMF - Anomaly Detection

General anomaly and abuse detection patterns.

- name: "automated_abuse_pattern"
  type: "content_match"
  patterns:
    - "\\{\\{.*\\}\\}"
    - "\\$\\{.*\\}"
    - "<%.*%>"
  severity: "warning"
  action: "flag"
  description: "NIST: Template/variable injection pattern detected"

- name: "encoding_evasion"
  type: "content_match"
  patterns:
    - "base64\\s+(decode|encode)"
    - "\\\\x[0-9a-fA-F]{2}"
    - "\\\\u[0-9a-fA-F]{4}"
  severity: "warning"
  action: "flag"
  description: "NIST: Encoding/obfuscation pattern detected"

- name: "resource_exhaustion"
  type: "content_match"
  patterns:
    - "(generate|create|write)\\s+(a\\s+)?(very\\s+)?(long|huge|massive|infinite)"
    - "repeat\\s+(this|the\\s+following)\\s+(forever|infinitely|1000000)"
    - "loop\\s+(forever|infinitely)"
  severity: "warning"
  action: "flag"
  description: "LLM04: Potential resource exhaustion attempt"

Tool Call Policy Rules (RESPONSE-SIDE)

Block specific tool calls by name or inspect tool arguments for dangerous patterns. These rules evaluate the LLM's response when it decides to call tools, before the response reaches the client.

Supports OpenAI, Anthropic, Mistral, llama.cpp, Ollama, and any provider using OpenAI-compatible or standard tool call formats.

`tool_blocked` - Deny list by tool name

Uses glob patterns (matched with Go's `filepath.Match`) to match tool/function names.

- name: "block_dangerous_tools"
  type: "tool_blocked"
  target: "response"
  patterns:
    - "exec_*"     # Any tool starting with exec_
    - "shell_*"    # Any tool starting with shell_
    - "rm_*"       # Any tool starting with rm_
    - "sudo_*"     # Any tool starting with sudo_
    - "eval_*"     # Any tool starting with eval_
  severity: "critical"
  action: "block"
  description: "LLM07: Block dangerous tool calls"
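
The glob semantics can be approximated in Python with `fnmatch.fnmatchcase`, as sketched below. This is only an approximation of `filepath.Match`: the two differ in details such as character-class negation (`[^...]` in Go, `[!...]` in Python) and in Go's `*` not crossing path separators, neither of which matters for the simple prefix patterns used here. `blocked_tools` is a hypothetical helper.

```python
from fnmatch import fnmatchcase

BLOCKED_PATTERNS = ["exec_*", "shell_*", "rm_*", "sudo_*", "eval_*"]


def blocked_tools(tool_names: list[str], patterns: list[str] = BLOCKED_PATTERNS) -> list[str]:
    """Return the tool names that match at least one deny-list glob."""
    return [
        name for name in tool_names
        if any(fnmatchcase(name, pattern) for pattern in patterns)
    ]


# Hypothetical tool names, for illustration only.
denied = blocked_tools(["exec_python", "read_file", "shell_run", "exec"])
```

Note that `exec` on its own is not matched by `exec_*`, because the pattern requires the underscore.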

`tool_argument_pattern` - Regex match on tool arguments

Inspects the JSON-encoded arguments string of tool calls using regex patterns (case-insensitive).

- name: "dangerous_tool_arguments"
  type: "tool_argument_pattern"
  target: "response"
  patterns:
    - "rm\\s+-rf"           # Recursive force delete
    - "chmod\\s+777"        # World-writable permissions
    - "curl.*\\|.*sh"       # Pipe curl output to shell
  severity: "critical"
  action: "terminate"
  description: "LLM08: Dangerous patterns in tool arguments"

Both rule types are included in the standard and strict presets.
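
As an illustration of argument inspection, the Python sketch below walks an OpenAI-style chat completion response, where each tool call carries its arguments as a JSON-encoded string, and applies the three patterns above case-insensitively. The `dangerous_calls` helper and the tool names `run_command` and `list_files` are hypothetical. ELIDA's own parsing, which also covers other provider formats, is not shown here.

```python
import re

ARG_PATTERNS = [r"rm\s+-rf", r"chmod\s+777", r"curl.*\|.*sh"]


def dangerous_calls(response: dict) -> list[str]:
    """Return the names of tool calls whose argument string matches a pattern.

    Assumes the OpenAI chat completion shape:
    choices[].message.tool_calls[].function.{name, arguments}
    """
    hits = []
    for choice in response.get("choices", []):
        message = choice.get("message", {})
        for call in message.get("tool_calls") or []:
            function = call.get("function", {})
            arguments = function.get("arguments", "")  # JSON-encoded string
            if any(re.search(p, arguments, re.IGNORECASE) for p in ARG_PATTERNS):
                hits.append(function.get("name", ""))
    return hits


response = {
    "choices": [{
        "message": {
            "tool_calls": [
                {"type": "function", "function": {
                    "name": "run_command",
                    "arguments": '{"cmd": "RM -rf /tmp/build"}'}},
                {"type": "function", "function": {
                    "name": "list_files",
                    "arguments": '{"path": "/tmp"}'}},
            ]
        }
    }]
}
flagged = dangerous_calls(response)
```

Because the patterns run against the raw argument string rather than a parsed object, they match regardless of which argument key carries the command.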

Creating Custom Rules

You can add your own rules by defining them in the rules section of your config:

policy:
  enabled: true
  preset: standard  # Start with standard rules

  rules:
    # Custom content rule
    - name: "block_competitor_mentions"
      type: "content_match"
      target: "request"
      patterns:
        - "use\\s+(competitor|rival)\\s+instead"
      severity: "warning"
      action: "flag"
      description: "Custom: Competitor mention detected"

    # Custom tool call rule - block specific tools
    - name: "block_internal_tools"
      type: "tool_blocked"
      patterns:
        - "deploy_*"
        - "database_drop"
      severity: "critical"
      action: "block"
      description: "Custom: Block deployment and database tools"

    # Custom argument inspection
    - name: "block_production_args"
      type: "tool_argument_pattern"
      patterns:
        - "production"
        - "prod-db"
      severity: "critical"
      action: "block"
      description: "Custom: Block tool calls targeting production"

Custom rules are appended to preset rules, so you get both.

Rule Type Reference

Type	Description	Key Field	Target
`content_match`	Regex match on request/response body	`patterns` (regex)	request, response, both
`content_entropy`	Shannon entropy of content	`threshold_float` (bits/byte)	request, response, both
`tool_blocked`	Deny list by tool name	`patterns` (glob)	response
`tool_argument_pattern`	Regex match on tool arguments	`patterns` (regex)	response
`bytes_in` / `bytes_out` / `bytes_total`	Byte threshold	`threshold`	—
`request_count`	Request count threshold	`threshold`	—
`requests_per_minute`	Rate limit	`threshold`	—
`duration`	Session duration (seconds)	`threshold`	—
`tokens_in` / `tokens_out` / `tokens_total`	Token threshold	`threshold`	—
`tokens_per_minute`	Token rate limit	`threshold`	—
`tool_call_count`	Total tool calls threshold	`threshold`	—
`tool_fanout`	Distinct tools used threshold	`threshold`	—
`rate_anomaly`	Poisson rate anomaly (end-of-session)	`threshold_float` (p-value)	—
`compound_anomaly`	Adaptive CUSUM + entropy (real-time)	`threshold_float` (compound score)	—
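
The table above can double as a checklist when writing custom rules. The Python sketch below encodes the "Key Field" column and reports rules that lack the field their type requires. It is a convenience check under the assumption that the key field is mandatory for each type; `check_rule` is a hypothetical helper, and ELIDA's own config validation may differ.

```python
# Key field per rule type, transcribed from the table above.
KEY_FIELD = {
    "content_match": "patterns",
    "content_entropy": "threshold_float",
    "tool_blocked": "patterns",
    "tool_argument_pattern": "patterns",
    "rate_anomaly": "threshold_float",
    "compound_anomaly": "threshold_float",
}
for threshold_type in (
    "bytes_in", "bytes_out", "bytes_total",
    "request_count", "requests_per_minute", "duration",
    "tokens_in", "tokens_out", "tokens_total", "tokens_per_minute",
    "tool_call_count", "tool_fanout",
):
    KEY_FIELD[threshold_type] = "threshold"


def check_rule(rule: dict) -> list[str]:
    """Return a list of problems found in a rule definition (empty if none)."""
    rule_type = rule.get("type")
    if rule_type not in KEY_FIELD:
        return [f"unknown type: {rule_type!r}"]
    key = KEY_FIELD[rule_type]
    if key not in rule:
        return [f"{rule_type} rule is missing '{key}'"]
    return []


ok = check_rule({"name": "long_running_session", "type": "duration", "threshold": 1800})
bad = check_rule({"name": "my_custom_rule", "type": "content_match"})
```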