Filter overview

The 17 filters grith runs on every supervised call, organised into three phases.

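Every call your agent makes is run past 17 filters, organised into three phases. Most filters have one job: contribute a number to the composite score. A few are hard gates that return DENY instead of a number (see the Score column below). For calls that no hard gate denies, the threshold table at the end of the pipeline routes the call to one of three outcomes: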

  • Auto-allow — composite score below proxy.auto_allow_threshold (default 3.0)
  • Quarantine — composite score between the thresholds; routed to the digest queue
  • Auto-deny — composite score above proxy.auto_deny_threshold (default 8.0)
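The routing rule above can be sketched in a few lines. This is an illustration only, not grith's implementation: the `Outcome` enum and `route` function are hypothetical names, and the treatment of a score that lands exactly on a threshold (quarantined here) is an assumption, since the text only says "below", "between", and "above".

```python
from enum import Enum


class Outcome(Enum):
    AUTO_ALLOW = "auto-allow"
    QUARANTINE = "quarantine"
    AUTO_DENY = "auto-deny"


def route(
    score: float,
    auto_allow_threshold: float = 3.0,  # default for proxy.auto_allow_threshold
    auto_deny_threshold: float = 8.0,   # default for proxy.auto_deny_threshold
) -> Outcome:
    """Map a composite score to one of the three outcomes."""
    if score < auto_allow_threshold:
        return Outcome.AUTO_ALLOW
    if score > auto_deny_threshold:
        return Outcome.AUTO_DENY
    # Assumption: scores exactly equal to a threshold fall into quarantine.
    return Outcome.QUARANTINE
```

For example, `route(2.9)` yields auto-allow, `route(5.0)` yields quarantine, and `route(8.5)` yields auto-deny.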

Filters within a phase run in parallel. Phases run in order: static → pattern → context. The per-phase latency budgets are roughly <1ms, ~3ms, and ~5ms respectively, for a total of ~10ms on a typical call and ~15ms in the worst case.
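One way to picture "parallel within a phase, sequential across phases" is the asyncio sketch below. It assumes each filter is an async callable that returns a score contribution; `run_pipeline`, `PHASE_ORDER`, and the filter signature are hypothetical names chosen for the example and say nothing about how grith actually schedules its filters.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical filter shape: takes the call, returns a score contribution.
Filter = Callable[[dict], Awaitable[float]]

PHASE_ORDER = ("static", "pattern", "context")


async def run_pipeline(call: dict, filters: dict[str, list[Filter]]) -> list[float]:
    """Run phases in order; run the filters inside each phase concurrently."""
    contributions: list[float] = []
    for phase in PHASE_ORDER:
        # gather() schedules every filter of this phase at once and waits
        # for all of them before the next phase starts.
        results = await asyncio.gather(*(f(call) for f in filters.get(phase, [])))
        contributions.extend(results)
    return contributions
```

Because a phase only finishes when its slowest filter does, a latency budget in this model naturally applies to the phase as a whole rather than to any single filter.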

The 17 filters

Security filters

| # | Name | Phase | Latency budget | Score | Summary |
|---|------|-------|----------------|-------|---------|
| 1 | Operation risk scoring | Static | <1ms | +1 to +3 | Assigns a baseline risk score based on the call's operation class (file_read, file_write, shell, network, etc). |
| 2 | Static path matching | Static | <1ms | +2 to +5 | Aho-Corasick scan of paths against curated denylists and allowlists. Hot: runs against every fs operation. |
| 3 | Sensitive path heuristic | Static | <1ms | +1 to +4 | Detects access to known-sensitive files: .env*, id_rsa, id_ed25519, credentials.json, .aws/, .ssh/, .kube/, and similar. |
| 4 | Allowlist / denylist | Static | <1ms | -1 to +3 | User-managed allow and deny rules. Approved entries lower the score; explicit denies raise it. Edited from the digest review UI. |
| 5 | Argument length & structure | Static | <1ms | 0 to +2 | Flags suspicious argument shapes: encoded payloads, oversized strings, shell metacharacters in unexpected positions. |
| 6 | Capability enforcement | Static | <1ms | DENY | Hard gate: enforces per-profile capability grants. Returns DENY when the active supervisor profile does not permit this operation class. |
| 7 | Secret / credential scanning | Pattern | ~3ms | +3 to +5 | Scans content for embedded credentials. 1,600+ regex patterns covering provider API keys, tokens, private keys, and bespoke credential formats. |
| 8 | Command structure analysis | Pattern | ~3ms | +2 to +4 | Parses shell commands. Catches dangerous patterns: piped curl-to-sh, sudo with arbitrary scripts, fork-bombs, base64-encoded payloads in subshells. |
| 9 | Egress policy | Pattern | ~3ms | -1 to +5 | Network destination policy. Allowlisted hosts lower the score; denylisted or unknown hosts raise it. Plays with the reputation filter for adaptive trust. |
| 10 | DLP gate | Pattern | ~3ms | +3 to +5 | Scans outbound network payloads for sensitive data leaving the box: credentials, PII, large file contents bundled into POST bodies. |
| 11 | Canary secret detection | Pattern | ~3ms | DENY | Hard DENY. Detects registered canary tokens (fake secrets seeded in your repo) reaching any outbound sink, which is definitive proof of exfiltration. |
| 12 | Destination reputation | Context | ~5ms | -1 to +4 | Per-destination trust accumulated over time. Established trusted hosts get a discount; suspicious or unfamiliar ones get a markup. |
| 13 | Behavioural anomaly | Context | ~5ms | +1 to +3 | Compares the current call against the session's historical baseline (default 200-call minimum). Sudden bursts, unusual destinations, or out-of-pattern operations contribute score. |
| 14 | Taint tracking | Context | ~5ms | 0 to +5 | Tracks data flow from sensitive sources (e.g. a .env read) through subsequent operations. Network calls or shell execs after a sensitive read inherit taint. |
| 15 | Session containment | Context | ~5ms | 0 to +5 | Enforces per-session isolation zones. A session bound to project A cannot reach project B's files or network without explicit cross-zone approval. |
| 16 | Rate limiting | Pattern | ~3ms | DENY | Per-minute caps on file writes (default 30), shell execs (20), and network requests (60). Bursts trigger queue; sustained violations trigger DENY. |
| 17 | Semantic analysis | Pattern | ~3ms | | Stub for v0.1. Local embedding-based intent classification. Disabled by default; full implementation planned for v1.5. |
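As an illustration of one row, the per-minute caps of the rate limiting filter (row 16) could be modelled as a sliding-window counter. This is a sketch under stated assumptions, not grith's code: the `PerMinuteLimiter` class is hypothetical, the mapping of caps onto the operation class names `file_write`, `shell`, and `network` is assumed, and the escalation from queueing bursts to DENY on sustained violations is deliberately left out.

```python
from collections import deque
from typing import Optional

# Defaults from the table; keying them by operation class is an assumption.
DEFAULT_CAPS = {"file_write": 30, "shell": 20, "network": 60}


class PerMinuteLimiter:
    """Sliding-window counter: at most `cap` events per class per window."""

    def __init__(self, caps: Optional[dict] = None, window_s: float = 60.0):
        self.caps = dict(caps or DEFAULT_CAPS)
        self.window_s = window_s
        self._events: dict = {}

    def allow(self, op_class: str, now: float) -> bool:
        cap = self.caps.get(op_class)
        if cap is None:
            return True  # no cap configured for this operation class
        window = self._events.setdefault(op_class, deque())
        # Drop timestamps that have aged out of the window.
        while window and now - window[0] >= self.window_s:
            window.popleft()
        if len(window) >= cap:
            return False  # over the cap; the caller decides queue vs. DENY
        window.append(now)
        return True
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would more likely supply `time.monotonic()`.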

Reading the table

  • Phase determines when the filter runs. Static filters do cheap structural checks against pre-built indexes. Pattern filters do regex / parser work. Context filters look up state from the running session and historical baseline.
  • Latency budget is the wall-clock target for the phase, not the individual filter. Filters within a phase are concurrent.
  • Score is the inclusive range a filter can contribute. Negative values lower the composite score (i.e. argue for allow). A value of DENY means the filter is a hard gate — when it fires, the call is denied regardless of any other filter.

What's not visible in the table

  • The scoring engine doesn't sum scores naively. It applies weights and ceilings from [reputation] (e.g. ceiling_filter_threshold = 5.0 for capped contributions). See Composite scoring.
  • The reputation system can subtract from the composite when the destination and call shape have been observed-and-approved many times — see Adaptive reputation.
  • Canary detection and capability enforcement are hard gates; they don't contribute score, they short-circuit to DENY. Rate limiting is also listed as DENY, but only sustained violations trigger it; bursts are queued.
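To make the difference from a naive sum concrete, here is a minimal sketch of a composite that honours hard gates and a per-filter ceiling. Everything in it is an assumption made for illustration: the `composite` function, the `weights` mapping, and in particular the reading of `ceiling_filter_threshold` as a cap on each filter's weighted contribution. The actual rules are described under Composite scoring.

```python
from typing import Optional, Union

DENY = "DENY"  # sentinel a hard-gate filter returns instead of a number


def composite(
    contributions: dict,
    weights: Optional[dict] = None,
    ceiling: float = 5.0,  # assumed meaning of ceiling_filter_threshold
) -> Union[float, str]:
    """Combine per-filter results into one score, or DENY if any gate fired."""
    weights = weights or {}
    # A hard gate wins regardless of what any other filter contributed.
    if any(value == DENY for value in contributions.values()):
        return DENY
    total = 0.0
    for name, value in contributions.items():
        weighted = value * weights.get(name, 1.0)  # unlisted filters weigh 1.0
        total += min(weighted, ceiling)            # cap a single filter's pull
    return total
```

With contributions of 2.0, -1.0, and 7.0 and no weights, the last value is capped at 5.0 and the composite is 6.0, which would land in quarantine under the default thresholds.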

Last updated: 2026-05-14
© 2026 grith. All rights reserved.