Filter overview

The 18 filters grith runs on every supervised call, organised into three phases.

Every call your agent makes passes through 18 filters, organised into three phases. Each filter contributes a number to the composite score, except for a few hard gates that deny the call outright. The threshold table at the end of the pipeline then routes the call to one of three outcomes:

Auto-allow — composite score below proxy.auto_allow_threshold (default 3.0)
Quarantine — composite score between the thresholds; routed to the digest queue
Auto-deny — composite score above proxy.auto_deny_threshold (default 8.0)
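
The routing above can be sketched as a small function. This is a minimal illustration, not grith's implementation: the `Outcome` enum and `route` function are hypothetical names, and the handling of a score that lands exactly on a threshold (treated here as quarantine) is an assumption, since the text only says "below", "between", and "above".

```python
from enum import Enum


class Outcome(Enum):
    AUTO_ALLOW = "auto-allow"
    QUARANTINE = "quarantine"
    AUTO_DENY = "auto-deny"


def route(score: float,
          auto_allow_threshold: float = 3.0,
          auto_deny_threshold: float = 8.0) -> Outcome:
    """Map a composite score to one of the three outcomes."""
    if score < auto_allow_threshold:
        return Outcome.AUTO_ALLOW
    if score > auto_deny_threshold:
        return Outcome.AUTO_DENY
    # Assumption: a score exactly on either threshold is quarantined.
    return Outcome.QUARANTINE
```

With the default thresholds, `route(1.0)` auto-allows, `route(5.0)` quarantines, and `route(9.5)` auto-denies.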

Filters within a phase run in parallel. Phases run in order: static → pattern → context. The per-phase latency budget is roughly <1ms / ~3ms / ~5ms, which adds up to ~10ms typical and ~15ms worst-case for the whole pipeline.
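
The execution model (phases in sequence, filters within a phase concurrently) might look like the following sketch. It uses `asyncio` purely for illustration; the `Filter` signature, `run_phases`, and the three stub filters are hypothetical and say nothing about how grith actually schedules its filters.

```python
import asyncio
from typing import Awaitable, Callable

# Hypothetical filter signature: takes the call, returns a score contribution.
Filter = Callable[[dict], Awaitable[float]]


async def run_phases(call: dict, phases: list[list[Filter]]) -> list[float]:
    """Run phases one after another; run each phase's filters concurrently."""
    scores: list[float] = []
    for filters in phases:  # static, then pattern, then context
        results = await asyncio.gather(*(f(call) for f in filters))
        scores.extend(results)  # gather preserves the order of its arguments
    return scores


# Stub filters with made-up scoring rules, for demonstration only.
async def operation_risk(call: dict) -> float:
    return 3.0 if call.get("op") == "shell" else 1.0


async def argument_shape(call: dict) -> float:
    return 2.0 if len(call.get("args", "")) > 4096 else 0.0


async def session_baseline(call: dict) -> float:
    return 1.0


def demo() -> list[float]:
    call = {"op": "shell", "args": "ls"}
    phases = [[operation_risk], [argument_shape], [session_baseline]]
    return asyncio.run(run_phases(call, phases))
```

Because each phase is awaited before the next begins, a later phase never starts until every filter in the earlier phase has returned.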

The 18 filters

Security filters

#	Name	Phase	Latency budget	Score	Summary
1	Operation risk scoring	Static	<1ms	+1 to +3	Assigns a baseline risk score based on the call's operation class (file_read, file_write, shell, network, etc).
2	Static path matching	Static	<1ms	+2 to +5	Aho-Corasick scan of paths against curated denylists and allowlists. Hot — runs against every fs operation.
3	Sensitive path heuristic	Static	<1ms	+1 to +4	Detects access to known-sensitive files: .env*, id_rsa, id_ed25519, credentials.json, .aws/, .ssh/, .kube/, and similar.
4	Allowlist / denylist	Static	<1ms	-1 to +3	User-managed allow and deny rules. Approved entries lower the score; explicit denies raise it. Edited from the digest review UI.
5	Argument length & structure	Static	<1ms	0 to +2	Flags suspicious argument shapes: encoded payloads, oversized strings, shell metacharacters in unexpected positions.
6	Capability enforcement	Static	<1ms	DENY	Hard gate: enforces per-profile capability grants. Returns DENY when the active supervisor profile does not permit this operation class.
7	Secret / credential scanning	Pattern	~3ms	+3 to +5	Scans content for embedded credentials. 1,620+ regex patterns covering provider API keys, tokens, private keys, and bespoke credential formats.
8	Command structure analysis	Pattern	~3ms	+2 to +4	Parses shell commands. Catches dangerous patterns: piped curl-to-sh, sudo with arbitrary scripts, fork-bombs, base64-encoded payloads in subshells.
9	Destructive action coverage	Pattern	~3ms	DENY	Hard-denies catastrophic, irreversible host/storage destruction (filesystem format, raw block-device overwrite, recursive removal of a system root or database data directory) and escalates destructive operations directed at production (managed-DB endpoints, prod/live-tagged resources) to deny; other destructive operations queue for review. Also scores docker/podman run host-escalation.
10	Egress policy	Pattern	~3ms	-1 to +5	Network destination policy. Allowlisted hosts lower the score; denylisted or unknown hosts raise it. Works together with the destination reputation filter for adaptive trust.
11	DLP gate	Pattern	~3ms	+3 to +5	Scans outbound network payloads for sensitive data leaving the box: credentials, PII, large file contents bundled into POST bodies.
12	Canary secret detection	Pattern	~3ms	DENY	Hard DENY. Detects registered canary tokens (fake secrets seeded in your repo) reaching any outbound sink — definitive proof of exfiltration.
13	Destination reputation	Context	~5ms	-1 to +4	Per-destination trust accumulated over time. Established trusted hosts get a discount; suspicious or unfamiliar ones get a markup.
14	Behavioural anomaly	Context	~5ms	+1 to +3	Compares the current call against the session's historical baseline (default 200-call minimum). Sudden bursts, unusual destinations, or out-of-pattern operations contribute score.
15	Taint tracking	Context	~5ms	0 to +5	Tracks data flow from sensitive sources (e.g. a .env read) through subsequent operations. Network calls or shell execs that reference tainted data inherit taint.
16	Session containment	Context	~5ms	0 to +5	Enforces per-session isolation zones. A session bound to project A cannot reach project B's files or network without explicit cross-zone approval.
17	Rate limiting	Context	~5ms	DENY	Per-minute caps on file writes (default 30), shell execs (20), and network requests (60). Bursts send the call to the queue; sustained violations trigger DENY. Volume penalties are risk-gated by default.
18	Egress rate / burst detection	Context	~5ms	+1 to +3	Per-destination outbound rate and burst detection. Flags rapid bursts of network egress to the same or many destinations — a beaconing or exfil shape that single-call filters miss.

Within each phase, filters run in parallel; phase order is static → pattern → context.
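
As one concrete example from the table, the per-minute caps of the rate limiting filter (row 17) can be sketched as a sliding-window counter. The caps are the defaults stated in the table; the class name, the operation-class keys, and the sliding-window approach itself are assumptions made for illustration, and the sketch does not model the queue-versus-DENY escalation or risk gating.

```python
from collections import deque

# Per-minute caps taken from the defaults listed in the table.
DEFAULT_CAPS = {"file_write": 30, "shell": 20, "network": 60}


class SlidingWindowLimiter:
    """Hypothetical per-operation sliding-window counter."""

    def __init__(self, caps=None, window_seconds=60.0):
        self.caps = dict(caps or DEFAULT_CAPS)
        self.window = window_seconds
        self.events = {op: deque() for op in self.caps}

    def record(self, op: str, now: float) -> bool:
        """Record one call at time `now`; return True if the cap is exceeded."""
        if op not in self.caps:
            return False  # operation classes without a cap are not limited
        timestamps = self.events[op]
        # Drop timestamps that have aged out of the window.
        while timestamps and now - timestamps[0] >= self.window:
            timestamps.popleft()
        timestamps.append(now)
        return len(timestamps) > self.caps[op]
```

Passing `now` explicitly keeps the sketch deterministic; a real caller would likely supply a monotonic clock reading.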

Reading the table

Phase determines when the filter runs. Static filters do cheap structural checks against pre-built indexes. Pattern filters do regex / parser work. Context filters look up state from the running session and historical baseline.
Latency budget is the wall-clock target for the phase, not the individual filter. Filters within a phase are concurrent.
Score is the inclusive range a filter can contribute. Negative values lower the composite score (i.e. argue for allow). A value of DENY means the filter is a hard gate — when it fires, the call is denied regardless of any other filter.

What's not visible in the table

The scoring engine doesn't sum scores naively. It applies weights and ceilings from [reputation] (e.g. ceiling_filter_threshold = 5.0 for capped contributions). See Composite scoring.
The reputation system can subtract from the composite when the destination and call shape have been observed-and-approved many times — see Adaptive reputation.
Hard gates such as canary detection and capability enforcement don't contribute score; they short-circuit to DENY.
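
To make the points above concrete, here is a rough sketch of a non-naive combination step. It assumes that each contribution is multiplied by a per-filter weight and then capped at a ceiling before summing, and that any hard-gate verdict wins outright. The `composite` function, the `DENY` sentinel, the weight values, and the exact order of weighting and capping are all assumptions; the actual rules are described in Composite scoring.

```python
from typing import Optional, Union

DENY = "DENY"  # sentinel for a hard-gate verdict
Contribution = Union[float, str]


def composite(contributions: dict[str, Contribution],
              weights: Optional[dict[str, float]] = None,
              ceiling: float = 5.0) -> Contribution:
    """Combine per-filter contributions into one score, or DENY."""
    weights = weights or {}
    # Hard gates short-circuit: any DENY wins regardless of other scores.
    if any(c == DENY for c in contributions.values()):
        return DENY
    total = 0.0
    for name, score in contributions.items():
        weighted = score * weights.get(name, 1.0)  # default weight is 1.0
        total += min(weighted, ceiling)  # assumed cap on a single contribution
    return total
```

For example, contributions of 3.0, -1.0, and 5.0 with a weight of 2.0 on the last one give 3.0 - 1.0 + 5.0 = 7.0 under these assumptions, because the weighted 10.0 is capped at the ceiling.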
