11. Canary detection

Hard gate. Registered canary tokens leaving the box trigger an immediate DENY.


Phase	Pattern
Score range	DENY (hard)
Module	`crates/grith-proxy/src/filters/canary.rs`
Config file	`config/filters/canary.toml`

The canary filter is a hard gate, like capability enforcement. If a registered canary token appears in an outbound payload, the call is denied unconditionally — no score, no queue, no reputation discount.

See Canary tokens for the concept and Setting up canary tokens for the practical deployment guide.

What it does

For every outbound payload (network send, shell argument that could leak, file write outside the zone), check whether any registered canary value appears in the bytes. If yes:

Return DENY immediately.
Tag the audit record with the canary's id and label.
Trigger high-severity notifications (escapes channel rate limits).
Optionally trigger a session terminate (configurable per-canary).

The check is fast — a single Aho-Corasick scan against the registered token set, which is typically a few dozen entries at most.

Why a hard gate

The whole filter pipeline does probabilistic scoring. Canary detection is the non-probabilistic signal grith has:

The canary value is something a real human deliberately placed.
It has no legitimate reason to exist outside the file it was placed in.
Seeing it in an outbound payload is direct, irrefutable evidence of exfil.

There is no amount of trust, scoring, or context that should make canary detection soften. So it doesn't.

What gets scanned

All bytes of outbound network sends.
Shell command argument strings.
File write payloads when the destination is outside the session zone.
Some indirect channels (DNS query names) when DNS inspection is enabled.

The scanner runs after secret_scan because the work is similar and the canary set is a subset of "potentially-exfilled values" — but its decision power is much greater.

Registering canaries

Canaries are registered via grith canary add — see grith canary. The CLI both generates the token and writes it to the placement path, and registers the exact string in the canary detection set.

When a canary fires

A canary fire is a security event. The default notification severity is critical — pages PagerDuty / Opsgenie even outside normal rate-limit budgets.

The audit record captures:

Canary id + label + format.
Session, profile, originating PID, command line.
Destination of the attempted send.
The full filter context for any other filters that also fired.