Threat model
Formal threat model — what grith defends, what it doesn't, and where the trust boundaries sit.
This page is the formal threat model and the reference for security reviewers. For the high-level conceptual overview, see The threat model (concept).
Asset categorisation
| Asset | Description | Sensitivity |
|---|---|---|
| Credentials at rest | API keys, SSH keys, cloud creds in user dotfiles | Critical |
| Source code | Project files | High |
| Operational data | Customer data, internal docs | High to critical |
| Operating system state | Files, processes, registry-equivalents | Medium-high |
| Network reachability | Outbound destinations and reachability state | Medium |
| Reputation table | Accumulated trust signals | Low (leak is uninteresting; tamper is medium) |
| Audit log | Record of decisions made | Medium (privacy: reveals usage patterns) |
Adversary types
A1: Prompt-injection attacker
- Capability: controls some text the agent will read (README, search result, comment, error message).
- Goal: exfiltrate credentials, deploy malicious code, pivot to other resources.
- Threat: high. Most realistic attack today.
A2: Compromised dependency
- Capability: controls a package or tool the agent will run.
- Goal: as A1 plus persistence.
- Threat: high.
A3: Compromised model
- Capability: malicious model weights, MITM'd inference endpoint, or malicious provider.
- Goal: as A1.
- Threat: medium. Requires significant resources.
A4: Confused agent
- Capability: none. The agent is non-malicious but mistaken.
- Goal: none. Mistakes happen.
- Threat: medium. Often as costly as actual attacks.
A5: Local userland attacker
- Capability: runs as the grith user.
- Goal: bypass grith.
- Threat: out of scope. Enforcing the userland trust boundary is the kernel's job.
A6: Local root attacker
- Capability: root.
- Goal: anything.
- Threat: out of scope.
A7: Kernel-level attacker
- Capability: kernel exploit.
- Goal: anything.
- Threat: out of scope.
A8: Network-layer attacker
- Capability: MITM between grith and grith.ai, or between grith and the agent's provider.
- Goal: license tamper / model MITM / exfil eavesdrop.
- Threat: low (TLS + cert pinning). Out of scope for non-TLS variants.
Filter coverage matrix
| Attack class | Primary defence | Secondary | Notes |
|---|---|---|---|
| Read credential, send out | sensitive_path, taint, dlp, secret_scan | reputation, canary | Canary catches what the other filters miss. |
| Read credential, leak via summary | dlp, secret_scan | egress, reputation | Model summary can defeat taint; dlp catches if content shape persists. |
| Pivot to other project | path_match, session_containment | sensitive_path | Zone enforcement. |
| Privilege escalation | capability, command | rate_limit | Hard gate via capability. |
| Lateral via shell | command, egress | rate_limit, behavioural | Command parser. |
| Persistent backdoor (file write) | path_match, sensitive_path | behavioural | Writes outside project. |
| Beaconing / C2 | egress, reputation | behavioural, rate_limit | Adaptive trust. |
| Confused agent destructive | operation_risk, sensitive_path | path_match | Score lifts on rm-shaped. |
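For review tooling, the matrix above can be treated as data. The following is a minimal sketch that transcribes the table as written; the `COVERAGE` dictionary and both helper functions are hypothetical names introduced for illustration and are not part of grith.

```python
# Transcribed from the coverage matrix above:
# attack class -> (primary defences, secondary defences).
# COVERAGE and the helpers below are illustrative names, not a grith API.
COVERAGE = {
    "Read credential, send out": (
        ["sensitive_path", "taint", "dlp", "secret_scan"],
        ["reputation", "canary"],
    ),
    "Read credential, leak via summary": (
        ["dlp", "secret_scan"],
        ["egress", "reputation"],
    ),
    "Pivot to other project": (
        ["path_match", "session_containment"],
        ["sensitive_path"],
    ),
    "Privilege escalation": (["capability", "command"], ["rate_limit"]),
    "Lateral via shell": (["command", "egress"], ["rate_limit", "behavioural"]),
    "Persistent backdoor (file write)": (
        ["path_match", "sensitive_path"],
        ["behavioural"],
    ),
    "Beaconing / C2": (["egress", "reputation"], ["behavioural", "rate_limit"]),
    "Confused agent destructive": (
        ["operation_risk", "sensitive_path"],
        ["path_match"],
    ),
}


def classes_covered_by(filter_name):
    """Return attack classes listing filter_name as a primary or secondary defence."""
    return sorted(
        attack
        for attack, (primary, secondary) in COVERAGE.items()
        if filter_name in primary or filter_name in secondary
    )


def single_layer_classes():
    """Return attack classes that lack either a primary or a secondary defence."""
    return [
        attack
        for attack, (primary, secondary) in COVERAGE.items()
        if not primary or not secondary
    ]
```

A reviewer could use `classes_covered_by("sensitive_path")` to see which rows lose a layer if that filter fails, and `single_layer_classes()` to confirm that every row keeps both a primary and a secondary defence.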
Defence-in-depth properties
- Composition: every call routes through every applicable filter. No single-filter failure compromises the whole pipeline.
- Determinism: filters don't depend on the model's behaviour, only on the syscall and session state.
- Hard gates: canary detection and capability enforcement bypass scoring to deny unconditionally.
- Cold start: new installs widen the queue zone to avoid pathological auto-allow on uncalibrated state.
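The four properties above can be sketched as a small decision function. This is an illustration under stated assumptions, not grith's implementation: the `Call`, `Session`, and `Finding` types, the three toy filters, the score values, the thresholds, and the `allow`/`queue`/`deny` labels are all hypothetical. Only the properties themselves (every applicable filter runs, hard gates deny regardless of score, the decision depends only on the call and session state, and cold start widens the queue zone) come from the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Call:
    op: str      # hypothetical operation kind, e.g. "read" or "exec"
    target: str  # path or command line


@dataclass
class Session:
    calls_seen: int = 0                    # hypothetical cold-start signal
    canary_paths: frozenset = frozenset()  # paths planted as canaries


@dataclass(frozen=True)
class Finding:
    score: float = 0.0
    hard_deny: bool = False


def canary_filter(call: Call, session: Session) -> Optional[Finding]:
    # Hard gate: touching a canary denies no matter what the score is.
    return Finding(hard_deny=call.target in session.canary_paths)


def sensitive_path_filter(call: Call, session: Session) -> Optional[Finding]:
    # Toy scoring rule (assumption): SSH material raises the score.
    return Finding(score=0.6 if "/.ssh/" in call.target else 0.0)


def operation_risk_filter(call: Call, session: Session) -> Optional[Finding]:
    # Returning None means "this filter does not apply to this call".
    if call.op != "exec":
        return None
    return Finding(score=0.5 if call.target.startswith("rm ") else 0.1)


FILTERS = [canary_filter, sensitive_path_filter, operation_risk_filter]


def decide(call: Call, session: Session, filters=FILTERS, cold_start_calls=100) -> str:
    # Composition: run every filter and keep the findings of those that apply.
    findings = [f for f in (flt(call, session) for flt in filters) if f is not None]
    # Hard gates bypass scoring entirely.
    if any(f.hard_deny for f in findings):
        return "deny"
    score = sum(f.score for f in findings)
    # Cold start: lower the auto-allow ceiling so more calls land in the queue zone.
    allow_below = 0.05 if session.calls_seen < cold_start_calls else 0.3
    if score >= 0.9:
        return "deny"
    return "allow" if score < allow_below else "queue"
```

Because `decide` reads nothing but its arguments, the same call against the same session state always yields the same decision, which is the determinism property in miniature. In this sketch, a low-risk `exec` auto-allows on a calibrated session but is queued on a fresh one.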
What grith doesn't promise
- Bypass impossibility: a determined attacker with a sufficiently clever prompt may find a sequence of calls that each auto-allow individually but together effect a leak. Mitigations: canary, behavioural, cross-session analytics (Pro).
- Future-proof signatures: secret_scan and dlp_gate have curated pattern sets. New credential formats won't be detected until added.
- Side-channel defence: timing-based DNS exfiltration and low-and-slow patterns are partially covered by behavioural, but not formally guaranteed.
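The limitation of curated pattern sets can be illustrated with a toy scanner. This is a sketch only and does not reflect grith's actual pattern set: `PATTERNS` and `scan` are hypothetical names, and the two regular expressions cover just the publicly documented shapes of AWS access key IDs and classic GitHub personal access tokens. The `nv1_` token format is invented for the example.

```python
import re

# Two publicly documented credential shapes. A real curated set would be much
# larger; these entries are illustrative and not taken from grith.
PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "github_pat_classic": re.compile(r"\bghp_[A-Za-z0-9]{36}\b"),
}


def scan(text, patterns=PATTERNS):
    """Return the sorted names of all patterns that match anywhere in text."""
    return sorted(name for name, rx in patterns.items() if rx.search(text))


# An invented token format ("nv1_" plus 40 hex characters) that no pattern knows.
unknown_token = "nv1_" + "0123456789abcdef0123456789abcdef01234567"

# The format becomes detectable only once a pattern for it is added.
extended = dict(PATTERNS, new_vendor_token=re.compile(r"\bnv1_[0-9a-f]{40}\b"))
```

Here `scan("token=" + unknown_token)` returns an empty list, while the same call with `patterns=extended` reports `new_vendor_token`. The gap between a format appearing in the wild and its pattern being added is the window this section describes.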
Last updated: 2026-05-14