Adaptive reputation

How grith gets quieter over time by learning which call shapes you trust.

A grith install on day one is loud. The filters know what looks suspicious but not what's typical for your setup. A week in, you've approved a hundred calls that the filters scored as ambiguous. The reputation system folds those approvals into a per-shape trust score that quietens the digest going forward.

What "reputation" tracks

The trust table keys are tuples of:

  • Operation (file_read, network, shell, …)
  • Destination shape (path prefix, domain, command head)
  • Profile context (which profile was active)

For each key, the table holds:

  • observations — how many times this shape has been seen.
  • denials — how many times it was denied (manually or auto).
  • trust — a smoothed score in [0, 1], roughly "how often this shape was approved".
  • last_seen — for decay.
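
As a rough illustration of the table layout described above, here is a minimal Python sketch. The names ShapeKey, TrustEntry, and table are hypothetical, and the neutral starting trust of 0.5 is an assumption; grith's actual in-memory representation is not documented here.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple


class ShapeKey(NamedTuple):
    """Hypothetical key: one entry per (operation, destination shape, profile)."""
    operation: str    # e.g. "file_read", "network", "shell"
    destination: str  # path prefix, domain, or command head
    profile: str      # profile that was active when the call was seen


@dataclass
class TrustEntry:
    """Hypothetical value stored for each key."""
    observations: int = 0   # how many times this shape has been seen
    denials: int = 0        # how many times it was denied (manually or auto)
    trust: float = 0.5      # smoothed score in [0, 1]; 0.5 as neutral is an assumption
    last_seen: float = 0.0  # Unix timestamp, used for decay


table: Dict[ShapeKey, TrustEntry] = {}

# Record one observation of a network call to an example domain.
key = ShapeKey("network", "api.example.com", "default")
entry = table.setdefault(key, TrustEntry())
entry.observations += 1
```

Because the profile is part of the key, the same destination seen under two profiles accumulates trust separately.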

A new call's filter score is adjusted by the trust value: if trust ≥ reputation.auto_allow_trust (default 0.92) and observations ≥ reputation.auto_allow_min_observations (default 8), the composite score is reduced by up to reputation.max_score_reduction (default 4.0).

The cap matters. A high-trust shape can still land in the digest if a different filter (DLP, canary, taint) lights it up. Trust is a discount, not a bypass.
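
The discount can be sketched as follows. The thresholds and the cap are the documented defaults; the linear scaling between the threshold and full trust, the floor at zero, and the function name discounted_score are assumptions made for illustration, since the exact curve is not specified here.

```python
def discounted_score(composite, trust, observations,
                     auto_allow_trust=0.92,
                     min_observations=8,
                     max_score_reduction=4.0):
    """Return the composite score after an (assumed) trust discount."""
    # Below either threshold, reputation does not touch the score.
    if trust < auto_allow_trust or observations < min_observations:
        return composite
    # Assumption: the reduction grows linearly from 0 at the threshold
    # to max_score_reduction at trust == 1.0.
    fraction = (trust - auto_allow_trust) / (1.0 - auto_allow_trust)
    reduction = max_score_reduction * fraction
    # Assumption: the composite is floored at zero.
    return max(0.0, composite - reduction)


# A fully trusted shape with a strong hit from another filter still scores high.
print(discounted_score(12.0, trust=1.0, observations=50))
```

With a composite of 12.0, the largest possible discount still leaves 8.0, which is the sense in which trust is a discount and not a bypass.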

How trust changes

  • Approve in the digest → trust nudges up.
  • Approve + learn → trust jumps faster; this is the deliberate "yes, please remember this" signal.
  • Deny → trust drops, weighted by reputation.deny_weight (default 3.0, meaning denials are worth 3× as much as approvals in the smoothing).
  • Auto-allow (the call was already auto-allowed by score) → trust nudges up very slightly. The system shouldn't be calibrated by easy approvals.
  • Auto-deny → trust drops.
  • Time → trust decays toward neutral via reputation.decay_lambda (default 0.98, applied per save cycle, default 5 minutes). Trust earned six months ago matters less than trust earned yesterday.
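
One way to picture these updates is as an exponential moving average that pulls trust toward 1 on approval and toward 0 on denial. The sketch below is an assumption, not grith's actual smoothing: the step sizes for approve and approve_learn are hypothetical, while the 3.0 deny weight and the roughly 0.001 auto-allow increment come from this page (their use as moving-average rates is also an assumption).

```python
DENY_WEIGHT = 3.0  # reputation.deny_weight default

# Hypothetical per-event step sizes; only the auto_allow value is documented.
STEP = {
    "approve": 0.05,
    "approve_learn": 0.20,
    "auto_allow": 0.001,
}


def update_trust(trust, event):
    """Move trust toward 1.0 (approvals) or 0.0 (denials)."""
    if event in ("deny", "auto_deny"):
        # Denials count DENY_WEIGHT times as much as a plain approval.
        step = min(1.0, STEP["approve"] * DENY_WEIGHT)
        target = 0.0
    else:
        step = STEP[event]
        target = 1.0
    return trust + step * (target - trust)


trust = 0.5
for event in ["approve", "approve", "approve_learn", "deny"]:
    trust = update_trust(trust, event)
```

Starting from the same value, a single denial in this sketch undoes about three plain approvals, and an auto-allow barely moves the score at all.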

Why decay

Without decay, a one-time bad approval would mark a destination "trusted" forever. Decay lets old signals fade so the trust table reflects recent behaviour, not your worst day from a year ago. The default of 0.98 per save cycle works out to a half-life of roughly 30 days under typical usage; busier agents (more frequent calls) decay faster in elapsed time.
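
For intuition about the decay factor, the sketch below applies one decay step and computes how many decay steps it takes for the distance from neutral to halve. The neutral value of 0.5 and the function names are assumptions; only the 0.98 factor comes from this page.

```python
import math

NEUTRAL = 0.5  # assumed neutral trust value


def decay(trust, decay_lambda=0.98):
    """One decay step: shrink the distance from neutral by decay_lambda."""
    return NEUTRAL + decay_lambda * (trust - NEUTRAL)


def half_life_cycles(decay_lambda=0.98):
    """Number of decay steps after which the distance from neutral has halved."""
    return math.log(0.5) / math.log(decay_lambda)


cycles = half_life_cycles()  # a little over 34 decay steps for 0.98
```

How those steps map to wall-clock time depends on which save cycles actually apply decay to a given shape. This page ties that to usage, so the 30-day figure is best read as an estimate for typical call rates and not as 34 consecutive 5-minute intervals.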

What it can't do

The reputation system cannot promote a call past a hard gate:

  • A canary detection denies, full stop, regardless of trust.
  • A capability deny is enforced regardless of trust.
  • The max_score_reduction cap bounds the discount: even a perfectly trusted shape has its composite lowered by at most that amount, so strong contributions from other filters still show through.
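
The ordering implied by these gates can be sketched as below: hard gates are checked first, and the trust discount is only applied to calls that survive them. The function name decide, the digest threshold of 5.0, and the outcome labels are all hypothetical.

```python
def decide(composite, trust_reduction, canary_hit=False,
           capability_denied=False, digest_threshold=5.0,
           max_score_reduction=4.0):
    """Return "deny", "digest", or "allow" for a single call (illustrative only)."""
    # Hard gates: reputation is never consulted for these.
    if canary_hit or capability_denied:
        return "deny"
    # The discount is capped, so it cannot erase a large composite.
    reduction = min(trust_reduction, max_score_reduction)
    score = max(0.0, composite - reduction)
    return "digest" if score >= digest_threshold else "allow"


# Even with the maximum discount, a canary hit is denied.
print(decide(1.0, trust_reduction=4.0, canary_hit=True))
```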

The reputation system also doesn't share state between sessions started by different users (in multi-user team setups) by default — see Team sync for the Pro flow that does.

Adversarial robustness

A natural worry: what if the agent (or attacker controlling it) does a hundred benign reads to build up reputation, then sneaks one bad call through? Two mitigations:

  1. Only human approvals move the needle fast. Auto-allow contributions are intentionally tiny (reputation.auto_allow_trust_increment ≈ 0.001). An attacker would need years of pure auto-allow to flip a shape from neutral to fully trusted.
  2. Strong filters bypass reputation. Even a fully-trusted destination still goes through the secret scan, DLP gate, canary detection, and taint inheritance on every call. Reputation only knocks the composite down; it doesn't make the filters blind.

The system is designed so that the only way to teach grith something significantly wrong is to keep explicitly approving wrong things in the digest. Which is a different problem.

Inspecting the trust table

grith reputation show

Lists shapes with their trust, observation count, and last-seen time. Sorted by recency by default; --sort trust shows your most-trusted shapes (sometimes surprising).

grith reputation reset

Wipes the table. Useful when starting fresh — for example, after switching agents or after a security audit.

Persistence

The trust table is held in memory and flushed to ~/.cache/grith/reputation/<profile>.bin every reputation.save_interval_seconds (default 300s). Each flush is an atomic write-then-rename, so a crash mid-flush doesn't corrupt the file. The file is binary CBOR and is safe to delete if it does get corrupted (you'll lose accumulated trust, but nothing in the audit log or pending digest).
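
The write-then-rename pattern mentioned above generally looks like the following Python sketch. It illustrates the general technique and is not grith's code; atomic_write is a hypothetical helper, and the CBOR encoding step is left out (the payload is treated as opaque bytes).

```python
import os
import tempfile


def atomic_write(path, payload):
    """Write payload (bytes) to path so readers see the old or new file, never a partial one."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(payload)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before renaming
        os.replace(tmp_path, path)  # atomic replacement of the target
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

If the process dies before os.replace runs, the previous file is untouched and only a stray temporary file is left behind.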

Last updated: 2026-05-14