Your agent just read a file, hit an API, and got user input — now what?

The Lethal Trifecta is three capability classes landing together in one agent session: untrusted input (web pages, user content, repo text), sensitive access (secrets, files, databases), and external comms (network calls, email, webhooks). Static analysis tells you which agents could form the trifecta based on their tool inventory. Runtime enforcement catches the call that does — and stops it.

Figure: Read webpage (untrusted input) → Read secrets (sensitive access) → Send email (external action, blocked). Runtime coloring accumulates session capability classes and blocks the call that would complete the trifecta.

How session coloring works

Every tool call is stamped with one or more capability classes the moment the agent makes it. ActPass accumulates these stamps across the live session. The first read_file on a sensitive path? That's sensitive_access. The earlier fetch to pull a README? untrusted_input. Neither alone is dangerous. But the moment a third call would complete the trifecta — say, a send_email — the enforcement layer already knows what the session has accumulated and blocks before execution.

Rule of Two: any two of the three legs is fine. All three in one session is a stop. The session ends clean; the agent explains why in the systemMessage.
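To make the accumulation concrete, here is a minimal sketch of the Rule of Two check. The capability class names come from the text above; the type and function names (`SessionClasses`, `accumulate`, `wouldCompleteTrifecta`) are hypothetical and are not taken from the ActPass source.

```typescript
// Capability class names follow the article; everything else is a hypothetical sketch.
type CapabilityClass = "untrusted_input" | "sensitive_access" | "external_comms";

// Session state: which legs of the trifecta have been seen so far.
type SessionClasses = Record<CapabilityClass, boolean>;

const emptySession = (): SessionClasses => ({
  untrusted_input: false,
  sensitive_access: false,
  external_comms: false,
});

// Return a new session state with the classes of one tool call merged in.
function accumulate(session: SessionClasses, classes: CapabilityClass[]): SessionClasses {
  const next = { ...session };
  for (const c of classes) next[c] = true;
  return next;
}

// Rule of Two: a call is blocked only if accepting it would leave all three legs set.
function wouldCompleteTrifecta(session: SessionClasses, classes: CapabilityClass[]): boolean {
  const after = accumulate(session, classes);
  return after.untrusted_input && after.sensitive_access && after.external_comms;
}

// Walk through the example from the text: fetch, then read_file, then send_email.
let session = emptySession();
session = accumulate(session, ["untrusted_input"]);  // fetch pulls a README
session = accumulate(session, ["sensitive_access"]); // read_file on a sensitive path
const blocked = wouldCompleteTrifecta(session, ["external_comms"]); // send_email
```

In this walk-through `blocked` is `true`, because the session already holds two legs and `send_email` would add the third. The check runs before the call is merged into the session, which is what lets enforcement stop the call instead of reporting it afterwards.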

What the hook sees

The ActPass PreToolUse hook receives every tool call before Claude or Codex executes it. It runs a deterministic policy check — no LLM in the decision path — and either passes the call through (exit 0) or blocks it with an explanation the agent can relay to you (exit 2). Session state lives in-process; the hook is a 60-second-cached preflight call to the ActPass API plus local file-based policy eval.
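The pass/block contract can be sketched as a function from the hook's input to an exit code. This is an illustration under assumptions, not the ActPass implementation: the `tool_name` and `tool_input` field names are assumed to follow the PreToolUse payload, while `Policy`, `runHook`, and the fail-closed behavior on bad input are hypothetical.

```typescript
// Assumed shape of the PreToolUse payload; other fields are omitted.
interface HookInput {
  tool_name: string;
  tool_input: Record<string, unknown>;
}

interface HookResult {
  exitCode: 0 | 2; // 0 = pass the call through, 2 = block it
  message: string; // explanation the agent can relay when blocked
}

// Hypothetical policy signature: a reason string to block, or null to allow.
type Policy = (input: HookInput) => string | null;

function runHook(rawStdin: string, policy: Policy): HookResult {
  let input: HookInput;
  try {
    input = JSON.parse(rawStdin) as HookInput;
    if (typeof input.tool_name !== "string") throw new Error("missing tool_name");
  } catch {
    // Assumption: fail closed when the payload cannot be read.
    return { exitCode: 2, message: "ActPass: could not parse hook input" };
  }
  const reason = policy(input);
  return reason === null
    ? { exitCode: 0, message: "" }
    : { exitCode: 2, message: reason };
}

// Toy policy for illustration only: block any tool named send_email.
const denyEmail: Policy = (input) =>
  input.tool_name === "send_email" ? "send_email would complete the trifecta" : null;

const allowed = runHook('{"tool_name":"read_file","tool_input":{"path":"README.md"}}', denyEmail);
const denied = runHook('{"tool_name":"send_email","tool_input":{}}', denyEmail);
```

A real hook would read stdin, write `message` to stderr, and exit with `exitCode`. Keeping the decision in a plain function with no LLM call is what makes it deterministic and easy to test.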

The coloring logic lives entirely in lib/actpass/coloring.ts: classesForToolCall() maps a tool name and arguments to capability classes, mergeClasses() accumulates the session set, and evaluateColoring() decides whether the next call would complete the trifecta or mix red/blue MCP colors. The function is pure — no I/O, fully unit-tested.

Monitor mode first, enforce mode when ready

If you're not ready to block, start with --rule-of-two monitor. The hook passes the call through but injects a warning into the agent's systemMessage: “This session has now combined untrusted input + sensitive access + external comms. You are operating in the Lethal Trifecta. Proceed with extreme caution.” The agent sees it, the user's transcript shows it, and it creates a natural pause for review before switching to enforce.

Switch to --rule-of-two enforce when you want hard stops. In a CI/CD pipeline or automated agent loop, monitor is noise — enforce is the only signal that matters.
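The difference between the two modes comes down to one branch. The sketch below assumes a hypothetical `applyMode` function and `Outcome` shape; the monitor warning is quoted from the text above, and the enforce message is made up for illustration.

```typescript
type Mode = "monitor" | "enforce";

// Hypothetical result shape: whether the call proceeds, plus an optional message.
interface Outcome {
  allow: boolean;
  systemMessage?: string;
}

// Warning text as quoted in the article.
const MONITOR_WARNING =
  "This session has now combined untrusted input + sensitive access + external comms. " +
  "You are operating in the Lethal Trifecta. Proceed with extreme caution.";

// Illustrative wording only; the real block message is not given in the article.
const ENFORCE_MESSAGE = "Blocked: this call would complete the Lethal Trifecta.";

function applyMode(mode: Mode, completesTrifecta: boolean): Outcome {
  if (!completesTrifecta) return { allow: true };
  if (mode === "monitor") {
    // Monitor: let the call through, but surface the warning.
    return { allow: true, systemMessage: MONITOR_WARNING };
  }
  // Enforce: hard stop before execution.
  return { allow: false, systemMessage: ENFORCE_MESSAGE };
}

const monitored = applyMode("monitor", true);
const enforced = applyMode("enforce", true);
```

Both modes share the same detection logic, so a session that would be blocked under enforce is exactly the one that gets a warning under monitor. That makes monitor a usable dry run before flipping the flag.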

MCP color enforcement

Anthropic's MCP color spec marks servers as red (untrusted/user-controlled) or blue (external comms). Mixing both in one session is a prompt injection waiting to happen: a red server feeds poisoned content, and the blue server ships it out. The --block-color-mix flag catches the call that would be the second color in a session and blocks before the data moves.
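A color-mix check can be sketched in a few lines. The server names, the `serverColors` registry, and the choice to ignore uncolored servers are all assumptions made for illustration; only the red/blue split comes from the text above.

```typescript
type McpColor = "red" | "blue";

// Hypothetical registry; the server names are made up for this example.
const serverColors: Record<string, McpColor> = {
  "web-fetch": "red", // untrusted / user-controlled content
  "mailer": "blue",   // external comms
};

function colorOf(server: string): McpColor | undefined {
  return serverColors[server];
}

// True when the next call would bring a second color into the session.
// Assumption: servers with no color are not part of this check.
function wouldMixColors(seen: McpColor[], next: McpColor | undefined): boolean {
  if (next === undefined) return false;
  return seen.some((c) => c !== next);
}

// A session that has already used a red server now tries a blue one.
const seenColors: McpColor[] = ["red"];
const mixes = wouldMixColors(seenColors, colorOf("mailer"));
const sameColor = wouldMixColors(seenColors, colorOf("web-fetch"));
```

Here `mixes` is `true` and `sameColor` is `false`: repeated calls to servers of one color are fine, and only the first call of the other color trips the check.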

Zero-friction path to enforcement

The full stack is three steps: run actpass exposure to classify your tool inventory, review the report to confirm which agents hold dangerous combinations, then add the hook to your ~/.claude/settings.json with --rule-of-two enforce. The hook is local, deterministic, and sub-millisecond on cache hit. Your agents don't slow down; they just can't accidentally complete the trifecta.
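For the last step, the settings entry might look like the object below, written as TypeScript so the assumptions can be commented. The `hooks` / `PreToolUse` / `matcher` layout follows the Claude Code settings format as best understood here. The `actpass hook` subcommand is an assumption, since the text names only the `actpass` CLI and the `--rule-of-two enforce` flag, so check the setup guide for the exact command.

```typescript
// Sketch of a ~/.claude/settings.json fragment, built as an object and serialized.
const settings = {
  hooks: {
    PreToolUse: [
      {
        // "*" is intended to match every tool, so each call passes through the hook.
        matcher: "*",
        hooks: [
          {
            type: "command",
            // ASSUMPTION: the "hook" subcommand is hypothetical; only the
            // "actpass" CLI and "--rule-of-two enforce" appear in the article.
            command: "actpass hook --rule-of-two enforce",
          },
        ],
      },
    ],
  },
};

// The JSON text to merge into settings.json.
const settingsJson = JSON.stringify(settings, null, 2);
```

Swapping `enforce` for `monitor` in the command string gives the warn-only variant without touching the rest of the entry.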

Source: Anthropic MCP specification (color annotations); Simon Willison, “The Lethal Trifecta” (2025); Invariant Labs, MCP coloring research.
