fix(firewall): close redaction-leak egress paths and add canary suite (#149, #150, #151, #172, #206)#229
Open
dgenio wants to merge 1 commit into
Open
fix(firewall): close redaction-leak egress paths and add canary suite (#149, #150, #151, #172, #206)#229dgenio wants to merge 1 commit into
dgenio wants to merge 1 commit into
Conversation
Route every Frame/trace egress through the firewall redactor so secrets and PII cannot escape the I-01 boundary via a non-summary path: - #149: redact() fails closed at max_depth — scrub leaf strings, elide nested containers instead of returning them verbatim. - #150: HandleStore.expand() redacts projected rows (inline secrets in grant-permitted fields are scrubbed); expansion Frames carry warnings. - #151: Firewall.apply_stream() holds back a per-field overlap window via a new StreamRedactor so a secret split across chunks is reassembled before emission. - #172: ActionTrace.args (all capabilities) and driver error text pass through the redactor before persistence; memory.* payload stripping unchanged. - #206: new tests/test_firewall_canary.py asserts planted canaries never appear in any egress (summary/table/raw Frames, expansion, streaming, trace args/errors, adapter payloads). Docs (security.md, context_firewall.md) and CHANGELOG updated, including the honest cross-chunk whitespace-pattern limit. https://claude.ai/code/session_01Gq2ooVRbX8rxi5d3dvREN6
There was a problem hiding this comment.
Pull request overview
This PR hardens the context firewall’s redaction guarantees by closing multiple previously-independent egress paths (depth boundary, handle expansion, streaming chunk boundaries, and trace persistence) and adding a canary-based regression suite to enforce the global “no secret escapes” invariant.
Changes:
- Make
redact()fail closed atmax_depth, eliding nested containers while still scrubbing leaf strings. - Add cross-chunk streaming redaction via a stateful
StreamRedactorcarried per top-level field inFirewall.apply_stream(). - Redact sensitive content in handle expansion and in persisted traces (args + error text), and add a comprehensive secret-canary test suite plus doc/changelog updates.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
tests/test_firewall_canary.py |
Adds an end-to-end “canary never appears in any egress” regression suite covering depth, expansion, streaming, traces, and adapter payload rendering. |
tests/test_firewall_boundary.py |
Updates the trace-args test docstring to reflect that args now pass through redaction for all capabilities. |
src/weaver_kernel/kernel/_invoke.py |
Applies redaction to all stored trace args and scrubs driver error text before persisting failure traces. |
src/weaver_kernel/handles.py |
Routes handle expansion previews through redact() and surfaces redaction warnings on the returned Frame. |
src/weaver_kernel/firewall/transform.py |
Adds per-field streaming redaction state (StreamRedactor) and merges stream redaction warnings into emitted Frames. |
src/weaver_kernel/firewall/redaction.py |
Refactors string redaction into _redact_string(), changes max-depth behavior to fail closed, and introduces StreamRedactor. |
docs/security.md |
Documents the newly-covered egress paths and the depth/streaming redaction behaviors and limitations. |
docs/context_firewall.md |
Describes redaction coverage across depth, expansion, streaming, and trace persistence. |
CHANGELOG.md |
Records the security fixes and the added canary regression suite. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What changed
Closes the four context-firewall redaction-leak paths an internal audit identified, and adds the regression net that locks them. One coherent change set, all in the firewall/trace egress area.
src/weaver_kernel/firewall/redaction.pyredact()now fails closed atmax_depth: previously it returned any subtree at/below the depth cap verbatim, so PII/secrets nested beyond the cap reached the LLM unscanned. The boundary now scrubs leaf strings and elides nested containers ([REDACTED: nested data beyond depth limit])._redact_string().StreamRedactor(holdback + overlap window) used for cross-chunk safety.src/weaver_kernel/firewall/transform.py— Add cross-chunk redaction safety for streaming Frames #151:apply_stream()carries a per-fieldStreamRedactor(helper_apply_stream_redactors) so a secret whose characters span two chunks is reassembled and redacted before either half is emitted; stream warnings are surfaced on the Frame.src/weaver_kernel/handles.py— Apply firewall redaction on the handle-expansion path #150:HandleStore.expand()routes its projected rows throughredact()(grant field/scope checks already ran), so an inline secret in a permitted field is scrubbed; expansion Frames now carry redactionwarnings.src/weaver_kernel/kernel/_invoke.py— Extend trace-argument redaction beyondmemory.*capabilities #172:ActionTrace.argsfor every capability (not justmemory.*) and drivererrortext now pass through the redactor before persistence. Memory-payload stripping formemory.*is unchanged.tests/test_firewall_canary.py(new) — Add a secret-canary regression suite covering every Frame egress path #206: plants distinctive canaries and asserts they never appear in any egress — summary/table/raw Frames, handle expansion, streamed chunks, trace args/errors, adapter payloads.tests/test_firewall_boundary.py— updated the now-stale docstring oftest_action_trace_keeps_non_memory_args_verbatimto reflect Extend trace-argument redaction beyondmemory.*capabilities #172 (benign args still pass through; the secret-arg case is pinned in the canary suite).docs/security.md(threat-table rows + honest streaming limit),docs/context_firewall.md(per-path redaction),CHANGELOG.md.Why
The firewall is the I-01 boundary (
docs/agent-context/invariants.md): raw tool output must never reach the LLM unredacted. Redaction was only guaranteed on the firsttransform()summary/table path; expansion, streaming, the depth boundary, and the trace store were independent egress paths that could leak. These four issues were filed together (#206 explicitly frames #149/#150/#151 as the three audit-found leaks), so they share a code area, a bug category, and an implementation path — and are cleanest fixed and regression-tested as one unit.How verified
All commands run in a clean venv (
pip install -e ".[dev]"), mirroringmake ci(fmt-check → lint → type → test → example):ruff format --check src/ tests/ examples/— 96 files already formattedruff check src/ tests/ examples/— All checks passedmypy src/— Success: no issues found in 52 source filespytest -q— 660 passed, 1 skipped (incl. 11 new canary tests)examples/*.py(themake ciexampletarget) — run cleanEach canary test fails on
main(the leak paths) and passes here; the depth/expand/stream/trace cases were confirmed against the pre-fix behavior.Tradeoffs / risks
docs/security.md); contiguous secrets (JWT/Bearer/API-key/connection-string body) are never split across a commit boundary.memory.*capabilities #172 runsredact()over all trace args; benign scalars/values are returned unchanged, so audit value is preserved.Scope notes
Limited to the recommended issue group from the triage (#149, #150, #151, #172, #206). Adjacent firewall items intentionally left as follow-ups: #207 (avoid full-payload serialization in the size hot path — same file, but a perf concern), #174 (summarizer truncation/boolean handling), #178 (operator-configurable redaction patterns), #214 (spotlighting/untrusted-output delimiting).
https://claude.ai/code/session_01Gq2ooVRbX8rxi5d3dvREN6
Generated by Claude Code