Skip to content

fix(hookify): replace hand-rolled YAML parser + fix new_text field on Write#54873

Open
adelaidasofia wants to merge 1 commit intoanthropics:mainfrom
adelaidasofia:fix/hookify-yaml-parser-and-new-text-field
Open

fix(hookify): replace hand-rolled YAML parser + fix new_text field on Write#54873
adelaidasofia wants to merge 1 commit intoanthropics:mainfrom
adelaidasofia:fix/hookify-yaml-parser-and-new-text-field

Conversation

@adelaidasofia
Copy link
Copy Markdown

Summary

Two silent bugs discovered via a full regression harness (39 test cases across all hookify rules):

Bug 1 — config_loader.py: hand-rolled YAML parser double-escapes backslashes

The custom parser in extract_frontmatter stripped surrounding quotes but did not process YAML escape sequences. In double-quoted YAML, \\d means one backslash + d (i.e. \d, a regex digit class). The old parser stored the raw characters between the quotes — two backslashes + d — so the compiled regex pattern matched a literal \d string instead of a digit.

Impact: Any hookify rule using \d, \w, \s, \b, or \[ in a double-quoted pattern: value silently never fired. The rule loaded without error; the condition just always returned False.

Fix: Replace the 85-line hand-rolled parser with yaml.safe_load + a YAMLError guard. Four lines.

Side-effect / authoring note: Unquoted YAML patterns starting with [ (e.g. pattern: [A-Z].*) are now correctly rejected as YAML flow sequences — this was always invalid YAML that the old parser accidentally tolerated. Rule authors must single-quote such patterns: pattern: '[A-Z].*'. This is standard YAML practice.

Bug 2 — rule_engine.py: new_text field on Write returns empty string

_extract_field('new_text', 'Write', tool_input) called tool_input.get('new_string', ''). The Write tool stores its content under content, not new_string (that's Edit). So any rule with field: new_text on a file event silently never fired when the tool was Write.

Fix: One-line change — fall back to content when new_string is absent, matching the existing behaviour of field: content.

Test plan

  • Regression harness: 39 test cases covering positive + negative for every active rule — 39/39 pass after these fixes (was 36/39 before)
  • session-close-sonnet-check (uses \d{8} pattern) — was silently broken, now fires correctly
  • warn-exclamation-marks (uses [A-Z] pattern, single-quoted after authoring fix) — fires correctly
  • block-em-dash, warn-external-prose-humanizer — verified with Write tool using field: new_text
  • No regressions on rules that were already working

🤖 Generated with Claude Code

…on Write

Two bugs discovered via a full regression harness (39 test cases):

1. config_loader: Hand-rolled YAML parser double-escaped backslashes
   Pattern `"\\d{8}"` in double-quoted YAML should load as `\d{8}` (regex
   digit). The custom parser stored it as `\\d{8}` (literal backslash+d),
   silently breaking any rule that used `\d`, `\w`, `\s`, `\b`, or `\[`
   in its pattern. Replaced with `yaml.safe_load` + a YAMLError guard.

   Side-effect: unquoted patterns starting with `[` or `\` (e.g.
   `pattern: [A-Z].*` or `pattern: \[owner::]`) now correctly raise a
   YAMLError. Rule authors must single-quote such patterns, which is
   standard YAML practice and is now documented in the authoring guide.

2. rule_engine: `new_text` field on Write tool returned empty string
   `_extract_field('new_text', 'Write', tool_input)` called
   `tool_input.get('new_string', '')` — but Write uses `content`, not
   `new_string`. Any rule using `field: new_text` on a Write event
   silently never fired. Fixed to fall back to `content` when `new_string`
   is absent, matching the existing behaviour of `field: content`.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant