Skip to content

fix(hooks): stop hook field rename + debug tracing system#228

Open
dean0x wants to merge 11 commits into
mainfrom
fix/stop-hook-field-rename
Open

fix(hooks): stop hook field rename + debug tracing system#228
dean0x wants to merge 11 commits into
mainfrom
fix/stop-hook-field-rename

Conversation

@dean0x
Copy link
Copy Markdown
Owner

@dean0x dean0x commented May 27, 2026

Summary

  • Stop hook fix: response_textlast_assistant_message field rename, removes dead stop_reason filter. The Claude Code Stop hook input format changed (mid-May 2026) which silently broke working memory capture — queue accumulated 1,640+ user-only turns with zero assistant turns.
  • Debug tracing system: Adds consistent DEVFLOW_HOOK_DEBUG=1 debug tracing to all 7 hooks via new scripts/hooks/debug-trace shared helper. Two-phase logging: global fallback pre-CWD, per-project log post-CWD.
  • CLI command: devflow debug --enable/--disable/--status toggles DEVFLOW_HOOK_DEBUG in ~/.claude/settings.json env block.
  • Normal logging: 4 hooks gain standard file logging (sidecar-dispatch, session-start-memory, session-start-context, pre-compact-memory).

Key design decisions

  • dbg() { :; } no-op defined before set -e in every hook — missing debug-trace file never causes hook failure
  • Debug functions short-circuit immediately when DEVFLOW_HOOK_DEBUG!=1 — zero overhead in normal operation (no subshells, no mkdir)
  • jq INPUT keys subshell guarded behind debug check to avoid unnecessary process spawn on every stop hook

Files changed

File Change
scripts/hooks/debug-trace New — shared debug helper
scripts/hooks/sidecar-capture Field rename + replace ad-hoc /tmp/ debug with standardized system
scripts/hooks/sidecar-dispatch Add debug-trace + normal logging
scripts/hooks/sidecar-evaluate Add debug-trace layer
scripts/hooks/session-start-memory Add debug-trace + normal logging
scripts/hooks/session-start-context Add debug-trace + normal logging
scripts/hooks/pre-compact-memory Extract SCRIPT_DIR, add debug-trace + normal logging
scripts/hooks/preamble Add debug-trace (no normal logging)
src/cli/commands/debug.ts New — CLI toggle command
src/cli/cli.ts Register debug command
tests/shell-hooks.test.ts Update field names, add debug-trace syntax check
tests/sentinel.test.ts Update field names

Test plan

  • Build passes (1571 tests, 62 test files)
  • All hooks source debug-trace with safe fallback pattern
  • devflow debug --enable → verify DEVFLOW_HOOK_DEBUG in settings.json
  • devflow debug --disable → verify removal from settings.json
  • With debug ON: run a session, check ~/.devflow/logs/{slug}/.hook-debug.log has entries from multiple hooks
  • With debug OFF: verify .hook-debug.log is not written to
  • Temporarily rename debug-trace → hooks still work (no-op fallback)

Dean Sharon added 2 commits May 27, 2026 16:18
…nges: 1. Fix stop hook field rename (response_text -> last_assistant_message) that broke working memory capture across all projects. Also removes the stop_reason filter (no longer needed -- last_assistant_message is only populated on real assistant turns). Updates all tests to use the new field name. 2. Add consistent DEVFLOW_HOOK_DEBUG=1 debug tracing to all 7 hooks via a new shared scripts/hooks/debug-trace helper. Each hook now has a safe dbg() no-op fallback before set -e so missing debug-trace never causes hook failure. Phase 1 traces to ~/.devflow/logs/.hook-debug.log (pre-CWD global fallback), Phase 2 switches to the per-project log. Adds normal file logging to sidecar-dispatch, session-start-memory, session-start-context, and pre-compact-memory. New devflow debug --enable/--disable/--status CLI command toggles DEVFLOW_HOOK_DEBUG in ~/.claude/settings.json env block. Co-Authored-By: Claude <[email protected]>
…LOW_HOOK_DEBUG!=1 to avoid unnecessary mkdir syscalls on every hook invocation - sidecar-capture: guard jq INPUT keys subshell behind debug check so it only runs when tracing is active; remove duplicate dbg log - sidecar-dispatch: remove duplicate CWD debug log (already logged with PROMPT_LENGTH on earlier line) - shell-hooks.test.ts: add debug-trace to HOOK_SCRIPTS syntax check list
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/sidecar-capture (Line 61)

The log() function here uses bare >> "$LOG_FILE" without error suppression. All four newly-added log() definitions in pre-compact-memory:44, session-start-memory:41, session-start-context:50, and sidecar-dispatch:48 use 2>/dev/null || true. Under set -e, a write failure would abort the hook, contradicting the stated contract.

Suggested fix:

log() { echo "[\2026-05-27T14:37:14Z] [sidecar-capture] \$1" >> "\$LOG_FILE" 2>/dev/null || true; }

Review finding with 82% confidence (MEDIUM, reliability + security)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/sidecar-capture (Line 8)

The debug initialization preamble is duplicated across all 7 hooks. Each includes nearly identical:

  • dbg() { :; } no-op definition
  • source debug-trace
  • devflow_debug_init
  • dbg "=== HOOK START ==="

This pattern appears in sidecar-capture:8-18, sidecar-dispatch:8-22, sidecar-evaluate:7-21, session-start-memory:7-17, session-start-context:11-22, pre-compact-memory:10-19, and preamble:7-17.

Consider extracting into a shared helper (e.g., init-hook) that each hook calls with just the hook name. This would reduce the repeated ceremony and make future hook additions simpler.


Review finding with 85% confidence (HIGH, complexity)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/debug-trace (Line 29)

Debug log directories are created with mkdir -p but without restrictive permissions. Session metadata (session IDs, CWD paths, feature states) could be readable by other users on shared systems. The log-paths helper sets chmod 700 after mkdir — debug-trace should match.

Suggested fix:

mkdir -p "\$global_log_dir" 2>/dev/null || true
chmod 700 "\$global_log_dir" 2>/dev/null || true

Also add the same fix at line 42-43 for the Phase 2 per-project log directory.


Review finding with 85% confidence (MEDIUM, security)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/sidecar-dispatch (Line 149)

Unguarded command substitution $(...) in dbg argument forks subprocesses even when debug is disabled. This occurs in 4 places across the PR:

  • sidecar-dispatch:149, 157
  • session-start-context:142
  • sidecar-evaluate:463

These should match the pattern in sidecar-capture:33, which guards with if [ "\${DEVFLOW_HOOK_DEBUG:-}" = "1" ]; then.

Fix example:

if [ "\${DEVFLOW_HOOK_DEBUG:-}" = "1" ]; then
  dbg "Learned artifacts: commands=$(echo "\$LEARNED_COMMANDS" | grep -c , 2>/dev/null || echo 0)"
fi

Review finding with 92% confidence (MEDIUM, performance)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/preamble (Line 44)

Missing dbg "=== HOOK COMPLETE ===" at the end. All six other hooks that received debug tracing end with this bookend marker. The hook's control flow ends inside the if/else block (line 44), so add it after:

if [[ "\$PROMPT" == *"## Goal"* ]] && [[ "\$PROMPT" == *"## Steps"* ]] && [[ "\$PROMPT" == *"## Files"* ]]; then
  dbg "EXECUTION_PLAN detected — injecting directive"
  json_prompt_output "EXECUTION_PLAN detected. Invoke \`devflow:implement\` via the Skill tool to execute this plan."
else
  dbg "No plan markers detected — no output"
fi

dbg "=== HOOK COMPLETE ==="

Review finding with 90% confidence (MEDIUM, consistency)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/sidecar-capture (Line 59)

Inconsistent error handling for log-paths sourcing. Two patterns exist in this PR:

Pre-existing hooks (sidecar-capture:59, sidecar-evaluate:56): hard-fail

source "\$SCRIPT_DIR/log-paths" || { echo "..."; exit 1; }

Newly-added logging (pre-compact-memory:41, session-start-memory:38, session-start-context:47, sidecar-dispatch:45): soft-fail

source "\$SCRIPT_DIR/log-paths" || true

The soft-fail || true is more appropriate for optional logging — logging failure should not kill the hook. Recommend updating sidecar-capture and sidecar-evaluate to match the new hooks for consistency across the whole hook system.


Review finding with 85% confidence (HIGH, consistency)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

scripts/hooks/sidecar-evaluate (Line 496)

File size now 496 lines (was 465 on main), approaching the 500-line critical threshold. The reinforcement section (lines 167-241) has nesting up to 7 levels with 15+ lines of inline jq and equivalent node fallback length. This is a pre-existing structural issue, but the new debug instrumentation (31 new dbg lines) pushes it closer to critical.

Consider a follow-up refactor: Extract the 4 major evaluation sections into sourced helpers:

  • sidecar-eval-learning
  • sidecar-eval-decisions
  • sidecar-eval-knowledge
  • sidecar-eval-reinforce

This reduces the main script to a dispatcher. Applies PF-004 thinking — avoids migration complexity by preserving behavior while improving file organization.


Review finding with 90% confidence (MEDIUM, complexity)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

src/cli/commands/debug.ts (Line 27)

JSON.parse(raw) can throw on malformed JSON. The catch block silently falls back to {}, conflating "file not found" with "corrupted JSON". If settings.json exists but is corrupted, the command will silently overwrite it with a minimal object, destroying existing settings.

Suggested fix — differentiate error types:

try {
  const raw = await fs.readFile(settingsPath, 'utf-8');
  settings = JSON.parse(raw) as Record<string, unknown>;
} catch (err: unknown) {
  if (err instanceof SyntaxError) {
    p.log.error('settings.json is malformed — fix it before modifying env vars');
    return;
  }
  // ENOENT or other fs error — safe to start fresh
  settings = {};
}

Review finding with 85% confidence (HIGH, TypeScript)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

src/cli/commands/debug.ts (Line 32)

Unsafe type assertion: settings.env cast to Record<string, string> | undefined without runtime validation. If settings.json contains "env": 42 or "env": "string", subsequent property access (env.DEVFLOW_HOOK_DEBUG, delete env.DEVFLOW_HOOK_DEBUG) will fail silently or produce unexpected results.

Add runtime type guard:

const rawEnv = settings.env;
const env: Record<string, string> =
  (typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
    ? (rawEnv as Record<string, string>)
    : {};

Note: Applies ADR-001 (clean break philosophy) — no backward compatibility needed, just do it right from the start.


Review finding with 82% confidence (MEDIUM, TypeScript)

@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

Review Summary

Overall: APPROVED_WITH_CONDITIONS (field rename is solid, debug system well-designed, actionable improvements needed)

High-Confidence Findings (≥80%) — 9 Comments Posted

All critical issues are addressed in individual comments above. Priority fixes:

  1. Error suppression (sidecar-capture:61 + debug-trace:29) — reliability + security
  2. Unguarded $() in dbg calls (4 locations) — performance
  3. log-paths error handling consistency (sidecar-capture:59, sidecar-evaluate:56) — consistency
  4. TypeScript validation (debug.ts:27, debug.ts:32) — type safety

Lower-Confidence Suggestions (60-79%)

These go in summary (not as inline comments):

  • Debug log unbounded growth (debug-trace) — 65% — When DEVFLOW_HOOK_DEBUG=1 is active, logs grow without rotation. Add size check in devflow_debug_init() to truncate at 5MB threshold, keeping tail (similar to production log rotation).

  • Dual logging cognitive load (all 7 hooks) — 82% — Every hook now has both dbg() (debug-only) and log() (always-on) with near-identical messages at many call sites. Consider having log() proxy to dbg() with same message, so only one call is needed.

  • Internal variable RESPONSE_TEXT (sidecar-capture:46) — 65% — Field renamed from response_text to last_assistant_message but internal bash variable is still RESPONSE_TEXT. Renaming would improve traceability, though internal names are not contractual.

  • devflow_debug_set_cwd timing (sidecar-dispatch:34) — 70% — dbg "CWD=..." fires before devflow_debug_set_cwd (line 38), so goes to global fallback log. All other hooks log CWD after setting it. Minor ordering divergence.

Test Coverage Gaps (HIGH priority for follow-up)

Per the Testing reviewer:

  • No behavioral tests for debug-trace (90% confidence) — Shell helper is sourced by all 7 hooks but has only syntax checks. Add tests for: no-op when disabled, global log when enabled, per-project log after CWD set.
  • No tests for debug.ts CLI (92% confidence) — Settings mutations need coverage. Create tests/debug.test.ts with: --enable/--disable/--status behavior, env var handling, graceful failure modes.
  • No regression test for stop_reason removal (85% confidence) — Verify that stop_reason: "tool_use" WITH non-empty last_assistant_message now proceeds to capture (old behavior would have skipped).

Field Rename Validation

✅ Complete and correct:

  • All 19 test input objects updated from { stop_reason, response_text } to { last_assistant_message }
  • Zero remaining references to old field names in codebase
  • jq and node extraction paths both correct
  • cut field index correctly adjusted (field 3 → field 2)
  • All 1571 tests pass

Design Notes

  • Applies ADR-001 (clean break): No backward-compatibility layer for old fields — correct decision
  • debug-trace architecture: Clean two-phase logging (global then per-project), graceful no-op fallback, proper initialization ordering in most hooks
  • Avoids PF-003 (migration complexity): Field rename is in hooks installed at init time; users must run devflow init to pick up changes — existing pattern

Next Steps

  1. Apply the 9 high-confidence inline fixes (likely <30 mins)
  2. Add test coverage for debug-trace and debug.ts (consider for same PR or quick follow-up)
  3. Evaluate lower-confidence suggestions (log rotation, dual-logging pattern) for follow-up PRs

Generated by devflow:git comment-pr operation

Dean Sharon and others added 4 commits May 27, 2026 17:54
- Catch SyntaxError from JSON.parse separately to prevent silently
  overwriting a malformed settings.json with {}
- Replace unsafe cast of settings.env with a runtime type guard
  that falls back to {} for non-object values

Co-Authored-By: Claude <[email protected]>
…- debug-trace: add chmod 700 after mkdir in both Phase 1 and Phase 2 to match log-paths pattern and restrict debug log directory permissions - debug-trace: add 5MB size guard to prevent unbounded log growth; truncates to tail 2.5MB using PID-unique temp file for atomicity - sidecar-capture: add 2>/dev/null || true to log() to prevent set -e abort on log write failures - sidecar-capture/sidecar-evaluate: change log-paths sourcing from hard-fail to soft-fail matching the 4 new hooks; use two-step LOG_DIR/LOG_FILE pattern with /tmp fallback for consistency and defensiveness - sidecar-evaluate: remove now-redundant mkdir -p on LOG_FILE dirname - session-start-context, sidecar-dispatch, sidecar-evaluate: guard unguarded subshell calls in dbg arguments behind DEVFLOW_HOOK_DEBUG=1 check to eliminate unnecessary subprocess forks when debug is OFF - preamble: add missing dbg HOOK COMPLETE bookend for consistency
…n regression

- tests/shell-hooks.test.ts: add 4 debug-trace behavioral tests verifying
  that dbg() is a no-op when DEVFLOW_HOOK_DEBUG is unset, writes to global
  log when enabled, and devflow_debug_set_cwd switches to per-project log
- tests/sentinel.test.ts: add stop_reason=tool_use regression test confirming
  capture proceeds when last_assistant_message is present regardless of
  stop_reason value; add log file creation test for sidecar-capture
- tests/debug.test.ts: new test file covering devflow debug --enable/--disable/
  --status/malformed settings.json — the only CLI command previously without tests

Co-Authored-By: Claude <[email protected]>
- debug-trace: harden tail+mv sequence with explicit conditional to ensure temp file cleanup on tail failure
- sidecar-dispatch: remove duplicate PROMPT_LENGTH debug log (already logged on prior line)
- debug.test.ts: remove unused import (path module)

Co-Authored-By: Claude <[email protected]>
@dean0x
Copy link
Copy Markdown
Owner Author

dean0x commented May 27, 2026

Review Summary: CHANGES_REQUESTED

Comprehensive review across 9 dimensions identified 4 blocking architectural/reliability issues and 2 consistency violations requiring fixes before merge. All test coverage, regression, and security hygiene are solid. Field rename migration is complete with zero stale references.

Blocking Issues

1. Missing 5MB Size Guard on Phase 2 Debug Log (Reliability + Security)

File: scripts/hooks/debug-trace:46-59
Confidence: 92% (Reliability), 85% (Security)

The global debug log (Phase 1) has a 5MB truncation guard, but the per-project log (Phase 2) does not. This reintroduces the unbounded growth risk on the primary logging path (Phase 2 fires on 80% of hook invocations). Every 100 prompt cycles with debug enabled will create ~100-150MB of debug logs with no cleanup.

Fix: Copy the exact 5MB truncation pattern into devflow_debug_set_cwd() after setting _DEVFLOW_DBG_LOG.


2. Feedback-Loop Guard Inconsistency (Consistency)

File: scripts/hooks/sidecar-dispatch:14-16 and sidecar-evaluate:13-15
Confidence: 88% (Consistency)

sidecar-capture moved its DEVFLOW_BG_* guards after debug-trace sourcing with dbg annotations. sidecar-dispatch and sidecar-evaluate still place guards before debug-trace with no debug tracing on exit. This breaks the stated goal: "all hooks now follow same debug-trace sourcing pattern."

Fix: Move feedback-loop guards after debug-trace sourcing and add dbg "EXIT: bg_updater" etc. to match sidecar-capture.


3. devflow_debug_set_cwd Called Before CWD Validation (Consistency)

File: scripts/hooks/sidecar-capture:48 vs 52
Confidence: 85% (Consistency)

In sidecar-capture, devflow_debug_set_cwd "$CWD" is called at line 48, but CWD validation happens at line 52. All other 6 hooks validate first. When CWD is empty, this attempts to create ~/.devflow/logs/-/ (from sed 's|^/||' on empty string), violating the established pattern.

Fix: Swap lines so CWD validation comes before devflow_debug_set_cwd.


4. debug.ts Breaks Pure-Function Pattern (Architecture)

File: src/cli/commands/debug.ts:19-81
Confidence: 88% (Architecture)

The debug.ts command inlines the entire settings.json read-parse-mutate-write cycle instead of extracting pure functions. Every other settings-mutating command (flags.ts, ambient.ts, hud.ts) uses the pattern: applyX(settingsJson: string): string. This coupling makes the env mutation untestable without filesystem mocks, and the test file duplicates this logic locally rather than importing it.

Fix: Extract applyDebugTrace(settingsJson: string): string and stripDebugTrace(settingsJson: string): string as pure functions (similar to applyFlags/stripFlags). The command action becomes a thin I/O wrapper.


Should-Fix Issues (High Confidence, Non-Blocking)

5. Performance Regression: log-paths Sourced Before Early Exit (Performance)

File: scripts/hooks/sidecar-capture:59-62
Confidence: 85%

When memory:false is set, the hook now sources log-paths and calls devflow_log_dir (4 subprocess forks) before exiting at line 98-101, whereas previously it exited before sourcing. This adds 15-20ms to every stop-hook invocation on projects with memory:false.

Fix: Use lazy-init pattern—define log() only when first needed (after the memory gate).


6. Unsafe Type Assertion on Environment Values (TypeScript)

File: src/cli/commands/debug.ts:39
Confidence: 85%

The code asserts rawEnv as Record<string, string> without verifying that all values are actually strings. If settings.json contains "env": { "DEVFLOW_HOOK_DEBUG": 1 } (number), the assertion silently succeeds and the === '1' check fails.

Fix: Filter values to only include strings:

const env: Record<string, string> =
  (typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
    ? Object.fromEntries(
        Object.entries(rawEnv as Record<string, unknown>)
          .filter((entry): entry is [string, string] => typeof entry[1] === 'string')
      )
    : {};

7. Mutating Input Object (TypeScript)

File: src/cli/commands/debug.ts:39-44
Confidence: 82%

The env variable is assigned directly from rawEnv without spreading. Mutations to env (adding DEVFLOW_HOOK_DEBUG = '1') mutate the original parsed object. The test file correctly uses spreading—production should match.

Fix: Apply spread operator: { ...(rawEnv as Record<string, unknown>) }.


Summary

Category Count
Blocking (must fix) 4
Should-Fix (recommended) 3
False Positives Cleared 6
Regression Tests Passing All

Recommendation: Fix the 4 blocking issues (5MB guard, feedback-loop guards, CWD validation order, pure functions pattern), then re-review debug.ts TypeScript surface for the type assertions.

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing 5MB Size Guard on Phase 2 Debug Log

Line 46-59: devflow_debug_set_cwd creates the per-project debug log without any size guard.

The Phase 1 global log (lines 32-40) has a 5MB truncation guard that keeps the tail 2.5MB. Phase 2 (here) switches to per-project logging but omits this guard entirely. Since most debug output occurs post-CWD (Phase 2), the per-project logs will accumulate the bulk of debug traffic and grow without bound over extended debug sessions.

Fix: Add the same 5MB guard pattern after setting _DEVFLOW_DBG_LOG at line 55:

```bash
_DEVFLOW_DBG_LOG="$project_log_dir/.hook-debug.log"

Size guard: same as Phase 1

if [ -f "$_DEVFLOW_DBG_LOG" ] && [ "$(wc -c < "$_DEVFLOW_DBG_LOG" 2>/dev/null || echo 0)" -gt 5242880 ]; then
local _TAIL_TMP="$_DEVFLOW_DBG_LOG.tmp.$$"
if tail -c 2621440 "$_DEVFLOW_DBG_LOG" > "$_TAIL_TMP" 2>/dev/null; then
mv "$_TAIL_TMP" "$_DEVFLOW_DBG_LOG" 2>/dev/null || rm -f "$_TAIL_TMP" 2>/dev/null || true
else
rm -f "$_TAIL_TMP" 2>/dev/null || true
fi
fi
```

Impact: Without this, 100 debug-enabled sessions will create 100-150MB of logs with no cleanup.

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback-Loop Guard Inconsistency

sidecar-dispatch lines 14-16: Feedback-loop guards (DEVFLOW_BG_* checks) are placed before debug-trace sourcing with no debug tracing on exit.

Compare to sidecar-capture (lines 21-23), which moved these guards to after debug-trace sourcing and added dbg exit annotations:
```bash
if [ "${DEVFLOW_BG_UPDATER:-}" = "1" ]; then dbg "EXIT: bg_updater"; exit 0; fi
```

This inconsistency breaks the PR's stated goal: "all hooks now follow same debug-trace sourcing pattern."

Fix: Move lines 14-16 to after line 12 (after devflow_debug_init and dbg HOOK START), and add dbg annotations to each guard:
```bash
devflow_debug_init "sidecar-dispatch"
dbg "=== HOOK START ==="

if [ "${DEVFLOW_BG_UPDATER:-}" = "1" ]; then dbg "EXIT: bg_updater"; exit 0; fi
if [ "${DEVFLOW_BG_LEARNER:-}" = "1" ]; then dbg "EXIT: bg_learner"; exit 0; fi
if [ "${DEVFLOW_BG_KNOWLEDGE_REFRESH:-}" = "1" ]; then dbg "EXIT: bg_knowledge"; exit 0; fi
```

Same fix needed in sidecar-evaluate lines 13-15.

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CWD Validation Ordering

sidecar-capture line 48 vs 52: devflow_debug_set_cwd "\$CWD" is called at line 48, but CWD validation happens at line 52 (if [ -z "\$CWD" ] || [ ! -d "\$CWD" ]).

All other 6 hooks validate CWD first, then call devflow_debug_set_cwd. When CWD is empty, the current order attempts to create ~/.devflow/logs/-/ (from sed 's|^/||' on an empty string), which violates the established pattern and wastes a syscall.

Fix: Swap the order. Validation should occur before devflow_debug_set_cwd:
```bash
CWD=$(printf '%s' "$_FIELDS" | cut -d$'\001' -f1)
RESPONSE_TEXT=$(printf '%s' "$_FIELDS" | cut -d$'\001' -f2-)

if [ -z "$CWD" ] || [ ! -d "$CWD" ]; then dbg "EXIT: bad CWD"; exit 0; fi

devflow_debug_set_cwd "$CWD"
```

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architectural Violation: Break Pure-Function Pattern

src/cli/commands/debug.ts lines 19-81: The command inlines the entire settings.json read-parse-mutate-write cycle instead of extracting pure functions.

Every other settings-mutating command in devflow (flags.ts, ambient.ts, hud.ts) uses this pattern:

  • `applyX(settingsJson: string): string` — accepts raw JSON string, returns transformed string
  • `stripX(settingsJson: string): string` — reverse transform
  • Command action handler calls these pure functions and handles file I/O separately

Current problem:

  1. `debug.ts` couples file I/O, JSON parsing, env mutation, and writing into a single monolithic action handler
  2. The logic is untestable without filesystem mocks
  3. The test file re-implements this logic locally (`applyEnable`, `applyDisable`, `readDebugState`) rather than importing it — tests validate their copy, not production code
  4. The logic cannot be reused by `devflow init` or other orchestrators

Fix: Extract pure functions to a shared utility (e.g., `src/cli/utils/debug-settings.ts`):

```typescript
export function applyDebugTrace(settingsJson: string): string {
const settings = JSON.parse(settingsJson) as Record<string, unknown>;
const rawEnv = settings.env;
const env: Record<string, string> =
(typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
? { ...(rawEnv as Record<string, unknown>) }
: {};
env.DEVFLOW_HOOK_DEBUG = '1';
settings.env = env;
return JSON.stringify(settings, null, 2) + '\n';
}

export function stripDebugTrace(settingsJson: string): string {
const settings = JSON.parse(settingsJson) as Record<string, unknown>;
const rawEnv = settings.env;
const env: Record<string, string> =
(typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
? { ...(rawEnv as Record<string, unknown>) }
: {};
delete env.DEVFLOW_HOOK_DEBUG;
if (Object.keys(env).length === 0) delete settings.env;
else settings.env = env;
return JSON.stringify(settings, null, 2) + '\n';
}
```

Then the command action becomes:
```typescript
const raw = await fs.readFile(settingsPath, 'utf-8');
const updated = cmd === 'enable' ? applyDebugTrace(raw) : stripDebugTrace(raw);
await fs.writeFile(settingsPath, updated, 'utf-8');
```

And tests import and call the pure functions directly.

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unsafe Type Assertion on Environment Values

src/cli/commands/debug.ts line 39: The code asserts `rawEnv as Record<string, string>` without validating that all values are actually strings.

Problem: If `settings.json` contains:
```json
{ "env": { "DEVFLOW_HOOK_DEBUG": 1 } }
```
(number instead of string), the `as` assertion silently succeeds and the `=== '1'` comparison at line 64 will never match, silently failing to detect the enabled state.

The `env` object in `settings.json` is a user-editable trust boundary and requires value-level validation.

Fix: Filter values to ensure only strings are included:

```typescript
const rawEnv = settings.env;
const env: Record<string, string> =
(typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
? Object.fromEntries(
Object.entries(rawEnv as Record<string, unknown>)
.filter((entry): entry is [string, string] => typeof entry[1] === 'string')
)
: {};
```

This ensures non-string values (numbers, objects, etc.) are silently dropped and `env` contains only valid string key-value pairs.

Copy link
Copy Markdown
Owner Author

@dean0x dean0x left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mutating Input Object

src/cli/commands/debug.ts lines 39-44: The `env` variable is assigned directly from `rawEnv` without spreading, causing mutations to mutate the original parsed object.

Current code:
```typescript
const env = (typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
? (rawEnv as Record<string, string>) // Direct reference, no copy
: {};
env.DEVFLOW_HOOK_DEBUG = '1'; // Mutates the original settings.env object
```

Test file does it correctly (line 37):
```typescript
? { ...(rawEnv as Record<string, string>) } // Spread creates a copy
```

While functionally correct for the current write-back flow, this violates immutability by default principle. Production code should match the test pattern.

Fix: Apply spread operator:
```typescript
const env: Record<string, string> =
(typeof rawEnv === 'object' && rawEnv !== null && !Array.isArray(rawEnv))
? { ...(rawEnv as Record<string, unknown>) }
: {};
```

Dean Sharon and others added 5 commits May 27, 2026 22:32
…tch/evaluate

Three consistency fixes across sidecar-dispatch and sidecar-evaluate (batch-4):

1. Consistent guard syntax: convert sidecar-evaluate feedback-loop guards from
   short-circuit form to if/then/fi matching sidecar-dispatch. Add explanatory
   comment to both hooks: "Feedback-loop guards — before debug-trace to minimize
   background session overhead". Guards intentionally stay pre-debug-trace so
   background agent sessions exit without sourcing debug-trace overhead.

2. Missing dbg annotations: add dbg "EXIT: no json" and dbg "EXIT: bad CWD"
   to the two post-debug-trace early-exit points in sidecar-dispatch, matching
   sidecar-capture style.

3. CWD validation: strengthen sidecar-evaluate CWD check from [ -z "$CWD" ]
   to [ -z "$CWD" ] || [ ! -d "$CWD" ] matching the 3 other hooks that check
   both conditions. sidecar-dispatch already had the dual check.

Co-Authored-By: Claude <[email protected]>
… - Extract applyDebugTrace/stripDebugTrace/readDebugStatus as pure string->string functions following the applyFlags/stripFlags pattern from flags.ts - Fix unsafe rawEnv cast: filter entries with type guard (Object.fromEntries + type predicate) instead of bare as Record<string, string> - Fix mutation bug: spread rawEnv into a fresh object before modifying it - matches what the tests already did correctly - Reorder CLI option processing to status->enable->disable, consistent with learn.ts and decisions.ts - Rewrite tests to import and call the real pure functions directly - removes the three duplicate helpers (applyEnable/applyDisable/readDebugState) that mirrored production logic - Add missing malformed-JSON test for the disable path (parallel to enable) applies ADR-007 Co-Authored-By: Claude <[email protected]>
…eliability fixes to debug-trace: 1. Extract size guard logic into _devflow_dbg_size_guard() helper so both devflow_debug_init (Phase 1 global log) and devflow_debug_set_cwd (Phase 2 per-project log) share the same 5MB-cap/2.5MB-tail truncation logic. Fixes the missing guard on the per-project log path. 2. Replace wc -c subprocess with stat -f%z (macOS/BSD) / stat -c%s (GNU/Linux) in the size guard, with wc -c as last-resort fallback. Avoids an unnecessary subprocess fork on every debug-enabled hook invocation. Adds regression test: devflow_debug_set_cwd truncates per-project log when it exceeds 5MB, confirming the new guard fires and the new message is retained. Co-Authored-By: Claude <[email protected]>
…s corrected: 1. Move devflow_debug_set_cwd after CWD validation -- was called before the empty/non-directory CWD guard, inconsistent with all 6 other hooks (sidecar-dispatch pattern). Calling with an invalid CWD would silently set a bad Phase 2 log path. 2. Move log-paths sourcing after MEMORY_ENABLED gate -- was sourced before the memory:false early-exit, paying subprocess overhead on every stop hook invocation regardless of memory state. log() is only called in the memory-specific section, so deferring to after the gate is safe (no log() calls in the decisions scanner path). Co-Authored-By: Claude <[email protected]>
- Change || echo "/tmp" to || echo "${TMPDIR:-/tmp}" in all 6 hooks so the
  fallback prefers the user-private temp directory on macOS (TMPDIR=/var/folders/...)
  instead of world-readable /tmp with predictable filenames.

- Add a once-per-invocation log size guard after LOG_FILE is set in each hook:
  truncates to tail 1MB when the log exceeds 2MB, using a PID-unique temp file
  for atomicity (same pattern as debug-trace 5MB guard). Guard runs once at
  hook init time, not on every log() call, so overhead is negligible.

Hooks updated: pre-compact-memory, session-start-context, session-start-memory,
sidecar-evaluate. (sidecar-capture and sidecar-dispatch were already fixed in
a prior commit on this branch.)

Co-Authored-By: Claude <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant