rename: post_mlp -> post_block hook point (mechanical, no behavior change) #175

Draft
maxsloef-goodfire wants to merge 1 commit into RhizoNymph:feat/integration from maxsloef-goodfire:post-block-rename


@maxsloef-goodfire

Mechanical token rename: post_mlp → post_block across code, tests, docs, and examples (126 files). No behavior change: the same tensors are captured and steered as before.

Why: the hook fires after the mlp() call in program order, but in deferred-residual architectures the residual it sees does not yet include the MLP contribution, because the add happens inside the next layer's fused add+norm. At this hook point, post_mlp is therefore byte-identical to post_attn, and the name misdescribes the dataflow. post_block names the position in the layer rather than making a dataflow claim.
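For reviewers less familiar with the deferred-residual pattern, here is a minimal toy sketch of the dataflow described above. It uses plain floats in place of tensors. Every name in it (`fused_add_norm`, `layer`, `captures`, `block_input`, `true_block_output`) is hypothetical and not taken from this repo. It also assumes that post_attn is read right after the post-attention fused add+norm.

```python
# Toy deferred-residual decoder layer. Floats stand in for tensors.
# All names are illustrative assumptions, not this repo's API.

def fused_add_norm(hidden, residual):
    """Fold the pending branch output into the residual, then 'normalize'."""
    residual = hidden if residual is None else residual + hidden
    normed = residual * 0.5  # stand-in for the real norm; its form is irrelevant here
    return normed, residual


def layer(hidden, residual, captures, idx):
    # Input norm: the PREVIOUS layer's MLP output is added to the residual here.
    hidden, residual = fused_add_norm(hidden, residual)
    captures[(idx, "block_input")] = residual
    hidden = hidden + 1.0  # stand-in for attention
    # Post-attention norm: the attention output is added to the residual here.
    hidden, residual = fused_add_norm(hidden, residual)
    captures[(idx, "post_attn")] = residual  # assumed location of post_attn
    hidden = hidden * 3.0  # stand-in for mlp()
    # The renamed hook fires here: after mlp(), but the residual is unchanged.
    captures[(idx, "post_block")] = residual
    # What a "true" block output would be (residual + MLP output).
    captures[(idx, "true_block_output")] = residual + hidden
    return hidden, residual


captures = {}
hidden, residual = 2.0, None
for idx in range(2):
    hidden, residual = layer(hidden, residual, captures, idx)

# The residual seen after mlp() equals the one seen after attention.
print(captures[(0, "post_block")] == captures[(0, "post_attn")])
# The MLP contribution only lands in the residual at the next layer's input norm.
print(captures[(1, "block_input")] == captures[(0, "true_block_output")])
```

In this toy model both comparisons hold. The value read after mlp() matches the post-attention residual, and the full block output only appears in the residual once the next layer's input norm has run.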

This is PR 1 of a stack. The semantic correction (actually capturing the true block output, residual + hidden_states, plus capture-path fixes and features) is in #174, which is rebased on top of this branch. Reviewing this one only requires confirming that the rename is mechanical; the substantive diff in #174 is then only ~60 files.

API note: capture specs sending "post_mlp" must switch to "post_block". This is a pre-release branch, so no alias is kept; happy to add a deprecation alias if you'd prefer.
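If a deprecation alias is wanted, it could look roughly like the sketch below. This is only an illustration. `normalize_hook_name` and `_DEPRECATED_HOOK_ALIASES` are hypothetical names, and the sketch assumes hook names arrive as plain strings that can be normalized before capture specs are validated.

```python
import warnings

# Hypothetical alias table: deprecated hook name -> replacement.
_DEPRECATED_HOOK_ALIASES = {"post_mlp": "post_block"}


def normalize_hook_name(name: str) -> str:
    """Map a deprecated hook name to its replacement, warning the caller."""
    new_name = _DEPRECATED_HOOK_ALIASES.get(name)
    if new_name is None:
        return name  # already a current name (or unknown; validated elsewhere)
    warnings.warn(
        f"hook point {name!r} is deprecated; use {new_name!r}",
        DeprecationWarning,
        stacklevel=2,
    )
    return new_name
```

With something like this in place, a client still sending "post_mlp" would keep working and see a `DeprecationWarning`, while "post_block" passes through untouched.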

🤖 Generated with Claude Code

…ange)

Token-level rename across code, tests, docs, and examples. The hook fires
after the mlp() call in program order, but in deferred-residual
architectures the captured/steered `residual` does not yet include the
MLP contribution (the add happens in the NEXT layer's fused add+norm) --
so 'post_mlp' misdescribes the dataflow. 'post_block' names the position
in the layer, not a dataflow claim.

No functional change: same tensors captured/steered as before. The
semantic correction (capturing the true block output, residual + hidden)
is stacked on top of this PR. Clients sending 'post_mlp' in capture specs
must switch to 'post_block' (pre-release branch; no alias kept).

Co-Authored-By: Claude Fable 5 <[email protected]>