Add segment reader pinning (refcount + RAII item guards)#25
Open
brayniac wants to merge 4 commits into
Open
Conversation
Add a ref_count field (fits in existing padding; header stays 64 bytes) with a two-phase reader acquisition protocol ported from crucible's SliceSegment: check the state is readable, increment, then re-check and back out if a writer transitioned the segment in between. Nothing acquires pins yet; eviction gating follows. The original implementation waited on a segment refcount before clearing (see the 'skips over seg_wait_refcount' note in Segment::clear); this restores that machinery in crucible's form. Two loom models cover the protocol: readers racing a writer that mirrors the production eviction gate, and acquisition failing in every interleaving once a segment is draining. Co-Authored-By: Claude Fable 5 <[email protected]>
Add SegmentGuard, an RAII pin on a segment's reader count, and a Segments::acquire_item_at that performs the two-phase acquisition before handing out a RawItem. The four read paths that return or mutate items (get, get_no_freq_incr, wrapping_add, saturating_sub) now construct Items that carry the guard, so the segment backing an Item is pinned for exactly as long as the Item is alive. No behavior change yet: nothing consults the reader count when selecting segments for eviction or recycling. That gating follows. Co-Authored-By: Claude Fable 5 <[email protected]>
can_evict() now requires a zero reader count, which gates every eviction selection path: Random, the Fifo/Cte/Util ranking and its least_valuable_seg re-check, merge candidate chains (a pinned destination zeroes the chain length; a pinned source stops the pass), S3-FIFO selection, and the empty-segment fast-free in remove_at. clear_segment() additionally fails cleanly on a pinned segment, covering RandomFifo's head selection, and push_free() debug-asserts the invariant. A pinned segment is simply not chosen this pass and is reclaimed by a later pass once its readers drop — the deferred-release handoff (AwaitingRelease) arrives with the full state machine port. The churn test pins one segment and pushes ~10x the heap through a Fifo cache: the held value must stay byte-stable and the key's CAS token (location + generation) unchanged. Verified the test fails without the can_evict() gate. Co-Authored-By: Claude Fable 5 <[email protected]>
TtlBucket::expire and clear were head-pop loops that always freed the segment they drained — incompatible with reader pins. Both are now cursor walks: every visited segment is drained from the hashtable (Segment::clear is idempotent on revisit), but only unpinned segments are freed. A drained-but-pinned segment stays linked in the chain and is reclaimed by a later expire/clear pass once its readers drop — this approximates crucible's AwaitingRelease handoff until the full state machine is ported. Return values now count only segments actually freed. TtlBucket::reserve previously spun forever on an inaccessible tail (impossible before; possible now that a drained-pinned segment can remain as tail) — an inaccessible tail now falls through to expansion, linking a fresh segment after it. Verified all three behavioral tests fail when ref_count is neutralized to zero (gates blind), and pass with it live. Co-Authored-By: Claude Fable 5 <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
First step of making segcache concurrent-safe (and a latent bug fix today): an
Itemholds a raw pointer into segment memory with no lifetime tie, so a held Item could be silently invalidated by eviction — in-placecompact()memmoves,copy_intorelocation, or segment recycling — triggered by subsequent cache calls. This ports crucible's reader-pinning machinery: a per-segmentref_countwith a two-phase acquire protocol, and an RAIISegmentGuardcarried by everyItem, so a segment cannot be touched while readers hold items into it.Archaeology:
Segment::clearstill carries the comment "skips over seg_wait_refcount … because no threading" — the original implementation had exactly this wait, dropped in the single-threaded port. This restores it in crucible's form.Design
SegmentHeadergainsref_count: AtomicU32(fits in padding, still 64 bytes) with crucible's two-phase acquisition: check state → increment → re-check → back out on failure. Items acquired via the four read paths (get,get_no_freq_incr,wrapping_add,saturating_sub) carry the guard.can_evict()requires zero readers, which gates every eviction selection path (Random/Fifo/Cte/Util ranking, merge chains, S3-FIFO);clear_segmentfails cleanly on a pinned segment (covers RandomFifo);push_freedebug-asserts the invariant.TtlBucket::{expire, clear}become cursor walks: every visited segment is drained from the hashtable, but pinned segments stay linked and are reclaimed by a later pass once readers drop — approximating crucible's AwaitingRelease handoff until the full segment state machine is ported (next step in the concurrency sequence).reserveno longer spins on an inaccessible tail (newly reachable state); it expands past it.segment_pinned_skipmetric counts deferred reclamations.Eviction-side internals (which hold
&mut Segments) intentionally keep using unpinned access; the writer-side CAS state transitions arrive with the state-machine port, as noted in the loom model comments.Testing
clear()stay readable while the hashtable drains, and a laterclear()reclaims; numeric-op items pin like any other.ref_count()is neutralized to always return 0 (gates blind) and with thecan_evict()gate removed.cargo test -p segcache(53 tests),--features debug, loom suite (10 models), workspace incl. doctests,clippy --all-targets --all-features -- -D warnings,fmt --checkall green.Sequencing
Built on #24's generation tokens (rebased onto main post-merge). Next steps in the concurrency sequence: segment state machine (crucible's 9 states, replacing plain state stores with CAS transitions and adding the AwaitingRelease deferred-release handoff), then the lock-free free queue and concurrent reserve path.
🤖 Generated with Claude Code