TL;DR
Reth v2.0.0 with `--full --storage.v2 --prune.sender-recovery.distance 2000000` deterministically panics in `check_consistency` for the `TransactionSenders` segment after a fresh snapshot import. PR #21918 (merged 2026-02-06) fixed this for `--prune.sender-recovery.full` but did not cover the `Distance` variant. PR #21636 was authored specifically for this case by @gakonst, closed as stale 2026-02-25 without merging, and implements the exact fix. I extracted its 29-line logic change, adapted it to v2.0.0's current `ensure_invariants` function, rebuilt reth locally, and confirmed the node starts cleanly and runs the full sync pipeline (headers → bodies → SenderRecovery → execution → merkle → ... → finish) without any panic. The fix works. Please either reopen #21636 or cherry-pick its logic into main.
What happens
Reth v2.0.0 with `--full --storage.v2 --prune.sender-recovery.distance 2000000` panics deterministically on the `TransactionSenders` consistency check approximately 25 minutes after startup. The panic occurs every time, including after a full wipe-and-redownload from a fresh snapshot. The node cannot stay running.
The panic site is `crates/node/builder/src/launch/common.rs:533:13`:

```
thread 'main' panicked at crates/node/builder/src/launch/common.rs:533:13:
assertion `left != right` failed: A static file inconsistency was found that would trigger an unwind to block 0
left: 0
right: 0
```
The same assertion fires at `crates/cli/commands/src/common.rs:214:13` for any `reth db` or `reth stage` subcommand that touches the database — there is no escape hatch short of `reth db stats --skip-consistency-checks`.
Reproducer
- Perform a fresh snapshot import on a clean datadir:

  ```
  reth download --chain mainnet --full --storage.v2 -y
  ```

- Start reth with:

  ```
  reth node \
    --datadir /path/to/datadir \
    --storage.v2 \
    --full \
    --prune.block-interval 100 \
    --prune.sender-recovery.distance 2000000 \
    --prune.transaction-lookup.distance 2000000 \
    --prune.receipts.distance 2000000 \
    --prune.account-history.distance 2000000 \
    --prune.storage-history.distance 2000000 \
    --prune.bodies.distance 2000000
  ```
- Let a connected consensus client (Lighthouse, in this case) push forkchoice updates.
- After approximately 25 minutes, reth emits a `Failed to load static file jar` warning for the `transaction-senders` segment covering the current tip, then panics.
- On the next startup (or on any `reth db`/`reth stage` invocation), the consistency check finds the 0-byte placeholder files Reth wrote during the previous startup, computes `unwind_target=0`, and panics again. The node is now in a permanent crash loop that survives across full wipes.
This has been reproduced twice on identical hardware from two independent snapshot downloads — once on 2026-04-09 after upgrading to v2.0.0, and again on 2026-04-10 after a full wipe-and-redownload recovery attempt.
Consistency check output at panic time
In read-only mode (e.g., `reth db stats`), the check logs rather than panics. The output is:

```
INFO check_consistency{read_only=true}: Verifying storage consistency.
INFO check_consistency{read_only=true}: Checking consistency for segment{segment=TransactionSenders}:
INFO ensure_invariants{
    highest_static_file_entry=None
    highest_static_file_block=None
    table="TransactionSenders"
}: Setting unwind target. checkpoint_block_number=24850381 unwind_target=0
WARN Inconsistent storage. Restart node to heal. unwind_target=Unwind(0)
```
In write mode (`reth node`, `reth stage unwind`, `reth db stage-checkpoints set`), the same code path escalates to the assertion panic shown above.
On-disk state at crash
Static files directory listing (`ls -la /path/to/datadir/static_files/static_file_transaction-senders*`):

```
-rw-r--r-- 1 _unknown _unknown  0 static_file_transaction-senders_0_49999
-rw-r--r-- 1 _unknown _unknown 55 static_file_transaction-senders_0_49999.conf
-rw-r--r-- 1 _unknown _unknown  9 static_file_transaction-senders_0_49999.off
-rw-r--r-- 1 _unknown _unknown  0 static_file_transaction-senders_24850000_24899999
-rw-r--r-- 1 _unknown _unknown 55 static_file_transaction-senders_24850000_24899999.conf
-rw-r--r-- 1 _unknown _unknown  9 static_file_transaction-senders_24850000_24899999.off
```

Both data files are 0 bytes. The `.conf` and `.off` sidecars have the standard initialization sizes but no real content.
Notably, the segment covering all blocks between `50000` and `24799999` — where corresponding `Transactions` segment files do exist — is entirely absent. There are no tx-senders files for the full synced range.
The segment at the tip advances between crashes. Crash 1 produced `24800000_24849999`; after wipe-and-redownload, crash 2 produced `24850000_24899999`. This is deterministic with the snapshot tip, not random corruption.
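The boundary placement is consistent with simple fixed-width segment arithmetic. A minimal sketch, assuming the 50,000-block segment width implied by the filenames above (the helper name and constant are mine for illustration, not reth's API):

```rust
/// Hypothetical helper mirroring the 50k-block windows seen in the
/// filenames above (e.g. `..._24850000_24899999`). Not reth's actual code.
const SEGMENT_WIDTH: u64 = 50_000;

/// Returns the (start, end) block range of the segment containing `block`.
fn segment_range(block: u64) -> (u64, u64) {
    let start = (block / SEGMENT_WIDTH) * SEGMENT_WIDTH;
    (start, start + SEGMENT_WIDTH - 1)
}

fn main() {
    // Crash 1 tip (inside 24800000_24849999) and crash 2 tip (24850381,
    // inside 24850000_24899999) fall in adjacent segments, matching the
    // on-disk filenames — deterministic, not random.
    assert_eq!(segment_range(24_810_123), (24_800_000, 24_849_999));
    assert_eq!(segment_range(24_850_381), (24_850_000, 24_899_999));
}
```

This is why the tip-segment filename shifts by exactly one 50k window between the two crashes: the snapshot tip moved across a segment boundary between downloads.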
MDBX state
From `reth db --datadir /path/to/datadir stats --skip-consistency-checks`:
- Static file catalog (MDBX): 5 segment types — `Headers`, `Transactions`, `Receipts`, `AccountChangeSets`, `StorageChangeSets`. `TransactionSenders` is not in the catalog at all; the MDBX metadata does not believe any tx-senders static files exist.
- `TransactionSenders` MDBX table: 0 entries, 0 bytes — fully distance-pruned from MDBX, as configured.
- `SenderRecovery` stage checkpoint: `StageCheckpoint { block_number: 24850381, stage_checkpoint: None }`
- `PruneSenderRecovery` checkpoint: `mode=Distance(2000000)`, `block=24850381`
- RocksDB `TransactionHashNumbers`: 509M entries, ~23 GiB — transaction hashes are NOT pruned (only senders)
The inconsistency in summary: MDBX says SenderRecovery completed to block 24,850,381, the catalog says no tx-senders static files exist, and the filesystem says two 0-byte files exist. These three facts are irreconcilable to `check_consistency`.
The trigger sequence
Based on log analysis:
1. Reth starts from the snapshot at block 24,850,381. Lighthouse begins pushing forkchoice updates. Reth does not advance any blocks but does run background pipeline work (e.g., `TransactionLookup` rebuilds ~1.4M entries during the 25-minute window).
2. ~25 minutes in, Reth attempts to read the `transaction-senders` static file for the segment covering the current tip:

   ```
   WARN Failed to load static file jar ... Os { code: 2, kind: NotFound, message: "No such file or directory" },
   path: ".../static_files/static_file_transaction-senders_24800000_24849999.conf"
   ```

3. The engine-tree error chain propagates up:

   ```
   ERROR engine::persistence: Persistence service failed...
   ERROR engine::tree: Fatal error in consensus engine...
   INFO reth::cli: Fatal error in consensus engine...shutting down
   ```

4. During shutdown, Reth's "three-way healing" creates the 0-byte placeholder files at the expected segment boundaries (`0_49999` and the tip segment).
5. On next startup, `check_consistency` walks the static files directory, finds the 0-byte placeholders, sees `highest_static_file_block=None` because the files have no rows, compares to `checkpoint_block_number=24850381`, computes `unwind_target=0`, and panics.
6. All subsequent startups reproduce step 5 because Reth also recreates the `0_49999` placeholder as a startup artifact.
Root cause analysis
The chain of causation points to a gap in `segments_to_check()` / `should_check_segment()` in the static file manager (approximately `crates/storage/provider/src/providers/static_file/manager.rs`):
- `check_consistency` iterates `segments_to_check()`.
- `should_check_segment()` skips `TransactionSenders` only when `is_segment_fully_pruned(SenderRecovery)` returns `true`.
- `is_segment_fully_pruned` appears to return `true` only when the prune mode is `Full` — i.e., when `--prune.sender-recovery.full` is set.
- With `--prune.sender-recovery.distance 2000000`, the stored prune mode is `Distance(2000000)`, which does not satisfy the `Full` check.
- So `check_consistency` runs for `TransactionSenders`, finds no static files, and panics — even though this config never wrote any tx-senders to static files in the first place.

The deeper issue is that the snapshot import procedure sets the `SenderRecovery` stage checkpoint to the snapshot tip but does not write any `TransactionSenders` static files. This is the correct behavior for a `--full` + distance config (distance pruning keeps senders in MDBX for the recent window, not in static files). But `check_consistency` does not know this: it sees a non-zero stage checkpoint and a missing static file catalog entry, and concludes the data was lost.
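Reduced to its essentials, the failing invariant is a single comparison. A minimal sketch of the logic as I understand it (names and shapes are illustrative, not reth's exact code):

```rust
/// Illustrative reduction of the consistency check described above.
/// `highest_static_file_block` is None when the segment has no rows on disk.
fn unwind_target(
    checkpoint_block_number: u64,
    highest_static_file_block: Option<u64>,
) -> Option<u64> {
    let highest = highest_static_file_block.unwrap_or(0);
    // With Distance-pruned senders no static files exist at all, so `highest`
    // is 0 while the stage checkpoint is the snapshot tip:
    // 24_850_381 > 0 => unwind target 0, which write mode escalates to a panic.
    (checkpoint_block_number > highest).then_some(highest)
}

fn main() {
    // The observed failure: checkpoint at the snapshot tip, no static files.
    assert_eq!(unwind_target(24_850_381, None), Some(0));
    // Healthy state: static files cover the checkpoint, no unwind.
    assert_eq!(unwind_target(24_850_381, Some(24_850_381)), None);
}
```

The check has no third input telling it that the data was *pruned by configuration* rather than lost, which is exactly the input PR #21636 adds.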
What I tried that does not work
| Approach | Result |
| --- | --- |
| Quarantine the 0-byte placeholder files | `reth node` recreates them on startup and hits the same panic |
| `reth stage unwind --datadir /path to-block <N>` | `reth stage` runs `check_consistency` at startup and panics before doing any unwind |
| `reth db --datadir /path stage-checkpoints set --stage sender-recovery 0` | Same: the write-mode consistency check runs before accepting any arguments, then panics |
| `reth node --prune.sender-recovery.full` on the existing datadir | Saves the new prune config to `reth.toml` but does NOT update the `PruneSenderRecovery` entry in MDBX, so `check_consistency` still reads `Distance(2000000)` and panics |
| Full wipe + fresh snapshot (`reth download --full --storage.v2 -y`) | Produces a clean state that runs for ~25 minutes and then crashes with the same panic at the next 50k-block segment boundary. Reproduced twice. |
The only command that can inspect the broken datadir without panicking is `reth db --datadir /path stats --skip-consistency-checks`. There is no equivalent escape hatch for `reth node`, `reth stage`, or `reth db stage-checkpoints set`.
Related issues
- Issue #21914 "`--minimal` with senders in static files cannot handle completely missing senders" — identical panic signature (`assertion left != right failed: A static file inconsistency... unwind_target=0` on `TransactionSenders`). Reported by a maintainer, closed via PR #21918 "fix: skip sender recovery stage when senders fully pruned" (merged 2026-02-06). That fix moves sender recovery inline when senders are fully pruned (`--prune.sender-recovery.full`/`--minimal`). It does not cover `--full` + `--prune.sender-recovery.distance`.

The difference between what PR #21918 fixed and what this issue describes is the prune mode: `Full` vs `Distance`. The fix in #21918 added a skip path via `is_segment_fully_pruned`, but distance-pruned senders are not "fully pruned" by that definition even though they are also never written to static files.
The already-existing fix: PR #21636
PR #21636 "fix(storage): respect prune checkpoint in static file consistency check" was authored by @gakonst (branch `joshie/fix-static-file-prune-unwind`) specifically for this bug. The PR body describes it word-for-word:

> Root cause: When --full prunes segments like TransactionSenders, the static files are deleted but the stage checkpoint remains high. On next startup, the consistency check in ensure_invariants sees checkpoint_block_number > highest_static_file_block (e.g., 22M > 0) and assumes data corruption, triggering an unwanted unwind.

The PR's approach: add `+ PruneCheckpointReader` to the `ensure_invariants` trait bound, and change the naive `if checkpoint_block_number > highest_static_file_block` comparison to use `effective_available_block = max(highest_static_file_block, prune_checkpoint_block)`. For distance-pruned segments, the prune checkpoint's `block_number` is the tip, so `effective_available_block` equals the stage checkpoint and no unwind is triggered. The PR includes a green unit test (`test_consistency_respects_prune_checkpoint`) that seeds the exact failure mode.
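The effect of that change, reduced to a self-contained illustration (simplified shapes, not reth's exact signatures):

```rust
/// Illustrative reduction of PR #21636's comparison: the prune checkpoint
/// marks data that is legitimately absent from static files, so it raises
/// the "data available up to" threshold instead of forcing an unwind.
fn unwind_target(
    checkpoint_block_number: u64,
    highest_static_file_block: u64,
    prune_checkpoint_block: Option<u64>,
) -> Option<u64> {
    let effective_available_block =
        highest_static_file_block.max(prune_checkpoint_block.unwrap_or(0));
    (checkpoint_block_number > effective_available_block)
        .then_some(effective_available_block)
}

fn main() {
    // Distance-pruned senders: prune checkpoint equals the stage checkpoint,
    // so no unwind despite zero static files.
    assert_eq!(unwind_target(24_850_381, 0, Some(24_850_381)), None);
    // Genuine data loss (no prune checkpoint) still triggers the unwind.
    assert_eq!(unwind_target(24_850_381, 0, None), Some(0));
}
```

The second assertion is why this approach stays safe for non-pruned segments: absent a prune checkpoint, behavior is unchanged.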
PR #21636 was closed 2026-02-25 by @emmajam with the comment "Hey! We're doing some spring cleaning on our PR backlog 🧹 Closing old PRs to keep things tidy. If this is still relevant, please feel free to re-open" — it was never merged. It is still relevant: v2.0.0 shipped 2026-04-08 with the bug still present.
I confirmed PR #21636 fixes this bug
I extracted PR #21636's logic change (the full diff does not apply cleanly to v2.0.0 because v2.0.0 already landed PR #21918's prerequisite imports), hand-adapted it to v2.0.0's current `ensure_invariants` in `crates/storage/provider/src/providers/static_file/manager.rs`, rebuilt reth from source with default release features, and booted my previously-broken node. Results:
- Startup `check_consistency` passes cleanly. No `unwind_target=Unwind(0)` warning, no panic.
- A full pipeline cycle completes. Stages 1–14 advanced 24850381 → 24851130 (the snapshot tip + new blocks). Specifically, the `SenderRecovery` stage (stage 3/14) ran in 17 seconds and wrote 275,698 sender rows to `static_file_transaction-senders` static files (498 writer opens, 497 commits, verified via `reth_static_files_jar_provider_calls_total{segment="transaction-senders",operation="append"}`). This is the exact code path that previously panicked the node at runtime.
- A second pipeline cycle immediately started. The Headers stage advanced to block 24858908 (7778 more blocks). `eth_blockNumber` advanced from `0x17b2fcd` (24850381) to `0x17b32ba` (24851130) and continues climbing.
- Zero panics in `reth.daemon.err.log` after 30+ minutes of continuous operation across two full pipeline cycles.
The adapted diff I applied (the minimum change for v2.0.0) is:

```diff
 fn ensure_invariants_for<Provider>(
     ...
 where
-    Provider: DBProvider + BlockReader + StageCheckpointReader,
+    Provider: DBProvider + BlockReader + StageCheckpointReader + PruneCheckpointReader,
     N: NodePrimitives<Receipt: Value, BlockHeader: Value, SignedTx: Value>,

 fn ensure_invariants<Provider, T: Table<Key = u64>>(
     ...
 where
-    Provider: DBProvider + BlockReader + StageCheckpointReader,
+    Provider: DBProvider + BlockReader + StageCheckpointReader + PruneCheckpointReader,

 // inside ensure_invariants, replacing the naive comparison:
+    let prune_segment = match segment {
+        StaticFileSegment::TransactionSenders => Some(PruneSegment::SenderRecovery),
+        StaticFileSegment::Receipts => Some(PruneSegment::Receipts),
+        _ => None,
+    };
+    let effective_available_block = if let Some(ps) = prune_segment {
+        let prune_checkpoint_block =
+            provider.get_prune_checkpoint(ps)?.and_then(|c| c.block_number).unwrap_or(0);
+        std::cmp::max(highest_static_file_block, prune_checkpoint_block)
+    } else {
+        highest_static_file_block
+    };
-    if checkpoint_block_number > highest_static_file_block {
+    if checkpoint_block_number > effective_available_block {
         info!(
             target: "reth::providers::static_file",
             checkpoint_block_number,
-            unwind_target = highest_static_file_block,
+            unwind_target = effective_available_block,
             ?segment,
             "Setting unwind target."
         );
-        return Ok(Some(highest_static_file_block));
+        return Ok(Some(effective_available_block));
     }
```
29 insertions, 6 deletions, one file. Identical in behavior to the `ensure_invariants` portion of PR #21636 — I did NOT port the `ensure_invariants_from_db` portion of #21636, since v2.0.0 renamed that function to `ensure_changeset_invariants_by_block` and my bug doesn't exercise it. For a full upstream fix, that portion should also be ported.
Proposed fixes (in order of preference)
Option 1 (preferred): Reopen PR #21636 or cherry-pick its logic into main. The work is already done and I've verified it works on v2.0.0. PR #21636's approach (use `max(highest_static_file_block, prune_checkpoint_block)` as the effective "data available from" threshold) is cleaner than a skip-path extension because it also handles the `Receipts` variant of the same bug and any future prune-aware segment.
Option 2: Extend PR #21918's skip path to cover distance-pruned senders. Add a `Distance` branch to `is_segment_fully_pruned` (or introduce `is_segment_in_non_static_file_storage`). Simpler than #21636 but narrower — it only fixes `TransactionSenders`, not `Receipts`.
Option 3: Migrate prune config from `reth.toml` to the MDBX `PruneCheckpoints` table before running `check_consistency`. This lets an operator recover by changing `--prune.sender-recovery.distance` → `--prune.sender-recovery.full` in their config and restarting, without hand-patching the binary. Useful as an escape hatch independent of the main fix.
Option 4: Add `--skip-consistency-checks` to `reth db stage-checkpoints set` and `reth stage unwind`. This gives operators a manual escape hatch to repair MDBX state without rebuilding reth. Currently `--skip-consistency-checks` exists only on `reth db stats`, which is read-only. Even a one-time `reth db stage-checkpoints set --skip-consistency-checks --stage sender-recovery --block-number 0` would have let me recover without the ~12 hours of diagnostic and rebuild work this session took.
Option 5 (minimum): Improve diagnostics. The current error — `assertion left != right failed` with both sides equal to `0` — gives operators nothing to act on. A log line that says "TransactionSenders static files are missing but the SenderRecovery stage checkpoint is at block N; if you are using distance pruning, this is likely a known bug — see issue #XXXXX" would meaningfully reduce the diagnostic burden.
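As a sketch of what Option 5 could look like — message text, function name, and parameters are mine, purely illustrative, not a proposed reth API:

```rust
/// Illustrative: an actionable message for the case where a prunable segment
/// has no static files but a non-zero stage checkpoint.
fn missing_segment_hint(segment: &str, stage: &str, checkpoint: u64) -> String {
    format!(
        "{segment} static files are missing but the {stage} stage checkpoint \
         is at block {checkpoint}; if this segment is distance-pruned, this \
         is likely a consistency-check gap rather than data loss"
    )
}

fn main() {
    let msg = missing_segment_hint("TransactionSenders", "SenderRecovery", 24_850_381);
    assert!(msg.contains("block 24850381"));
    println!("{msg}");
}
```

Even without a code change to the check itself, a message in this shape would have pointed me at the prune config on the first crash instead of after hours of bisecting.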
Environment
- OS: macOS 25 (Darwin 25.3.0), Apple Silicon
- Reth version: v2.0.0, commit
eb4c15e5, built 2026-04-07
- Chain: mainnet
- Storage format: Storage v2 (
--storage.v2)
- Consensus client: Lighthouse v8.1.3 (healthy, not involved in the crash)
- Datadir size: ~390 GB (210 GB MDBX + 23 GB RocksDB + remainder headers/transactions/receipts static files)
- Available disk: 3.2 TB free on a 3.6 TB volume
Full `reth node` invocation (from startup script):

```
reth node \
  --datadir /Volumes/ETHDATA/reth \
  --storage.v2 \
  --http \
  --http.addr 127.0.0.1 \
  --http.api eth,net,web3 \
  --ws.addr 127.0.0.1 \
  --authrpc.addr 127.0.0.1 \
  --authrpc.port 8551 \
  --authrpc.jwtsecret /Volumes/ETHDATA/reth/jwt.hex \
  --metrics 127.0.0.1:9001 \
  --full \
  --prune.block-interval 100 \
  --prune.sender-recovery.distance 2000000 \
  --prune.transaction-lookup.distance 2000000 \
  --prune.receipts.distance 2000000 \
  --prune.account-history.distance 2000000 \
  --prune.storage-history.distance 2000000 \
  --prune.bodies.distance 2000000
```
Confirmed not a config typo or one-off: two independent snapshot downloads on the same hardware reproduced the same panic at the same assertion. The trigger is the first prune cycle or segment rotation after startup, not random corruption.