opt(sync): Bounded look-ahead depth for the PFN/VFN fast-sync (recovery) execute pipeline #762

Open
AshinGau wants to merge 1 commit into Galxe:main from AshinGau:main

@AshinGau AshinGau commented Jun 30, 2026

Contributor

Summary

Speeds up PFN/VFN fast-sync (catch-up) block application by removing the lockstep between
consensus and execution in the recovery path, and parallelizing block-retrieval verification on the
fetch path. The new look-ahead behavior is opt-in via a config field and defaults to the original
strictly-serial behavior, so existing nodes are unchanged unless tuned.

On the internal test box (8 cores, max-persist-gap = 8192, from block 0, 10K→150K window) sustained
catch-up throughput improved from ~496 blk/s (serial) to ~630 blk/s at lookahead=4 (the knee), and up
to ~663 blk/s at lookahead=64, with 0 state-root mismatches across the sweep.

Motivation

During fast-forward sync the recovery apply path fed execution strictly one block at a time:

set_ordered_blocks → await get_executed_res → state-root check → set_commit_blocks → await persist

Consensus and execution never overlapped (reth_blockchain_tree_in_mem_state_num_blocks ≈ 0, the
executor sat idle waiting for the next feed). This capped catch-up throughput well below execution
capacity, while the live validator path was already pipelined.

What's in this PR

  1. Bounded look-ahead pipeline for the recovery apply path (block_storage/block_store.rs).
    The recovery branch now feeds up to K blocks to execution before draining results, so
    consensus→execution runs continuously instead of lockstep-per-block. set_commit_blocks,
    per-block persist ack, and the per-block state-root comparison are preserved; only the feed is
    pipelined. recover_blocks commits up to the highest committable QC in a single
    send_for_execution call so the pipeline can look ahead across the whole path.
    send_for_execution was also refactored from one ~240-line function into a thin dispatcher plus
    focused helpers (commit_recovery_path / pipeline_recovery_blocks / drain_recovery_block /
    commit_live_path).

  2. Parallelize block-retrieval signature verification
    (consensus-types/src/block_retrieval.rs). BlockRetrievalResponse::verify now splits the cheap
    sequential chain-linkage check from the per-block BLS aggregate verify (~1.8 ms/block) and runs
    the latter with rayon. A single fast-sync response can carry up to 1000 blocks, so the serial
    verify was the dominant per-chunk cost on the requester.

  3. Pipeline fast-sync chunk verify off the fetch loop (block_storage/sync_manager.rs,
    network.rs). request_block gains a verify: bool flag; retrieve_block_for_id fetches each
    chunk unverified and verifies it on a blocking thread via a bounded (depth-4) pipeline, so chunk
    N's verify overlaps chunk N+1's fetch. Every chunk is still verified before its blocks are
    returned or used; a failed verify bails before use. The zero-start-id (epoch-anchored first
    chunk) guard is unchanged.

  4. New config knob consensus.fast_sync_execute_lookahead (see Configuration). Replaces the
    prior FAST_SYNC_EXECUTE_LOOKAHEAD environment variable. Threaded from ConsensusConfig through
    epoch_manager into BlockStore.

  5. Dependency bumps

    • gaptos e9544c8 → b9e5c60 — adds the fast_sync_execute_lookahead field to ConsensusConfig
      (gravity-aptos change; config-crate-only, api-types unchanged).
    • greth (gravity-reth) 1aec7b75 → 6ea2ba32.
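
Items 2 and 3 can be sketched together using only the standard library. This is an illustration under stated assumptions, not the code in this PR: Block, sig_ok, check_linkage, check_signatures, verify_chunk, and fetch_and_verify are hypothetical names, scoped threads stand in for rayon, a plain spawned thread stands in for the blocking-thread pool, and the newest-first ordering inside a chunk is assumed. Linkage across chunk boundaries is not modeled.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical stand-in for a retrieved block (not the real consensus type).
#[derive(Clone, Debug)]
pub struct Block {
    pub id: u64,
    pub parent_id: u64,
    /// Stands in for the outcome of the per-block BLS aggregate verify.
    pub sig_ok: bool,
}

/// Cheap sequential pass. Assumes a chunk is ordered newest-first, so each
/// block's parent is the next element of the slice.
fn check_linkage(chunk: &[Block]) -> Result<(), String> {
    for pair in chunk.windows(2) {
        if pair[0].parent_id != pair[1].id {
            return Err(format!("block {} does not link to block {}", pair[0].id, pair[1].id));
        }
    }
    Ok(())
}

/// Expensive per-block pass, split across `workers` scoped threads.
fn check_signatures(chunk: &[Block], workers: usize) -> Result<(), String> {
    if chunk.is_empty() {
        return Ok(());
    }
    let workers = workers.max(1);
    let per_worker = (chunk.len() + workers - 1) / workers;
    let results: Vec<bool> = thread::scope(|s| {
        let handles: Vec<_> = chunk
            .chunks(per_worker)
            .map(|part| s.spawn(move || part.iter().all(|b| b.sig_ok)))
            .collect();
        handles.into_iter().map(|h| h.join().unwrap_or(false)).collect()
    });
    if results.iter().all(|ok| *ok) {
        Ok(())
    } else {
        Err("signature verification failed".to_string())
    }
}

/// Full chunk verification: linkage first, then the parallel signature pass.
pub fn verify_chunk(chunk: &[Block], workers: usize) -> Result<(), String> {
    check_linkage(chunk)?;
    check_signatures(chunk, workers)
}

/// Fetches chunks on the calling thread while a worker verifies earlier ones.
/// `depth` bounds how many fetched-but-unverified chunks may be queued.
pub fn fetch_and_verify<F>(num_chunks: usize, depth: usize, mut fetch: F) -> Result<Vec<Block>, String>
where
    F: FnMut(usize) -> Vec<Block>,
{
    let (tx, rx) = sync_channel::<Vec<Block>>(depth);
    let verifier = thread::spawn(move || -> Result<Vec<Block>, String> {
        let mut verified = Vec::new();
        for chunk in rx {
            // Bail before any block of a failing chunk is handed out.
            verify_chunk(&chunk, 4)?;
            verified.extend(chunk);
        }
        Ok(verified)
    });
    for i in 0..num_chunks {
        // `send` blocks once `depth` chunks are queued, and fails if the
        // verifier has already bailed and dropped its receiver.
        if tx.send(fetch(i)).is_err() {
            break;
        }
    }
    drop(tx); // closing the channel lets the verifier loop finish
    verifier.join().map_err(|_| "verifier thread panicked".to_string())?
}
```

Calling fetch_and_verify(n, 4, fetch) mirrors the depth-4 bound described in item 3: the fetch loop can run at most four chunks ahead of verification, and no block is returned unless its whole chunk verified.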

Configuration

The look-ahead depth is a node-local consensus setting, a sibling of max_blocks_per_receiving_request.
Set it in the node's config YAML (e.g. public_full_node.yaml) under the consensus key:

consensus:
  # Bounded look-ahead depth for the fast-sync (recovery) execute pipeline.
  fast_sync_execute_lookahead: 4
  • Type / default: usize, default 1. The field is #[serde(default)], so omitting it keeps
    the default — existing configs need no change.
  • 1 = original strictly-serial recovery (byte-for-byte the previous behavior).
  • 4 to 8 recommended for PFN/VFN catch-up nodes (4 is the throughput knee; higher values give
    diminishing returns and increase in-memory pipeline depth and memory use, so avoid very large
    values on memory-constrained hosts).
  • Scope: affects the recovery / fast-sync path only; the live validator path is untouched.
  • No environment variable is involved anymore.

Requires the gaptos bump in this PR (b9e5c60) which defines the field. On memory-constrained
catch-up nodes also set the reth persist gap (--gravity.cache.max-persist-gap) appropriately;
that is the dominant single lever for catch-up throughput and is configured separately on the reth
side.

Correctness

The look-ahead pipelines the feed, not the commit. Safety is preserved because:

  • Ordering: blocks are fed in path order (parent before child) and drained FIFO, so commits and
    persists happen strictly in block order.
  • Per-block state-root gate: every drained block compares its computed state root against the
    expected hash from the ledger DB (drain_recovery_block). Any execution divergence surfaces here
    as a graceful error (ensure!) instead of silently committing a bad root.
  • Execution dependency: block N+1 executes on block N's in-memory post-state (the executor's
    in-memory state chain) — the same mechanism the live validator path already relies on; feeding
    ahead does not change execution inputs.
  • lookahead = 1 is the original behavior (feed→drain→feed→drain), and the recovery-vs-live
    dispatch keeps the live path identical.
  • recover_blocks highest-QC jump is end-state equivalent: commit_callback prunes up to the
    last block and raises highest_commit_cert monotonically, and all committable QCs are in the same
    epoch as the latest ledger info, so one call reaches the same end state as the old per-QC sequence.
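
The ordering and state-root arguments above can be illustrated with a small feed/drain loop. This is a toy model under stated assumptions, not the PR's implementation: RecoveryBlock, execute, Event, and apply_recovery_path are hypothetical names, execution is simulated synchronously at feed time (the real executor runs asynchronously), and expected_root stands in for the hash read from the ledger DB.

```rust
use std::collections::VecDeque;

/// Hypothetical recovery block; `expected_root` stands in for the ledger DB hash.
#[derive(Clone, Debug)]
pub struct RecoveryBlock {
    pub number: u64,
    pub expected_root: u64,
}

/// Toy executor: a block's root depends on its parent's post-state,
/// mirroring the in-memory state chain.
pub fn execute(parent_root: u64, number: u64) -> u64 {
    parent_root.wrapping_mul(31).wrapping_add(number)
}

#[derive(Clone, Debug, PartialEq)]
pub enum Event {
    Feed(u64),
    Commit(u64),
}

/// Feeds up to `lookahead` blocks before draining, then commits FIFO.
/// Returns the interleaving of feeds and commits, or the first mismatch.
pub fn apply_recovery_path(
    blocks: &[RecoveryBlock],
    start_root: u64,
    lookahead: usize,
) -> Result<Vec<Event>, String> {
    let k = lookahead.max(1); // treat 0 as serial in this sketch
    let mut events = Vec::new();
    let mut in_flight: VecDeque<(&RecoveryBlock, u64)> = VecDeque::new();
    let mut tip_root = start_root;
    let mut pending = blocks.iter();
    loop {
        // Feed in path order (parent before child) until K blocks are in flight.
        while in_flight.len() < k {
            match pending.next() {
                Some(block) => {
                    tip_root = execute(tip_root, block.number);
                    in_flight.push_back((block, tip_root));
                    events.push(Event::Feed(block.number));
                }
                None => break,
            }
        }
        // Drain the oldest block: state-root gate first, then commit.
        match in_flight.pop_front() {
            Some((block, computed)) => {
                if computed != block.expected_root {
                    return Err(format!(
                        "state root mismatch at block {}: expected {}, computed {}",
                        block.number, block.expected_root, computed
                    ));
                }
                events.push(Event::Commit(block.number));
            }
            None => return Ok(events),
        }
    }
}
```

With lookahead = 1 the event trace is feed, commit, feed, commit; with a larger value the feeds run ahead while commits still come out in block order. Blocks fed after a mismatching block are never committed.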

Benchmarks

K-sweep, from block 0, 10K→150K window, max-persist-gap=8192, internal test box (8 cores):

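| fast_sync_execute_lookahead | block tps | in_mem depth | state-root mismatches |
|-----------------------------|-----------|--------------|-----------------------|
| 1 (serial)                  | ~496      | ~53          | 0                     |
| 4 (knee)                    | ~630      | ~147         | 0                     |
| 16                          | ~645      | ~153         | 0                     |
| 64                          | ~663      | ~160         | 0                     |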

Beyond the knee, throughput is bounded by per-block execution (merklize / state-root), not the
consensus→execution handoff. Validated at small state (0–~200K); high-state validation is follow-up.
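
The "knee" reading of the sweep is simple arithmetic on the table. A minimal sketch, with the approximate throughput figures copied from the table above and hypothetical names (SWEEP, gain_percent, gains_over_serial):

```rust
/// Approximate throughput per lookahead setting, copied from the K-sweep table (blk/s).
pub const SWEEP: [(u32, f64); 4] = [(1, 496.0), (4, 630.0), (16, 645.0), (64, 663.0)];

/// Relative gain of `tuned` over `baseline`, in percent.
pub fn gain_percent(baseline: f64, tuned: f64) -> f64 {
    (tuned / baseline - 1.0) * 100.0
}

/// Gain of every lookahead setting over the serial (lookahead = 1) baseline.
pub fn gains_over_serial() -> Vec<(u32, f64)> {
    let serial = SWEEP[0].1;
    SWEEP
        .iter()
        .map(|&(k, tps)| (k, gain_percent(serial, tps)))
        .collect()
}
```

On these rounded figures, lookahead = 4 is roughly 27% faster than serial, while going from 4 to 64 adds only about 5% more, which is why 4 is treated as the knee.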

Testing

  • Existing consensus unit/fuzz tests updated for the new BlockStore constructor argument
    (round_manager_test, round_manager_fuzzing, test_utils), all passing at lookahead = 1 (serial).
  • Manual catch-up runs on the internal test box (local seed → catch-up node), measuring
    reth_blockchain_tree_canonical_chain_height over time at lookahead = 1/4/16/64.

Notes / follow-ups

  • A state-root mismatch during recovery is currently a graceful, non-fatal error (logged with
    expected vs computed). A mismatch is a deterministic corruption signal that a restart will
    re-hit at the same block; consider making it fatal so the node fails fast rather than continuing
    with in-flight, uncommitted blocks left in the buffer. (In-flight uncommitted blocks are
    in-memory only and are safely discarded on restart.)
  • High-state (multi-million block) validation of the look-ahead is still pending.
