opt(sync): Bounded look-ahead depth for the PFN/VFN fast-sync (recovery) execute pipeline by AshinGau · Pull Request #762 · Galxe/gravity-sdk

AshinGau · 2026-06-30T04:21:19Z

Summary

Speeds up PFN/VFN fast-sync (catch-up) block application by removing the lockstep between
consensus and execution in the recovery path, and parallelizing block-retrieval verification on the
fetch path. The new look-ahead behavior is opt-in via a config field and defaults to the original
strictly-serial behavior, so existing nodes are unchanged unless tuned.

On the internal test box (8 cores, max-persist-gap = 8192, from block 0, 10K→150K window), sustained catch-up
throughput improved from ~496 blk/s (serial) to ~630 blk/s at lookahead=4 (the knee), up to
~663 blk/s at lookahead=64, with 0 state-root mismatches across the sweep.

Motivation

During fast-forward sync the recovery apply path fed execution strictly one block at a time:

set_ordered_blocks → await get_executed_res → state-root check → set_commit_blocks → await persist

Consensus and execution never overlapped (reth_blockchain_tree_in_mem_state_num_blocks ≈ 0, the
executor sat idle waiting for the next feed). This capped catch-up throughput well below execution
capacity, while the live validator path was already pipelined.

What's in this PR

Bounded look-ahead pipeline for the recovery apply path (block_storage/block_store.rs).
The recovery branch now feeds up to K blocks to execution before draining results, so
consensus→execution runs continuously instead of lockstep-per-block. set_commit_blocks,
per-block persist ack, and the per-block state-root comparison are preserved; only the feed is
pipelined. recover_blocks commits up to the highest committable QC in a single
send_for_execution call so the pipeline can look ahead across the whole path.
send_for_execution was also refactored from one ~240-line function into a thin dispatcher plus
focused helpers (commit_recovery_path / pipeline_recovery_blocks / drain_recovery_block /
commit_live_path).
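
The feed/drain shape described above can be sketched with standard-library channels. This is a minimal illustration under stated assumptions, not the code in block_store.rs: Block, execute, make_path and pipeline_recovery are hypothetical names, a worker thread stands in for the executor, and a u64 stands in for the state root.

```rust
use std::collections::VecDeque;
use std::sync::mpsc;
use std::thread;

/// Hypothetical stand-in for a block on the recovery path.
#[derive(Clone, Debug)]
struct Block {
    number: u64,
    /// Stand-in for the state root recorded in the ledger DB.
    expected_root: u64,
}

/// Toy execution: the root of block N depends on the root of block N-1.
fn execute(parent_root: u64, number: u64) -> u64 {
    parent_root.wrapping_mul(31).wrapping_add(number)
}

/// Builds a consistent path of `len` blocks, numbered from 1.
fn make_path(len: u64) -> Vec<Block> {
    let mut root = 0u64;
    (1..=len)
        .map(|number| {
            root = execute(root, number);
            Block { number, expected_root: root }
        })
        .collect()
}

/// Keeps up to `lookahead` blocks in flight, then drains the oldest one,
/// checks its root, and "commits" it. Returns block numbers in commit order.
fn pipeline_recovery(blocks: &[Block], lookahead: usize) -> Result<Vec<u64>, String> {
    let lookahead = lookahead.max(1);
    let (feed_tx, feed_rx) = mpsc::channel::<u64>();
    let (res_tx, res_rx) = mpsc::channel::<(u64, u64)>();

    // Executor: runs blocks in feed order on the previous in-memory post-state.
    let executor = thread::spawn(move || {
        let mut root = 0u64;
        for number in feed_rx {
            root = execute(root, number);
            if res_tx.send((number, root)).is_err() {
                break;
            }
        }
    });

    let mut pending = blocks.iter();
    let mut in_flight: VecDeque<&Block> = VecDeque::new();
    let mut committed = Vec::new();
    let mut outcome = Ok(());

    loop {
        // Feed until the look-ahead window is full or the path is exhausted.
        while in_flight.len() < lookahead {
            match pending.next() {
                Some(block) => {
                    feed_tx.send(block.number).expect("executor thread is alive");
                    in_flight.push_back(block);
                }
                None => break,
            }
        }
        // Drain the oldest in-flight block (FIFO), gate on its state root.
        let block = match in_flight.pop_front() {
            Some(block) => block,
            None => break,
        };
        let (number, root) = res_rx.recv().expect("one result per fed block");
        if number != block.number || root != block.expected_root {
            outcome = Err(format!("state root mismatch at block {}", block.number));
            break;
        }
        committed.push(number);
    }

    drop(feed_tx); // lets the executor loop end
    executor.join().expect("executor thread panicked");
    outcome.map(|()| committed)
}
```

With lookahead = 1 the loop degenerates to feed, drain, feed, drain; larger values only change how many blocks are in flight, not the commit order.
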
Parallelize block-retrieval signature verification
(consensus-types/src/block_retrieval.rs). BlockRetrievalResponse::verify now splits the cheap
sequential chain-linkage check from the per-block BLS aggregate verify (~1.8 ms/block) and runs
the latter with rayon. A single fast-sync response can carry up to 1000 blocks, so the serial
verify was the dominant per-chunk cost on the requester.
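
A rough sketch of that split, using std::thread::scope as a dependency-free stand-in for rayon (the PR itself uses rayon). RetrievedBlock, toy_sign, check_linkage and verify_response are hypothetical names, the signature check is a toy, and the newest-first block order is an assumption of this sketch.

```rust
use std::thread;

/// Hypothetical block carried in a retrieval response.
#[derive(Clone, Debug)]
struct RetrievedBlock {
    id: u64,
    parent_id: u64,
    payload: u64,
    /// Toy stand-in for the aggregate signature.
    signature: u64,
}

/// Toy "signing" function so the example is self-contained.
fn toy_sign(id: u64, payload: u64) -> u64 {
    id.rotate_left(17) ^ payload.wrapping_mul(0x9E37_79B9_7F4A_7C15)
}

/// Stand-in for the expensive per-block signature verification.
fn verify_signature(block: &RetrievedBlock) -> bool {
    block.signature == toy_sign(block.id, block.payload)
}

/// Cheap sequential pass: each block must be the parent of the one before it.
fn check_linkage(start_id: u64, blocks: &[RetrievedBlock]) -> Result<(), String> {
    let mut expected = start_id;
    for block in blocks {
        if block.id != expected {
            return Err(format!("expected block {}, got {}", expected, block.id));
        }
        expected = block.parent_id;
    }
    Ok(())
}

/// Linkage first (sequential), then signatures in parallel over `workers` chunks.
fn verify_response(
    start_id: u64,
    blocks: &[RetrievedBlock],
    workers: usize,
) -> Result<(), String> {
    check_linkage(start_id, blocks)?;
    if blocks.is_empty() {
        return Ok(());
    }
    let workers = workers.max(1);
    let chunk_len = (blocks.len() + workers - 1) / workers;
    let all_valid = thread::scope(|scope| {
        let handles: Vec<_> = blocks
            .chunks(chunk_len)
            .map(|chunk| scope.spawn(move || chunk.iter().all(verify_signature)))
            .collect();
        handles
            .into_iter()
            .all(|handle| handle.join().expect("verify worker panicked"))
    });
    if all_valid {
        Ok(())
    } else {
        Err("signature verification failed".to_string())
    }
}

/// Builds a well-formed response of `count` blocks, newest first.
fn make_response(newest_id: u64, count: u64) -> Vec<RetrievedBlock> {
    (0..count)
        .map(|i| {
            let id = newest_id - i;
            let payload = id * 7;
            RetrievedBlock { id, parent_id: id - 1, payload, signature: toy_sign(id, payload) }
        })
        .collect()
}
```

Running the linkage pass first keeps the sequential dependency (each expected id comes from the previous block) out of the parallel section, which then only contains independent per-block checks.
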
Pipeline fast-sync chunk verify off the fetch loop (block_storage/sync_manager.rs,
network.rs). request_block gains a verify: bool flag; retrieve_block_for_id fetches each
chunk unverified and verifies it on a blocking thread via a bounded (depth-4) pipeline, so chunk
N's verify overlaps chunk N+1's fetch. Every chunk is still verified before its blocks are
returned/used; a failed verify bails before use. The zero-start-id (epoch-anchored first chunk)
guard is unchanged.
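
The fetch/verify overlap can be illustrated with a bounded channel: the fetch loop pushes unverified chunks into a queue of depth 4 while a second thread verifies them, and nothing is returned until every chunk has passed. This is a sketch under assumptions, not the sync_manager.rs code: retrieve_verified, chunk_is_consecutive and the Vec<u64> chunk type are hypothetical, and a plain OS thread stands in for the blocking thread.

```rust
use std::sync::mpsc;
use std::thread;

/// Pipeline depth named in the PR description.
const VERIFY_PIPELINE_DEPTH: usize = 4;

/// Toy chunk check: block numbers inside a chunk must be consecutive.
fn chunk_is_consecutive(chunk: &[u64]) -> bool {
    chunk.windows(2).all(|pair| pair[1] == pair[0] + 1)
}

/// Fetches `num_chunks` chunks on the calling thread and verifies them on a
/// second thread. Blocks are only returned once all chunks are verified.
fn retrieve_verified<F, V>(num_chunks: usize, fetch: F, verify: V) -> Result<Vec<u64>, String>
where
    F: Fn(usize) -> Vec<u64>,
    V: Fn(&[u64]) -> bool + Send,
{
    thread::scope(|scope| {
        // Bounded queue: the fetch loop blocks once 4 chunks await verification.
        let (tx, rx) = mpsc::sync_channel::<(usize, Vec<u64>)>(VERIFY_PIPELINE_DEPTH);

        let verifier = scope.spawn(move || {
            let mut verified = Vec::new();
            for (index, chunk) in rx {
                if !verify(chunk.as_slice()) {
                    // Bail before any block of this chunk is used.
                    return Err(format!("chunk {index} failed verification"));
                }
                verified.extend(chunk);
            }
            Ok(verified)
        });

        for index in 0..num_chunks {
            let chunk = fetch(index); // chunk N+1 is fetched while chunk N verifies
            if tx.send((index, chunk)).is_err() {
                break; // verifier already bailed; stop fetching
            }
        }
        drop(tx); // signals end of input to the verifier
        verifier.join().expect("verifier thread panicked")
    })
}
```

The bound keeps a slow verifier from letting unverified chunks pile up in memory, at the cost of stalling the fetch loop when the queue is full.
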
New config knob consensus.fast_sync_execute_lookahead (see Configuration). Replaces the
prior FAST_SYNC_EXECUTE_LOOKAHEAD environment variable. Threaded from ConsensusConfig through
epoch_manager into BlockStore.
Dependency bumps
- gaptos e9544c8 → b9e5c60 — adds the fast_sync_execute_lookahead field to ConsensusConfig
  (gravity-aptos change; config-crate-only, api-types unchanged).
- greth (gravity-reth) 1aec7b75 → 6ea2ba32.

Configuration

The look-ahead depth is a node-local consensus setting, a sibling of max_blocks_per_receiving_request.
Set it in the node's config YAML (e.g. public_full_node.yaml) under the consensus key:

consensus:
  # Bounded look-ahead depth for the fast-sync (recovery) execute pipeline.
  fast_sync_execute_lookahead: 4

Type / default: usize, default 1. The field is #[serde(default)], so omitting it keeps
the default — existing configs need no change.
1 = original strictly-serial recovery (byte-for-byte the previous behavior).
4–8 recommended for PFN/VFN catch-up nodes (4 is the throughput knee; higher values give
diminishing returns while increasing in-memory pipeline depth and memory use, so avoid very large
values on memory-constrained hosts).
Scope: affects the recovery / fast-sync path only; the live validator path is untouched.
No environment variable is involved anymore.

Requires the gaptos bump in this PR (b9e5c60) which defines the field. On memory-constrained
catch-up nodes also set the reth persist gap (--gravity.cache.max-persist-gap) appropriately;
that is the dominant single lever for catch-up throughput and is configured separately on the reth
side.

Correctness

The look-ahead pipelines the feed, not the commit. Safety is preserved because:

Ordering: blocks are fed in path order (parent before child) and drained FIFO, so commits and
persists happen strictly in block order.
Per-block state-root gate: every drained block compares its computed state root against the
expected hash from the ledger DB (drain_recovery_block). Any execution divergence surfaces here
as a graceful error (ensure!) instead of silently committing a bad root.
Execution dependency: block N+1 executes on block N's in-memory post-state (the executor's
in-memory state chain) — the same mechanism the live validator path already relies on; feeding
ahead does not change execution inputs.
lookahead = 1 is the original behavior (feed→drain→feed→drain), and the recovery-vs-live
dispatch keeps the live path identical.
recover_blocks highest-QC jump is end-state equivalent: commit_callback prunes up to the
last block and raises highest_commit_cert monotonically, and all committable QCs are in the same
epoch as the latest ledger info, so one call reaches the same end state as the old per-QC sequence.
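
The end-state argument for the highest-QC jump can be checked on a toy model. The sketch below assumes only what the last point states (pruning up to the committed block and a monotonically raised highest commit); CommitState and commit_up_to are hypothetical names, rounds are plain integers, and the same-epoch condition is taken as given rather than modeled.

```rust
/// Hypothetical, simplified view of the state touched by a commit.
#[derive(Clone, Debug, PartialEq, Eq)]
struct CommitState {
    /// Stand-in for highest_commit_cert, reduced to a round number.
    highest_commit_round: u64,
    /// Rounds of blocks still held in the in-memory tree.
    uncommitted: Vec<u64>,
}

impl CommitState {
    /// Toy commit: raise the highest commit monotonically, then prune
    /// every block at or below it.
    fn commit_up_to(&mut self, qc_round: u64) {
        self.highest_commit_round = self.highest_commit_round.max(qc_round);
        let floor = self.highest_commit_round;
        self.uncommitted.retain(|round| *round > floor);
    }
}

/// Applies one commit per QC, in the given order.
fn commit_per_qc(mut state: CommitState, qc_rounds: &[u64]) -> CommitState {
    for &round in qc_rounds {
        state.commit_up_to(round);
    }
    state
}

/// Applies a single commit at the highest QC only.
fn commit_highest_only(mut state: CommitState, qc_rounds: &[u64]) -> CommitState {
    if let Some(&highest) = qc_rounds.iter().max() {
        state.commit_up_to(highest);
    }
    state
}
```

Because both the max and the prune are idempotent and monotone in the round, the per-QC sequence and the single jump land on the same state in this model.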

Benchmarks

K-sweep, from block 0, 10K→150K window, max-persist-gap=8192, internal test box (8 cores):

`fast_sync_execute_lookahead`	throughput (blk/s)	`in_mem` depth
1 (serial)	~496	~53
4 (knee)	~630	~147
16	~645	~153
64	~663	~160

Beyond the knee, throughput is bounded by per-block execution (merklize / state-root), not the
consensus→execution handoff. Validated only at small state (blocks 0 to ~200K); high-state validation is a follow-up.
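
For a quick read of the sweep, the relative gains can be computed directly from the table. The helper names below are illustrative; the figures are the approximate values from the table, so the ratios are approximate too.

```rust
/// (lookahead, approximate blocks per second) rows copied from the K-sweep table.
const SWEEP: [(u32, f64); 4] = [(1, 496.0), (4, 630.0), (16, 645.0), (64, 663.0)];

/// Speedup of each row relative to the serial (lookahead = 1) row.
fn speedups_over_serial() -> Vec<(u32, f64)> {
    let serial = SWEEP[0].1;
    SWEEP.iter().map(|&(k, tps)| (k, tps / serial)).collect()
}

/// Fractional gain of each row over the previous row: (from_k, to_k, gain).
fn marginal_gains() -> Vec<(u32, u32, f64)> {
    SWEEP
        .windows(2)
        .map(|pair| (pair[0].0, pair[1].0, pair[1].1 / pair[0].1 - 1.0))
        .collect()
}
```

This gives roughly 1.27x at lookahead=4 and 1.34x at lookahead=64 over serial, while the steps 4→16 and 16→64 each add under 3%, which is what makes 4 the knee.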

Testing

Existing consensus unit/fuzz tests updated for the new BlockStore constructor argument
(round_manager_test, round_manager_fuzzing, test_utils); all of them pass 1 (serial) for that argument.
Manual catch-up runs on the internal test box (local seed → catch-up node), measuring
reth_blockchain_tree_canonical_chain_height over time at lookahead = 1/4/16/64.

Notes / follow-ups

A state-root mismatch during recovery is currently a graceful, non-fatal error (logged with
expected vs computed). A mismatch is a deterministic corruption signal that a restart will
re-hit at the same block; consider making it fatal so the node fails fast rather than continuing
with in-flight, uncommitted blocks left in the buffer. (In-flight uncommitted blocks are
in-memory only and are safely discarded on restart.)
High-state (multi-million block) validation of the look-ahead is still pending.

AshinGau force-pushed the main branch from 6523917 to aee5fd7 (June 30, 2026 05:03)

opt(sync): Bounded look-ahead depth for the PFN/VFN fast-sync (recovery) execute pipeline
4badcc7

AshinGau force-pushed the main branch from aee5fd7 to 4badcc7 (June 30, 2026 06:28)

AshinGau wants to merge 1 commit into Galxe:main from AshinGau:main
