PR #5750 (merged) closed a finality-forgery vulnerability: generic STATE sync was importing finalized epoch data from any peer's snapshot, with only a sender-signature check. A valid peer signature authenticates who sent the snapshot, not that the epoch was actually committed by quorum. #5750 removed that path — epochs now enter epoch_crdt only via the validated EPOCH_COMMIT handler, which requires the receiver to have locally observed accept votes from a quorum.
The gap this creates
A node that was offline during epoch voting + commit has no entries in its local _epoch_votes for that epoch. When it comes back online and a peer sends EPOCH_COMMIT, the new _handle_epoch_commit validation rejects it with unverified_voters — the recovering node never saw the votes, so it can't confirm the quorum.
Result: offline nodes can no longer catch up epoch finality. This is an acceptable short-term tradeoff (closing a forgery hole beats easy catch-up), but it needs a proper fix.
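The rejection can be illustrated with a minimal sketch. The function below is a hypothetical stand-in for the check inside _handle_epoch_commit, not the project's actual code. Its name, the shape of the vote store (a dict mapping epoch number to the set of voter IDs observed locally), the quorum parameter, and the insufficient_quorum reason string are all assumptions. Only the unverified_voters reason comes from the behavior described above.

```python
from typing import Dict, Iterable, Set, Tuple


def check_commit_against_local_votes(
    epoch: int,
    claimed_voters: Iterable[str],
    epoch_votes: Dict[int, Set[str]],
    quorum: int,
) -> Tuple[bool, str]:
    """Hypothetical model of the post-#5750 check: a commit is accepted only
    if every claimed voter was observed locally and they form a quorum."""
    claimed = set(claimed_voters)
    seen_locally = epoch_votes.get(epoch, set())
    if claimed - seen_locally:
        # At least one claimed voter has no locally observed accept vote.
        return False, "unverified_voters"
    if len(claimed) < quorum:
        return False, "insufficient_quorum"  # assumed reason string
    return True, "ok"


# A node that was online for epoch 7 saw the votes; a recovering node did not.
online_node_votes = {7: {"v1", "v2", "v3"}}
recovering_node_votes = {}  # offline during epoch 7, so no entry exists

print(check_commit_against_local_votes(7, ["v1", "v2", "v3"], online_node_votes, 3))
# (True, 'ok')
print(check_commit_against_local_votes(7, ["v1", "v2", "v3"], recovering_node_votes, 3))
# (False, 'unverified_voters')
```

The second call shows the gap: the commit is honest, but the recovering node has nothing local to check it against.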
What's needed
A validated catch-up sync path — something that lets a recovering node re-establish finality for missed epochs without trusting an unverified snapshot. Options to evaluate:
1. Vote-record sync: let a recovering node request the raw signed accept-votes for an epoch from peers, verify each signature itself, then apply the same quorum check that _handle_epoch_commit uses. Finality is then locally derived, not imported.
2. Commit-certificate: have the committing node assemble a quorum certificate (the set of signed accept-votes) and attach it to EPOCH_COMMIT. A recovering node verifies the certificate's signatures and quorum independently.
3. Checkpoint sync: periodic signed checkpoints with their own multi-sig quorum, for nodes that are very far behind.
Option 2 (commit-certificate) is likely the cleanest — it makes EPOCH_COMMIT self-verifying instead of depending on the receiver's local vote history.
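As a rough sketch of what self-verifying could mean here, the receiver would count distinct, known validators whose signed accept-votes check out for exactly this epoch. Everything named below is hypothetical: SignedAcceptVote, vote_message, verify_commit_certificate, the epoch_hash field, and the message layout are illustrative and not taken from the codebase. The signature check is injected as a callable so the sketch does not assume which signature scheme the project uses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable

# verify_sig(public_key, message, signature) -> bool is supplied by the caller.
VerifySig = Callable[[bytes, bytes, bytes], bool]


@dataclass(frozen=True)
class SignedAcceptVote:
    """Hypothetical wire form of one accept vote inside a certificate."""
    epoch: int
    epoch_hash: bytes  # digest of the epoch contents being finalized (assumed)
    voter_id: str
    signature: bytes


def vote_message(epoch: int, epoch_hash: bytes, voter_id: str) -> bytes:
    """Bytes a voter is assumed to sign; the layout is illustrative only."""
    return f"accept|{epoch}|{epoch_hash.hex()}|{voter_id}".encode()


def verify_commit_certificate(
    epoch: int,
    epoch_hash: bytes,
    votes: Iterable[SignedAcceptVote],
    validator_keys: Dict[str, bytes],
    quorum: int,
    verify_sig: VerifySig,
) -> bool:
    """Return True only if at least `quorum` distinct known validators
    signed an accept vote for exactly this epoch and epoch_hash."""
    valid_voters = set()
    for vote in votes:
        if vote.epoch != epoch or vote.epoch_hash != epoch_hash:
            continue  # vote is for a different epoch or different contents
        public_key = validator_keys.get(vote.voter_id)
        if public_key is None:
            continue  # signer is not in the known validator set
        message = vote_message(epoch, epoch_hash, vote.voter_id)
        if verify_sig(public_key, message, vote.signature):
            valid_voters.add(vote.voter_id)  # a set, so repeat votes count once
    return len(valid_voters) >= quorum
```

Repeated votes count once and unknown signers count for nothing, so a peer cannot pad a certificate to reach quorum. The sketch takes the validator set for the epoch as given; it would have to come from a source the recovering node already trusts. The same function could back option 1 if the votes arrive from a peer request rather than inside EPOCH_COMMIT.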
Acceptance
A node offline during epoch N's voting can rejoin and establish finality for epoch N from peer data without any path that accepts unverified finalized flags.
The test_signed_state_sync_cannot_inject_epoch_finality invariant from #5750 (fix: ignore epoch finality from state sync) still holds.
Filed as a follow-up to keep #5750's security fix shippable now while the catch-up path is designed properly.