Collator stability fixes #214
Merged
Pull request overview
This PR addresses multiple collator/validator stability and parity issues across the VM, block primitives, Simplex consensus (restart recovery + invariant hardening + repair validation), and ADNL transport (RLDP/QUIC resiliency + metrics).
Changes:
- Fix VM `CDEPTH` to report represented depth for Merkle-proof pruned-branch cells and add coverage.
- Harden validator/collator and Simplex restart/recovery paths (skip/final cert replay, skipscan non-panicking fallbacks, pruned dispatch-queue handling) and tighten `requestCandidate` repair validation/merging.
- Improve ADNL RLDP/QUIC robustness (worker pool, reconnect regression tests, size caps, counters exported via `metrics`) and update versions/docs/changelog.
Reviewed changes
Copilot reviewed 38 out of 61 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/vm/tests/test_cell_serialization.rs | Adds regression test for CDEPTH on pruned-branch cells in Merkle proofs. |
| src/vm/src/executor/serialization.rs | Changes CDEPTH to always use stored/represented depth (fixes pruned-branch behavior). |
| src/node/src/validator/validator_group.rs | Alters candidate-observation handling and resolver-cache updates in validator group. |
| src/node/src/validator/validate_query.rs | Treats pruned dispatch-queue access as a “have unprocessed queue” condition instead of hard error. |
| src/node/src/validator/tests/test_validator_group.rs | Updates test to feed a real deserializable block body consistent with observed block id. |
| src/node/src/validator/tests/test_collator.rs | Adds bundle-driven regression tests for pruned dispatch queue / removed cells in proof. |
| src/node/src/validator/accept_block.rs | Gates top-shard-descr promotion on signature finality for Simplex. |
| src/node/src/tests/test_signature.rs | Adds tests for top-shard-descr promotion rules (ordinary vs final/notarized Simplex). |
| src/node/src/tests/static/0.e000000000000000_67704975_6cea4ad1_collator_test_bundle/index.json | Adds a new static collator test bundle index. |
| src/node/src/tests/static/0.8000000000000000_76770988_db9fc78e_collator_test_bundle/index.json | Adds a new static collator test bundle index. |
| src/node/simplex/src/tests/test_simplex_state.rs | Adds extensive regression coverage for restart parity, blocker recovery, skipscan fallback semantics. |
| src/node/simplex/src/tests/test_restart.rs | Extends restart recovery tests for final/skip cert replay ordering and listener hooks. |
| src/node/simplex/src/tests/test_receiver.rs | Improves receiver tests to use valid leader signatures and quorum notar bytes for repair paths. |
| src/node/simplex/src/tests/test_candidate_resolver.rs | Adds tests for merge semantics and requestCandidate response validation (identity + notar signatures). |
| src/node/simplex/src/startup_recovery.rs | Adds startup replay mode + restores persisted SkipCert/FinalCert before restart skip generation. |
| src/node/simplex/src/simplex_state.rs | Moves diagnostics types, adds startup replay gating, makes skipscan non-panicking with bounded fallback. |
| src/node/simplex/src/receiver.rs | Validates repair responses before merge/cache; preserves C++-like partial merge semantics safely. |
| src/node/simplex/README.md | Updates version/semantics notes and documents new stability/async DB/restart/observer behaviors. |
| src/node/simplex/CHANGELOG.md | Adds 0.7.1 release notes capturing the parity/stability work. |
| src/node/simplex/Cargo.toml | Bumps simplex crate version to 0.7.1. |
| src/node/consensus-common/src/tests/test_async_key_value_storage.rs | Updates tests to assert the new typed “already taken” sentinel behavior. |
| src/node/consensus-common/src/lib.rs | Introduces StorageResultAlreadyTaken typed sentinel and documents downcast-based detection. |
| src/node/consensus-common/src/async_key_value_storage.rs | Emits StorageResultAlreadyTaken instead of stringly-typed errors on taken results. |
| src/node/Cargo.toml | Bumps node crate version to 0.9.0. |
| src/Cargo.lock | Updates lockfile for version bumps and added deps (e.g., metrics). |
| src/block/src/storage_stat.rs | Ensures removed cells are marked visited in UsageTree so proofs include removals. |
| src/block/src/dictionary/mod.rs | Hardens dictionary label reading/iteration against pruned cells; propagates pruned access as PrunedCellAccess. |
| src/block/src/dictionary/hashmapaug.rs | Adds/notes diff-scanning helper doc comment in macro-generated API. |
| src/adnl/tests/test_quic.rs | Adds reconnect regression test and updates imports/utilization. |
| src/adnl/src/rldp/send.rs | Adds SendWorkerPool and refactors send paths/state to reduce coupling and improve pacing. |
| src/adnl/src/rldp/recv.rs | Adjusts RLDP total-size cap to account for TL overhead. |
| src/adnl/src/rldp/mod.rs | Refactors outbound job scheduling/execution and aligns closed-transfer behavior; tracks extended caps separately. |
| src/adnl/src/quic/stat.rs | Introduces event counters with per-dump windows + cumulative Prometheus counters via metrics. |
| src/adnl/src/quic/mod.rs | Uses new counters, pre-registers metrics, and adjusts sender-task connect-failure/flush/retry behavior. |
| src/adnl/src/adnl/node.rs | Adds periodic yields in heavy/expired/non-channeled paths to reduce worker-thread monopolization. |
| src/adnl/Cargo.toml | Adds metrics dependency for QUIC transport metrics. |
| src/adnl/benches/bench_rldp.rs | Tweaks debug loss simulation and timeout to use probabilistic drop. |
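
The `consensus-common` rows above describe replacing stringly-typed errors with a typed `StorageResultAlreadyTaken` sentinel that callers detect by downcast. The following is a minimal sketch of that pattern, not the crate's actual code: the sentinel's fields, the `ResultSlot` type, the `BoxError` alias, and the `is_already_taken` helper are all assumptions made for illustration.

```rust
use std::error::Error;
use std::fmt;

/// Hypothetical stand-in for the typed sentinel named in the table.
/// The real type in `consensus-common` may carry different data.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct StorageResultAlreadyTaken;

impl fmt::Display for StorageResultAlreadyTaken {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "storage result already taken")
    }
}

impl Error for StorageResultAlreadyTaken {}

/// Assumed error alias for this sketch.
pub type BoxError = Box<dyn Error + Send + Sync + 'static>;

/// Hypothetical one-shot result slot: the value can be taken exactly once.
pub struct ResultSlot<T> {
    value: Option<T>,
}

impl<T> ResultSlot<T> {
    pub fn new(value: T) -> Self {
        Self { value: Some(value) }
    }

    /// Returns the value on the first call and the typed sentinel afterwards.
    pub fn take(&mut self) -> Result<T, BoxError> {
        self.value
            .take()
            .ok_or_else(|| BoxError::from(StorageResultAlreadyTaken))
    }
}

/// Detects the sentinel by type instead of matching on the error message.
pub fn is_already_taken(err: &(dyn Error + Send + Sync + 'static)) -> bool {
    err.downcast_ref::<StorageResultAlreadyTaken>().is_some()
}
```

Matching on the type rather than on message text means the check keeps working if the wording of the error changes, which is presumably the motivation for the change described in the table.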
…ion recovery

Address Copilot review on #214:
- receiver: reject an empty or zero-weight validator set in `validate_repair_notar_signature_set` before the `threshold_66` quorum gate, since `threshold_66(0) == 0` would otherwise accept a signature-less notar response.
- validator_group: on candidate-body deserialize failure, record the observation flags-only (`block = None`) instead of dropping it, so flag updates can OR-merge and a later valid body can overwrite the entry instead of stranding resolver waiters.

Add regression tests for both paths.
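
The receiver fix can be illustrated with a small sketch of a quorum gate that rejects an empty or zero-weight validator set before comparing against the two-thirds threshold. Everything here is hypothetical: the `threshold_66` body (the real function may round differently), the `RepairNotarError` enum, and `check_notar_quorum` are stand-ins, and real signature verification is omitted entirely.

```rust
use std::collections::BTreeSet;

/// Hypothetical threshold: ceil(2 * total / 3). Note that it yields 0 for a
/// total weight of 0, which is the edge case the commit message describes.
pub fn threshold_66(total_weight: u64) -> u64 {
    // Widen to u128 so the multiplication cannot overflow.
    ((total_weight as u128 * 2 + 2) / 3) as u64
}

#[derive(Debug, PartialEq, Eq)]
pub enum RepairNotarError {
    EmptyValidatorSet,
    BelowQuorum { signed: u64, required: u64 },
}

/// Sketch of the gate: `validator_weights[i]` is the weight of validator `i`,
/// and `signer_indices` lists validators whose signatures were already verified.
/// Assumes the weights are small enough that their sum fits in u64.
pub fn check_notar_quorum(
    validator_weights: &[u64],
    signer_indices: &[usize],
) -> Result<(), RepairNotarError> {
    let total: u64 = validator_weights.iter().sum();
    // Without this check, a total of 0 gives a threshold of 0, and an empty
    // signature set would pass the comparison below.
    if validator_weights.is_empty() || total == 0 {
        return Err(RepairNotarError::EmptyValidatorSet);
    }

    // Count each signer once; unknown indices contribute no weight.
    let unique: BTreeSet<usize> = signer_indices.iter().copied().collect();
    let signed: u64 = unique
        .iter()
        .filter_map(|&i| validator_weights.get(i))
        .sum();

    let required = threshold_66(total);
    if signed >= required {
        Ok(())
    } else {
        Err(RepairNotarError::BelowQuorum { signed, required })
    }
}
```

With three validators of weight 1, two distinct signers meet the assumed threshold of 2, while the same signer listed twice does not, because duplicates are counted once.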
mnogoborec
approved these changes
Jun 12, 2026
No description provided.