fix: serialize and confirm nonce usage to avoid gaps and collisions#5
Open
vaab wants to merge 3 commits into
Open
fix: serialize and confirm nonce usage to avoid gaps and collisions#5vaab wants to merge 3 commits into
vaab wants to merge 3 commits into
Conversation
``send_transaction`` previously layered its own pending-nonce tracker (``update_nonce`` / ``_additional_nonce`` / ``hasChangedBlock``) on top of geth's ``pending`` nonce -- which already tracks in-flight transactions. The two trackers double-counted, producing nonce gaps, and with no cross-process lock, concurrent Odoo workers could grab the same nonce. Both led to transactions being acked by geth then evicted without being mined. Rework ``send_transaction`` into a per-account critical section that: - acquires an ``fcntl.flock`` lock keyed on the account (same-host cross-process mutual exclusion), with polled acquisition so contention is logged and bounded (``NonceLockTimeout``); - trusts geth's ``pending`` nonce verbatim (no local arithmetic); - pins the whole serialized chain of sends to a single node, stored in the lock file, to dodge cross-node propagation lag of the node-local pending nonce; the pin stays warm via the file mtime (``NONCE_NODE_PIN_TTL``) and re-elects once idle; - waits until geth acknowledges the transaction (visible in pool or a block) before releasing, logging the wait duration on both success and timeout (``TransactionUnconfirmed``). Remove the now-obsolete ``update_nonce``, ``_additional_nonce``, ``_current_block``, ``hasChangedBlock`` and ``registerCurrentBlock`` (verified: no external callers). Add ``tests/test_nonce.py`` covering nonce trust, gap avoidance, lock mutual exclusion, node pinning (warm reuse / cold re-election, forced endpoint honored only when cold), confirmation, and the lock-wait visibility and timeout paths. Assisted-by: Claude:claude-opus-4-8
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes nonce mismanagement in
send_transactionthat causedtransactions to be acknowledged by
geththen evicted without beingmined (observed in production on the Léman currency, multi-worker Odoo
setup).
Root cause: PyC3L layered its own pending-nonce tracker
(
update_nonce/_additional_nonce/hasChangedBlock) on topof geth's
pendingnonce, which already tracks in-flighttransactions. The two double-counted (producing gaps), and with no
cross-process lock, concurrent workers could grab the same nonce
(collisions).
Changes
send_transactioninto a per-account critical section:fcntl.flocklock keyed on the account (same-host cross-processmutual exclusion), polled so contention is logged and bounded
(
NonceLockTimeout).pendingnonce verbatim (no local arithmetic).lock file) to avoid cross-node propagation lag of the node-local
pending nonce; pin stays warm via file mtime
(
NONCE_NODE_PIN_TTL) and re-elects once idle.before releasing, logging wait duration on success and timeout
(
TransactionUnconfirmed).update_nonce,_additional_nonce,_current_block,hasChangedBlock,registerCurrentBlock(no external callers).
tests/test_nonce.py(11 tests): nonce trust, gap avoidance,lock mutual exclusion, node pinning (warm reuse / cold re-election,
forced endpoint honored only when cold), confirmation, lock-wait
visibility and timeout.
Notes
This PR targets
masterbecause no0.3branch exists on thecanonical repo; it carries the nonce fix plus two pre-existing
0.3commits.