fix: serialize and confirm nonce usage to avoid gaps and collisions #5

Open
vaab wants to merge 3 commits into com-chain:master from 0k:0.3


vaab commented Jun 8, 2026

Collaborator

Summary

Fixes nonce mismanagement in send_transaction that caused
transactions to be acknowledged by geth then evicted without being
mined (observed in production on the Léman currency, multi-worker Odoo
setup).

Root cause: PyC3L layered its own pending-nonce tracker
(update_nonce / _additional_nonce / hasChangedBlock) on top
of geth's pending nonce, which already tracks in-flight
transactions. The two trackers double-counted in-flight transactions
(producing nonce gaps), and with no cross-process lock, concurrent
workers could grab the same nonce (collisions).
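
To make the double-counting concrete, here is a minimal, self-contained simulation. It is not the removed PyC3L code: `FakeNode`, `send_with_local_tracker` and `send_trusting_node` are hypothetical names, and the node model is deliberately simplified (pending nonce = mined count + executable pooled transactions).

```python
class FakeNode:
    """Simplified stand-in for a geth node (illustration only)."""

    def __init__(self, mined=0):
        self.mined = mined    # transactions already included in blocks
        self.pooled = set()   # executable in-flight nonces
        self.queued = set()   # nonces parked behind a gap

    def pending_nonce(self):
        # Modelled on the "pending" transaction count: mined + in-flight.
        return self.mined + len(self.pooled)

    def send(self, nonce):
        expected = self.pending_nonce()
        if nonce == expected:
            self.pooled.add(nonce)
        elif nonce > expected:
            # A gap: the transaction is accepted but cannot be mined yet.
            self.queued.add(nonce)
        else:
            raise ValueError("nonce %d already used" % nonce)


def send_with_local_tracker(node, count):
    """Assumed shape of the old behaviour: add a local counter on top."""
    additional = 0
    used = []
    for _ in range(count):
        # The node already counted the in-flight sends; adding them again
        # skips nonces.
        nonce = node.pending_nonce() + additional
        node.send(nonce)
        additional += 1
        used.append(nonce)
    return used


def send_trusting_node(node, count):
    """New behaviour: use the node's pending nonce verbatim."""
    used = []
    for _ in range(count):
        nonce = node.pending_nonce()
        node.send(nonce)
        used.append(nonce)
    return used


print(send_with_local_tracker(FakeNode(mined=5), 3))  # [5, 7, 8]
print(send_trusting_node(FakeNode(mined=5), 3))       # [5, 6, 7]
```

In the first run nonce 6 is never used, so the transactions sent with nonces 7 and 8 sit behind a gap. This is the situation described above, where a transaction is acknowledged but never mined.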

Changes

  • Rework send_transaction into a per-account critical section:
    • fcntl.flock lock keyed on the account (same-host cross-process
      mutual exclusion), polled so contention is logged and bounded
      (NonceLockTimeout).
    • Trust geth's pending nonce verbatim (no local arithmetic).
    • Pin the serialized chain of sends to a single node (stored in the
      lock file) to avoid cross-node propagation lag of the node-local
      pending nonce; pin stays warm via file mtime
      (NONCE_NODE_PIN_TTL) and re-elects once idle.
    • Wait until geth acknowledges the tx (visible in pool or a block)
      before releasing, logging wait duration on success and timeout
      (TransactionUnconfirmed).
  • Remove the obsolete update_nonce, _additional_nonce,
    _current_block, hasChangedBlock, registerCurrentBlock
    (no external callers).
  • Add tests/test_nonce.py (11 tests): nonce trust, gap avoidance,
    lock mutual exclusion, node pinning (warm reuse / cold re-election,
    forced endpoint honored only when cold), confirmation, lock-wait
    visibility and timeout.
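
The polled per-account lock from the first bullet could look roughly like the sketch below. It only illustrates the technique (non-blocking `fcntl.flock` in a retry loop with a deadline). The helper name `account_lock`, the lock file naming, the default timeout and poll interval, and the body of `NonceLockTimeout` are assumptions, not the code in this PR.

```python
import contextlib
import fcntl
import logging
import os
import tempfile
import time

logger = logging.getLogger(__name__)


class NonceLockTimeout(Exception):
    """Raised when the per-account lock is not acquired before the deadline."""


@contextlib.contextmanager
def account_lock(address, lock_dir=None, timeout=30.0, poll=0.1):
    """Hold an exclusive, same-host lock for one account (hypothetical helper)."""
    lock_dir = lock_dir or tempfile.gettempdir()
    # One lock file per account; the naming scheme is an assumption.
    path = os.path.join(lock_dir, "nonce-%s.lock" % address.lower())
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    start = time.monotonic()
    try:
        while True:
            try:
                # LOCK_NB makes flock fail immediately instead of blocking,
                # so the wait can be measured, logged and bounded.
                fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
                break
            except BlockingIOError:
                waited = time.monotonic() - start
                if waited >= timeout:
                    logger.error("lock on %s not acquired after %.2fs", path, waited)
                    raise NonceLockTimeout(path)
                time.sleep(poll)
        waited = time.monotonic() - start
        if waited >= poll:
            logger.info("waited %.2fs for lock on %s", waited, path)
        yield path
    finally:
        # Closing the descriptor releases the flock.
        os.close(fd)
```

A caller would wrap the whole read-nonce, sign, send, confirm sequence in `with account_lock(address): ...`, so that only one worker on the host handles a given account at a time.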

Notes

This PR targets master because no 0.3 branch exists on the
canonical repo; it carries the nonce fix plus two pre-existing 0.3
commits.

vaab added 3 commits March 26, 2026 14:09
``send_transaction`` previously layered its own pending-nonce tracker
(``update_nonce`` / ``_additional_nonce`` / ``hasChangedBlock``) on top
of geth's ``pending`` nonce -- which already tracks in-flight
transactions.  The two trackers double-counted, producing nonce gaps,
and with no cross-process lock, concurrent Odoo workers could grab the
same nonce.  Both led to transactions being acked by geth then evicted
without being mined.

Rework ``send_transaction`` into a per-account critical section that:

- acquires an ``fcntl.flock`` lock keyed on the account (same-host
  cross-process mutual exclusion), with polled acquisition so
  contention is logged and bounded (``NonceLockTimeout``);
- trusts geth's ``pending`` nonce verbatim (no local arithmetic);
- pins the whole serialized chain of sends to a single node, stored in
  the lock file, to dodge cross-node propagation lag of the node-local
  pending nonce; the pin stays warm via the file mtime
  (``NONCE_NODE_PIN_TTL``) and re-elects once idle;
- waits until geth acknowledges the transaction (visible in pool or a
  block) before releasing, logging the wait duration on both success
  and timeout (``TransactionUnconfirmed``).
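
As an illustration of the pinning bullet, the following sketch stores the chosen node in a file and treats the pin as warm while the file's mtime is younger than a TTL. It is an assumption-laden sketch, not the PR's implementation: `pick_pinned_node`, the `forced` and `now` parameters, and the 60-second value given to `NONCE_NODE_PIN_TTL` are all hypothetical.

```python
import os
import random
import time

NONCE_NODE_PIN_TTL = 60.0  # seconds; the real value is not stated in this PR


def pick_pinned_node(lock_path, nodes, forced=None,
                     ttl=NONCE_NODE_PIN_TTL, now=None):
    """Return the node to send through; call only while holding the lock."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(lock_path)
        with open(lock_path) as fh:
            pinned = fh.read().strip()
    except FileNotFoundError:
        age, pinned = None, ""

    if age is not None and age < ttl and pinned in nodes:
        # Warm pin: keep using the node that saw the previous sends, so its
        # node-local pending nonce already includes them.
        node = pinned
    else:
        # Cold pin: re-elect. A caller-forced endpoint is honored only here.
        node = forced if forced in nodes else random.choice(nodes)

    # Rewriting the file records the pin and refreshes its mtime.
    with open(lock_path, "w") as fh:
        fh.write(node)
    return node
```

Keeping the pin inside the lock file means the lock that serializes the sends also guards the pin, and no separate state store is needed.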

Remove the now-obsolete ``update_nonce``, ``_additional_nonce``,
``_current_block``, ``hasChangedBlock`` and ``registerCurrentBlock``
(verified: no external callers).

Add ``tests/test_nonce.py`` covering nonce trust, gap avoidance, lock
mutual exclusion, node pinning (warm reuse / cold re-election, forced
endpoint honored only when cold), confirmation, and the lock-wait
visibility and timeout paths.
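
The confirmation wait that these tests exercise could be sketched as below. This is a hedged illustration only: `wait_for_ack` and its parameters are hypothetical, `get_transaction` stands for any callable that returns `None` while the node does not know the hash (for instance a wrapper around the `eth_getTransactionByHash` JSON-RPC call, which returns null for unknown hashes), and the body of `TransactionUnconfirmed` is assumed.

```python
import logging
import time

logger = logging.getLogger(__name__)


class TransactionUnconfirmed(Exception):
    """Raised when the node never reports the transaction within the timeout."""


def wait_for_ack(get_transaction, tx_hash, timeout=10.0, poll=0.2):
    """Poll until the node reports tx_hash; return the seconds waited."""
    start = time.monotonic()
    while True:
        if get_transaction(tx_hash) is not None:
            waited = time.monotonic() - start
            logger.info("tx %s acknowledged after %.2fs", tx_hash, waited)
            return waited
        waited = time.monotonic() - start
        if waited >= timeout:
            logger.error("tx %s not visible after %.2fs", tx_hash, waited)
            raise TransactionUnconfirmed(tx_hash)
        time.sleep(poll)
```

Taking the lookup as a callable keeps the loop testable with a fake client, so both the success path and the timeout path can be reached without a real node.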

Assisted-by: Claude:claude-opus-4-8