Skip to content

Fix: Repair unclaimable-payment recovery in PP_Queue_v1#780

Open
marvinkruse wants to merge 1 commit into
devfrom
fix/pp-queue-unclaimable-recovery
Open

Fix: Repair unclaimable-payment recovery in PP_Queue_v1#780
marvinkruse wants to merge 1 commit into
devfrom
fix/pp-queue-unclaimable-recovery

Conversation

@marvinkruse

Copy link
Copy Markdown
Member

Summary

Repairs the unclaimable-payment recovery paths in the queue payment processor. When a direct payment to a recipient fails (e.g. the recipient is blacklisted on a blacklistable ERC20), the processor escrows the tokens inside the module and records them under _unclaimableAmountsForRecipient. Both routes out of that escrow bucket were broken:

  • Wrong transfer primitive (both recovery paths). claimPreviouslyUnclaimable and claimPreviouslyUnclaimableToTreasury released the escrowed tokens with safeTransferFrom(address(this), dest, amount). For a standard ERC20 this spends allowance[address(this)][address(this)], which is never set, so the call reverts. The module already holds the tokens (they were pulled into it on the failed-transfer path), so the correct primitive is safeTransfer(dest, amount).
  • Wrong accounting key (user-claim path). The public claimPreviouslyUnclaimable guarded on the caller's balance (unclaimable(client, token, _msgSender())) but the internal _claimPreviouslyUnclaimable read and deleted the slot of the passed receiver_ argument. Claiming to a replacement receiver therefore passed the guard on the caller's slot while paying out from the replacement's empty slot, transferring nothing and leaving the original balance stranded. The internal function now keys on _msgSender() and uses the argument purely as the payout destination, matching the canonical PP_Simple_v2.

PP_Queue_ManualExecution_v1 inherits these functions unchanged, so both deployed variants are fixed by this change. amountPaid is intentionally not re-called in recovery: the queue already settles client accounting when it escrows the funds, so a second call would double-count.

Context

Reported privately by an external security researcher. No live funds are at risk today ($0 currently sits in the affected escrow bucket); the defect is latent and only bites once a payment lands in the failed-transfer fallback. The fix was independently verified by an agent review and a Codex second opinion.

Changes

  • src/modules/paymentProcessor/PP_Queue_v1.sol: safeTransferFrom(address(this), ...) to safeTransfer(...) in both recovery paths; key the internal user-claim on _msgSender().
  • test/unit/modules/paymentProcessor/PP_Queue_v1.t.sol: three regression tests covering both defects (each fails on the pre-fix code), plus an updated internal-claim test that no longer encodes the buggy behavior.

Deployment Steps

⚠️ Manual deployment steps needed, this is a beacon-backed module upgrade and is not live until the multisig finalizes it on the target network.

This upgrades two separate module beacons: PP_Queue_v1 and PP_Queue_ManualExecution_v1. Run every step once per module (substitute the module name).

  • Step 1: Merge to the target branch and build the implementations.
    forge build
  • Step 2: Deploy the new implementation deterministically. On mainnet the script deploys the implementation and then prints the beacon, implementation, version and governor addresses rather than touching the beacon (the beacon set is multisig-gated). Run once for each module, bumping the minor version.
    forge script script/utils/UpgradeModule.s.sol \
      "run(string,uint)" "PP_Queue_v1" 1 \
      --rpc-url "$RPC_URL" --broadcast
    
    forge script script/utils/UpgradeModule.s.sol \
      "run(string,uint)" "PP_Queue_ManualExecution_v1" 1 \
      --rpc-url "$RPC_URL" --broadcast
    Record the printed beacon, implementation and version (minor, patch) for each module.
  • Step 3: From the community multisig, start the timelocked beacon upgrade on the Governor for each beacon, using the values from Step 2.
    Governor_v1.upgradeBeaconWithTimelock(
      beacon,            // printed in Step 2
      newImplementation, // printed in Step 2
      newMinorVersion,   // printed in Step 2
      newPatchVersion    // printed in Step 2
    )
    
  • Step 4: Wait out the Governor timelock period, then from the same multisig finalize the upgrade for each beacon.
    Governor_v1.triggerUpgradeBeaconWithTimelock(beacon)
    
    If an immediate upgrade is required and authorized, the community multisig can instead call forceUpgradeBeaconAndRestartImplementation(beacon, newImplementation, newMinorVersion, newPatchVersion), which skips the timelock.

After both beacons are upgraded, verify each beacon.implementation() matches the implementation from Step 2 and beacon.version() reports the new minor version.

Test Plan

Pre-merge checklist

  • forge build succeeds
  • forge test --match-contract PP_Queue_v1_Test passes
  • forge test --match-contract PP_Queue_ManualExecution_v1 passes
  • New regression tests fail on the pre-fix code and pass on the fix

Post-deployment verification

  • On a mainnet fork or testnet, drive a payment into the failed-transfer escrow path and confirm claimPreviouslyUnclaimable releases the funds, including to a replacement receiver distinct from the caller.
  • Confirm claimPreviouslyUnclaimableToTreasury sweeps an escrowed balance to the failed-orders treasury.
  • After the multisig upgrade, confirm both beacons report the new implementation and version.

The queue payment processor escrows funds into the module itself when a
direct payment to the recipient fails. Both recovery paths were unable to
release those escrowed funds:

- They released tokens with safeTransferFrom(address(this), ...), which
  requires a self-allowance that is never set, so the call reverts for
  standard ERC20s. The module already holds the tokens, so use safeTransfer.

- The user-claim path guarded on the caller's unclaimable balance but the
  internal _claimPreviouslyUnclaimable keyed on the passed receiver argument.
  Claiming to a replacement receiver therefore read an empty slot and left
  the original balance stranded. The internal function now keys on
  _msgSender() and uses the argument only as the payout destination.

Inherited unchanged by PP_Queue_ManualExecution_v1, so both deployed
variants are fixed. Adds regression tests that fail on the pre-fix code.
@marvinkruse marvinkruse self-assigned this Jun 23, 2026
@marvinkruse

Copy link
Copy Markdown
Member Author

We deployed this to Ethereum mainnet via community-multisig upgrade, so the live fix is in ahead of the merge. Quick rundown of what happened.

Scope ended up smaller than the PR text implies: only PP_Queue_ManualExecution_v1 is actually registered on mainnet (and only on Ethereum, nothing on the other chains). PP_Queue_v1 isn't deployed anywhere, so there was just the one beacon to upgrade. I checked this against the ModuleFactory on all the mainnets, not just the deployer/metadata, so i'm fairly confident on that.

What i did:

Verified after the upgrade that the beacon implementation points to the new impl and that Piku's proxy 0x5A2d08b194E1764b0Ff271C691B6a46fA10F6Fd2 now resolves to the v1.1.0 fixed code.

So the chain side is done, this PR is just to keep the source in sync, can be merged to dev whenever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant