Give each subscription connection its own bootstrap waiter#156
Merged
Conversation
Two establish() calls can run concurrently in SubscriptionConnection — a pending attemptReconnect (state Reconnecting) plus a fresh attach after closeOwned reset the state to Idle. Both bootstraps wrote the single shared bootstrapWaiter field, and the shared onFrame routed each connection's HELLO reply to whichever waiter was set last, so one connection's reply could complete the other's bootstrap: a spurious NotConnected from a valid subscribe and a full connect-timeout stall on the cross-completed one. Move bootstrapWaiter onto Conn and give each Conn its own onFrame closure, so a reply routes to the waiter of the connection it arrived on. The unexpected-error drop now closes the specific connection that produced the error rather than current.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Two
establish()calls can run concurrently inSubscriptionConnection— a pendingattemptReconnect(stateReconnecting) plus a freshattachaftercloseOwnedreset the state toIdle. Both bootstraps wrote the single sharedbootstrapWaiterfield, and the sharedonFramerouted each connection's HELLO reply to whichever waiter was set last. So one connection's reply could complete the other's waiter — a spuriousNotConnectedfrom a valid subscribe, and a full connect-timeout stall on the cross-completed one.Fix: move
bootstrapWaiterontoConnand give eachConnits ownonFrameclosure (factory(frame => onFrame(this, frame), …)), so a reply routes to the waiter of the connection it actually arrived on. This removes the shared routing state by construction. The unexpected-error drop (added in #155) now closes the specific connection that produced the error rather thancurrent.No dedicated test: a faithful reproduction needs two concurrent establishes on real threads with deferred bootstrap replies, and which of the two racing establishes gets cross-completed is nondeterministic — the symptom can't be asserted without a flaky, timing-dependent test. The fix is structural (a shared field becomes per-instance). The full
sage.client.internalsuite (141 tests) passes across all six backends; scalafmt clean.