Fix cluster pub/sub eviction and re-home races#155
Merged
Conversation
…e miscount H3: shard-connection eviction snapshotted empty connections under the lock but closed them outside it, so a concurrent place/reconcile holding a stale connection reference could bind a sink in that window; the close then terminated the just-registered sink while awaitActive returned normally on Closed, recording a phantom placement with no retry. SubscriptionConnection gains closeIfEmpty, which decides emptiness and the Closed transition in one critical section: a racing attach either registers first (connection kept) or observes Closed and fails before registering, so eviction never terminates a sink an in-flight attach just bound. M2: rehomeClassic nulled classicConn on partial failure without closing the connection, which could be Live carrying re-attached subs — duplicate deliveries on two live sockets plus a leaked socket and watchdog. It now calls a new shutdown that closes the socket/watchdog without terminating the manager-owned sinks, so the retry re-homes them onto a fresh connection. M3: fullyPlaced summed per-node set sizes, so a channel double-recorded on two nodes masked another that never landed. It now counts the distinct union of placed channels. A contributing bug: an error reply to SSUBSCRIBE (e.g. MOVED) was misrouted by onFrame as a bootstrap/PONG reply and swallowed, recording an unconfirmed subscribe as placed; an unexpected error frame now drops the connection so the manager re-homes or reconnects. Regression tests in PlacementSpec and SubscriptionConnectionSpec.
535851d to
4fe6ffa
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Three cluster pub/sub concurrency bugs in the shard-subscription manager and its placement ledger.
H3 — eviction racing an in-flight subscribe silently kills the stream (reported healthy)
evictEmptyShardConnssnapshotted empty connections under the lock but closed them outside it. A concurrentplace/reconcileholding a stale connection reference (from an earlierensure) could bind a sink in that window; the close then terminated the just-registered sink (stream ends as if complete) whileawaitActivereturned normally onClosed(cluster attaches don't usefailIfUnconfirmed).Placementrecorded the phantom,fullyPlacedpassed, no retry was scheduled — a dead subscription reported healthy.Fix:
SubscriptionConnection.closeIfEmpty()decides emptiness and theClosedtransition in one critical section. A racing attach either registers first (connection stays non-empty, kept) or observesClosedand throws before registering — so its sink is never bound to the dying connection, andplaceleaves it unplaced to retry onto a fresh one.evictEmptyShardConnsre-checks emptiness throughcloseIfEmptyat close time.M2 —
rehomeClassicleaks the dropped connection on partial failureIt nulled
classicConnand retried without closingconn, which by then can be Live carrying successfully re-attached subs: duplicate deliveries on two live sockets, plus a leaked socket and watchdog.Fix:
SubscriptionConnection.shutdown()closes the socket + watchdog without terminating the manager-owned sinks (so the retry re-attaches them;transport.close()interrupts the reader, so a backpressured sink can't hang the join).rehomeClassiccalls it before retrying.M3 —
fullyPlacedsummed per-node set sizesA channel double-recorded on two nodes (possible because
placecomputes its plan outside the reconcile single-flight, against a stale topology) inflated the sum, masking another channel that never landed — stranded with no retry.Fix: count the distinct union of placed channels. The double-placement then self-heals on the next reconcile.
Contributing bug (also fixed): an error reply to
SSUBSCRIBE(e.g.MOVED) was misrouted byonFrameas a bootstrap/PONG reply and silently swallowed, so an unconfirmed subscribe was recorded as placed. A non-bootstrap error frame now drops the connection (off the reader thread) so the manager re-homes (cluster) or reconnects (standalone). Note: this also changes standalone behavior — an unexpected error on a subscription socket, previously ignored, now triggers a reconnect; a post-bootstrap error there is always abnormal.Tests
PlacementSpec:fullyPlacedcounts distinct channels, so a double-recorded one can't mask an unplaced channel.SubscriptionConnectionSpec:closeIfEmptykeeps a connection holding a sink / closes once empty / rejects a racing attach;shutdowndrops the socket but leaves the sink usable for re-attach; an unexpected error reply drops the connection instead of being swallowed.All six client backend cells compile; all 141
sage.client.internalunit tests pass; scalafmt clean. The M3 and MOVED tests were confirmed to fail against the pre-fix code.