[WebPubSub] Fix Web PubSub client ack race and the live tests#48989
Open
[WebPubSub] Fix Web PubSub client ack race and the live tests#48989
Conversation
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR improves reliability of the Web PubSub Java client by eliminating an ack-handling race in the WebSocket client flow and by stabilizing live-test infrastructure/setup for CI.
Changes:
- Fix ack-based operations to start listening for ack messages before sending the WebSocket operation message.
- Stabilize live tests by improving close/lifecycle synchronization and reusing a single
WebPubSubServiceClientin tests. - Update test resource provisioning to include a Socket.IO Web PubSub resource and required RBAC role assignments for AAD-based live tests.
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.
Show a summary per file
| File | Description |
|---|---|
| sdk/webpubsub/test-resources.bicep | Adds Socket.IO Web PubSub test resource and assigns additional RBAC roles needed for AAD live tests. |
| sdk/webpubsub/azure-messaging-webpubsub/src/test/java/com/azure/messaging/webpubsub/TestUtils.java | Updates default test endpoints to use https. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/TestUtils.java | Adjusts chained credential order for live-test authentication. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/TestBase.java | Reuses a static WebPubSubServiceClient across live tests to reduce repeated setup. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/MockClientTests.java | Adds a regression test for ack ID generation and fast-ack handling via a mock session. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/ClientTests.java | Makes testClientCloseable wait for actual lifecycle events (connected/stopped/disconnected). |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/main/java/com/azure/messaging/webpubsub/client/implementation/websocket/WebSocketClientHandler.java | Hardens composite buffer publishing/release behavior to avoid using invalid buffers. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/main/java/com/azure/messaging/webpubsub/client/WebPubSubAsyncClient.java | Introduces sendMessageAndWaitForAck to subscribe for ack before send, and updates ack ID generation. |
| sdk/webpubsub/azure-messaging-webpubsub-client/CHANGELOG.md | Documents the ack race fix in the changelog. |
…om/azure/messaging/webpubsub/client/implementation/websocket/WebSocketClientHandler.java Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
…-for-java into csy/fix-livetest
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Description
Fix Web PubSub client live-test instability by addressing ack handling in the websocket client and updating the live test environment setup.
Subscribe for ack messages before sending websocket operations, so fast service acks cannot be missed.
Root Cause: The Web PubSub client had a race in ack handling. For ack-based operations, the old flow sent the websocket message first and only subscribed to
ackMessageSinkafterward. If the service returned the ack very quickly, the ack could be emitted before the client started waiting for it. SinceackMessageSinkis not a durable ack cache, that ack was missed and the operation eventually timed out withAcknowledge from the service not received.Customer Impact: Customers could see intermittent
SendMessageFailedExceptionfailures from operations such asjoinGroup,leaveGroup,sendToGroup, orsendEvent, even when the service had already processed the request successfully. During reconnect with automatic group restore, this could also surface as an unexpectedRejoinGroupFailedEvent, making the client appear to fail restoring group membership after reconnect.Fix CI
testClientCloseablewait for actual lifecycle events instead of manually counting down latches.All SDK Contribution checklist:
General Guidelines and Best Practices
Testing Guidelines