
[WebPubSub] Fix Web PubSub client ack race and the live tests #48989

Open
MoChilia wants to merge 17 commits into main from csy/fix-livetest


@MoChilia
Member

Description

Fix Web PubSub client live-test instability by addressing ack handling in the websocket client and updating the live test environment setup.

  • Subscribe for ack messages before sending websocket operations, so fast service acks cannot be missed.

    • Root Cause: The Web PubSub client had a race in ack handling. For ack-based operations, the old flow sent the websocket message first and subscribed to ackMessageSink only afterward. If the service returned the ack very quickly, the ack could be emitted before the client started waiting for it. Because ackMessageSink is not a durable ack cache, that ack was missed and the operation eventually timed out with "Acknowledge from the service not received".

    • Customer Impact: Customers could see intermittent SendMessageFailedException failures from operations such as joinGroup, leaveGroup, sendToGroup, or sendEvent, even when the service had already processed the request successfully. During reconnect with automatic group restore, this could also surface as an unexpected RejoinGroupFailedEvent, making the client appear to fail restoring group membership after reconnect.

  • Fix CI

    • Reuse the Web PubSub service client in live tests to reduce repeated credential/client setup.
    • Provision Web PubSub and Socket.IO test resources with the roles needed by AAD live tests.
    • Make testClientCloseable wait for actual lifecycle events instead of manually counting down latches.
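
To illustrate the ack race described under Root Cause, here is a minimal, self-contained sketch. It is not the client's code: `AckSinkSketch`, `AckRaceSketch`, and the synchronous `send` are hypothetical stand-ins, and the sink is modeled as a plain listener list that, like ackMessageSink, does not cache or replay acks. The "service" acks immediately to represent the fast-ack case.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.LongConsumer;

// Hypothetical stand-in for a non-durable ack sink: an ack reaches only the
// listeners registered at the moment it is emitted; nothing is replayed.
final class AckSinkSketch {
    private final List<LongConsumer> listeners = new CopyOnWriteArrayList<>();

    void subscribe(LongConsumer listener) {
        listeners.add(listener);
    }

    void emit(long ackId) {
        for (LongConsumer listener : listeners) {
            listener.accept(ackId);
        }
    }
}

final class AckRaceSketch {
    // Simulates a service that acks immediately (the "very fast ack" case).
    static void send(AckSinkSketch sink, long ackId) {
        sink.emit(ackId);
    }

    // Old order: send first, subscribe afterward. The ack is emitted before
    // anyone listens, so the future never completes.
    static boolean sendThenSubscribe() {
        AckSinkSketch sink = new AckSinkSketch();
        CompletableFuture<Long> ack = new CompletableFuture<>();
        send(sink, 1L);
        sink.subscribe(id -> { if (id == 1L) ack.complete(id); });
        return ack.isDone();
    }

    // Fixed order: subscribe first, then send. Even an immediate ack is seen.
    static boolean subscribeThenSend() {
        AckSinkSketch sink = new AckSinkSketch();
        CompletableFuture<Long> ack = new CompletableFuture<>();
        sink.subscribe(id -> { if (id == 1L) ack.complete(id); });
        send(sink, 1L);
        return ack.isDone();
    }

    public static void main(String[] args) {
        System.out.println("send-then-subscribe saw ack: " + sendThenSubscribe());  // false
        System.out.println("subscribe-then-send saw ack: " + subscribeThenSend());  // true
    }
}
```

In the first ordering a real client would sit waiting until its ack timeout fires, even though the request was processed; the second ordering closes that window without needing a durable ack cache.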

All SDK Contribution checklist:

  • The pull request does not introduce [breaking changes]
  • CHANGELOG is updated for new features, bug fixes or other significant changes.
  • I have read the contribution guidelines.

General Guidelines and Best Practices

  • Title of the pull request is clear and informative.
  • There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

  • Pull request includes test coverage for the included changes.

Copilot AI review requested due to automatic review settings April 30, 2026 08:03
@MoChilia MoChilia requested review from a team, chenkennt, vicancy and zackliu as code owners April 30, 2026 08:03
Contributor

Copilot AI left a comment


Pull request overview

This PR improves reliability of the Web PubSub Java client by eliminating an ack-handling race in the WebSocket client flow and by stabilizing live-test infrastructure/setup for CI.

Changes:

  • Fix ack-based operations to start listening for ack messages before sending the WebSocket operation message.
  • Stabilize live tests by improving close/lifecycle synchronization and reusing a single WebPubSubServiceClient in tests.
  • Update test resource provisioning to include a Socket.IO Web PubSub resource and required RBAC role assignments for AAD-based live tests.
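
As a rough illustration of the close/lifecycle synchronization point, the sketch below waits on futures that are completed only by the client's own lifecycle callbacks, each with a timeout, rather than having the test count down a latch itself. It is an assumption-laden sketch, not the actual test: `LifecycleClientSketch`, its `onConnected`/`onStopped` registration methods, and the asynchronous `start`/`stop` are hypothetical and do not reflect the real WebPubSubClient API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical client that raises lifecycle events asynchronously.
final class LifecycleClientSketch {
    private final List<Runnable> connectedHandlers = new ArrayList<>();
    private final List<Runnable> stoppedHandlers = new ArrayList<>();

    void onConnected(Runnable handler) {
        connectedHandlers.add(handler);
    }

    void onStopped(Runnable handler) {
        stoppedHandlers.add(handler);
    }

    // Events fire on another thread, as they would for a real connection.
    void start() {
        CompletableFuture.runAsync(() -> connectedHandlers.forEach(Runnable::run));
    }

    void stop() {
        CompletableFuture.runAsync(() -> stoppedHandlers.forEach(Runnable::run));
    }
}

final class LifecycleWaitSketch {
    // Returns true only if both events were actually observed in time.
    static boolean startAndStopObserved() {
        LifecycleClientSketch client = new LifecycleClientSketch();
        CompletableFuture<Void> connected = new CompletableFuture<>();
        CompletableFuture<Void> stopped = new CompletableFuture<>();

        // The futures are completed by the event handlers, never by the test.
        client.onConnected(() -> connected.complete(null));
        client.onStopped(() -> stopped.complete(null));

        try {
            client.start();
            connected.get(5, TimeUnit.SECONDS);  // wait for the real event
            client.stop();
            stopped.get(5, TimeUnit.SECONDS);
            return true;
        } catch (Exception e) {
            return false;  // timeout or interruption: event was not observed
        }
    }

    public static void main(String[] args) {
        System.out.println("lifecycle events observed: " + startAndStopObserved());
    }
}
```

With this shape, a missing event surfaces as a timeout in the test instead of passing silently.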

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| sdk/webpubsub/test-resources.bicep | Adds Socket.IO Web PubSub test resource and assigns additional RBAC roles needed for AAD live tests. |
| sdk/webpubsub/azure-messaging-webpubsub/src/test/java/com/azure/messaging/webpubsub/TestUtils.java | Updates default test endpoints to use https. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/TestUtils.java | Adjusts chained credential order for live-test authentication. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/TestBase.java | Reuses a static WebPubSubServiceClient across live tests to reduce repeated setup. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/MockClientTests.java | Adds a regression test for ack ID generation and fast-ack handling via a mock session. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/test/java/com/azure/messaging/webpubsub/client/ClientTests.java | Makes testClientCloseable wait for actual lifecycle events (connected/stopped/disconnected). |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/main/java/com/azure/messaging/webpubsub/client/implementation/websocket/WebSocketClientHandler.java | Hardens composite buffer publishing/release behavior to avoid using invalid buffers. |
| sdk/webpubsub/azure-messaging-webpubsub-client/src/main/java/com/azure/messaging/webpubsub/client/WebPubSubAsyncClient.java | Introduces sendMessageAndWaitForAck to subscribe for ack before send, and updates ack ID generation. |
| sdk/webpubsub/azure-messaging-webpubsub-client/CHANGELOG.md | Documents the ack race fix in the changelog. |

MoChilia and others added 4 commits April 30, 2026 16:12
…om/azure/messaging/webpubsub/client/implementation/websocket/WebSocketClientHandler.java

Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>

2 participants