Remove macOS Socket.Select workaround that caused Cleanup() to hang #1142

Open
follesoe wants to merge 1 commit into zeromq:master from follesoe:fix/macos-poller-hang

@follesoe

Summary

  • Removes the macOS-specific Socket.Select workaround in Poller.cs that split the call into two separate invocations (one for readList, one for errorList)
  • With an infinite timeout, the second call blocked forever, which prevented the Reaper from processing Stop commands and caused Cleanup() to hang indefinitely
  • The underlying .NET runtime bug (dotnet/corefx#39617) was reported in 2019 and fixed in .NET 9 (2024), but the Cleanup hang indicates the workaround never worked as intended
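
The blocking behaviour described above can be illustrated outside NetMQ. The following is a minimal standalone sketch, not the Poller.cs code: the class and method names are hypothetical, and it passes a finite timeout so that it terminates. It makes a loopback socket readable, runs a read-only Select, then measures how long an error-only Select on the same socket waits.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Net;
using System.Net.Sockets;

// Hypothetical demo class; not part of NetMQ.
public static class SplitSelectDemo
{
    // Returns how long the error-only Select blocked, in milliseconds.
    public static long MeasureErrorOnlySelect(int timeoutMicroseconds)
    {
        using var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(1);

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint!);
        using var server = listener.Accept();

        // Make `server` readable so the first call has something to report.
        client.Send(new byte[] { 1 });

        // First call of the split sequence: read list only.
        var readList = new List<Socket> { server };
        Socket.Select(readList, null, null, timeoutMicroseconds);

        // Second call: error list only. Nothing is wrong with the socket,
        // so this waits for the whole timeout even though data is pending.
        var errorList = new List<Socket> { server };
        var stopwatch = Stopwatch.StartNew();
        Socket.Select(null, null, errorList, timeoutMicroseconds);
        return stopwatch.ElapsedMilliseconds;
    }
}
```

Calling `SplitSelectDemo.MeasureErrorOnlySelect(200_000)` should report a wait of roughly the full 200 ms. With a timeout of -1 the second call would have no deadline at all, which matches the hang described in this PR.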

See #1040 (comment) for a detailed root cause analysis.

Fixes #1040

Test plan

  • Verified build succeeds on all target frameworks (net8.0, net472, netstandard2.1)
  • Verified on macOS (Apple Silicon) with .NET 10 that Cleanup(block: false) no longer hangs

🤖 Generated with Claude Code

The macOS workaround split Socket.Select into two separate calls (one
for readList, one for errorList). When the timeout was infinite (-1),
the second call blocked forever waiting for an error condition that
never occurred, preventing the Reaper from processing Stop commands
and causing Cleanup() to hang indefinitely.

The underlying .NET runtime bug (dotnet/corefx#39617) was reported in
2019 and fixed in .NET 9 (2024). However, the Cleanup hang indicates
the workaround never worked as intended. Now that the underlying issue
is also resolved on broadly available .NET versions for macOS and iOS,
the workaround can be safely removed.

Fixes zeromq#1040

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Copilot AI left a comment

Pull request overview

This PR removes a macOS-specific Socket.Select workaround in the core polling loop that could block indefinitely and prevent NetMQConfig.Cleanup() (and related stop/reaper logic) from completing.

Changes:

  • Removed the macOS RuntimeInformation.IsOSPlatform(OSPlatform.OSX) branching and the two-step Select calls.
  • Simplified the polling loop to a single Socket.Select(readList, null, errorList, timeout) invocation across targets.
  • Dropped the now-unused System.Runtime.InteropServices import.
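
For comparison, a single combined call might look like the sketch below. It is an illustration under assumptions rather than the code in Poller.cs: the class, the method names, and the loopback setup are hypothetical, and the timeout conversion simply mirrors the line quoted in the review context (0 is treated as "wait forever").

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Sockets;

// Hypothetical sketch class; not part of NetMQ.
public static class CombinedSelectSketch
{
    // Socket.Select takes microseconds; -1 means no timeout.
    // Mirrors `timeout != 0 ? timeout * 1000 : -1` from the reviewed code.
    public static int ToSelectTimeout(int timeoutMs) =>
        timeoutMs != 0 ? timeoutMs * 1000 : -1;

    // Returns the number of readable sockets after one combined Select.
    public static int CountReadable(int timeoutMs)
    {
        using var listener = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        listener.Bind(new IPEndPoint(IPAddress.Loopback, 0));
        listener.Listen(1);

        using var client = new Socket(AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp);
        client.Connect(listener.LocalEndPoint!);
        using var server = listener.Accept();

        client.Send(new byte[] { 1 }); // pending data makes `server` readable

        var readList = new List<Socket> { server };
        var errorList = new List<Socket> { server };

        // One call covers both lists, so a ready read wakes the loop
        // even when no error condition ever occurs.
        Socket.Select(readList, null, errorList, ToSelectTimeout(timeoutMs));

        // Select trims each list in place to the sockets that are ready.
        return readList.Count;
    }
}
```

Because both lists are passed in the same call, the wait ends as soon as either condition is met, so there is no second wait that depends solely on an error occurring.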

Comment on lines 280 to 282
    timeout = timeout != 0 ? timeout * 1000 : -1;
#if NETFRAMEWORK
    Socket.Select(readList, null, errorList, timeout);
#else
    if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX))
    {
        // Socket.Select does not work properly on macOS .NET Core when readList and errorList are passed
        // together. To avoid this problem, we call the Select function separately for errorList.
        // Please refer to this issue: https://github.com/dotnet/corefx/issues/39617
        SocketUtility.Select(readList, null, null, timeout);
        SocketUtility.Select(null, null, errorList, timeout);
    }
    else
    {
        Socket.Select(readList, null, errorList, timeout);
    }
#endif
}
Copilot AI Apr 19, 2026

This change fixes a macOS-specific hang, but the repo’s CI doesn’t run tests on macOS, and there’s no regression test that would fail if Socket.Select blocks forever in the poll loop. Consider adding a regression test that exercises Cleanup(block: false) under the same conditions as #1040 and asserts it completes within a bounded time (and/or add a macOS CI job so the test actually runs on the affected platform).
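
One way to express the "completes within a bounded time" assertion is a small helper like the sketch below. The helper and its names are hypothetical and independent of any test framework. In a real regression test the action would be the call under test, for example `() => NetMQConfig.Cleanup(block: false)`, after reproducing the setup from #1040, which is not shown here.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical test helper; not part of NetMQ or its test suite.
public static class BoundedTime
{
    // Runs `action` on a worker thread and reports whether it finished
    // within `limit`. A hung action leaves its worker thread running,
    // which is acceptable for a test that is about to fail anyway.
    public static bool CompletesWithin(Action action, TimeSpan limit)
    {
        Task task = Task.Run(action);
        // Task.Wait(TimeSpan) returns false on timeout and rethrows
        // (as AggregateException) if the action itself threw.
        return task.Wait(limit);
    }
}
```

A test would then assert that `BoundedTime.CompletesWithin(...)` returns true for a generous limit such as a few seconds. Running the call on a separate thread matters here, because a hang on the test thread itself would stall the whole run instead of producing a failure.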


Development

Successfully merging this pull request may close these issues.

NetMQConfig.Cleanup() hangs forever even after dispose

3 participants