Remove macOS Socket.Select workaround that caused Cleanup() to hang#1142
Remove macOS Socket.Select workaround that caused Cleanup() to hang#1142follesoe wants to merge 1 commit intozeromq:masterfrom
Conversation
The macOS workaround split Socket.Select into two separate calls (one for readList, one for errorList). When the timeout was infinite (-1), the second call blocked forever waiting for an error condition that never occurred, preventing the Reaper from processing Stop commands and causing Cleanup() to hang indefinitely. The underlying .NET runtime bug (dotnet/corefx#39617) was reported in 2019 and fixed in .NET 9 (2024). However, the Cleanup hang indicates the workaround never worked as intended. Now that the underlying issue is also resolved on broadly available .NET versions for macOS and iOS, the workaround can be safely removed. Fixes zeromq#1040 Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR removes a macOS-specific Socket.Select workaround in the core polling loop that could block indefinitely and prevent NetMQConfig.Cleanup() (and related stop/reaper logic) from completing.
Changes:
- Removed the macOS
RuntimeInformation.IsOSPlatform(OSPlatform.OSX)branching and the two-stepSelectcalls. - Simplified the polling loop to a single
Socket.Select(readList, null, errorList, timeout)invocation across targets. - Dropped the now-unused
System.Runtime.InteropServicesimport.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| timeout = timeout != 0 ? timeout * 1000 : -1; | ||
| #if NETFRAMEWORK | ||
| Socket.Select(readList, null, errorList, timeout); | ||
| #else | ||
| if (RuntimeInformation.IsOSPlatform(OSPlatform.OSX)) | ||
| { | ||
| // Socket.Select does not work properly on macOS .NET Core when readList and errorList are passed | ||
| // together. To avoid this problem, we call the Select function separately for errorList. | ||
| // Please refer to this issue: https://github.com/dotnet/corefx/issues/39617 | ||
| SocketUtility.Select(readList, null, null, timeout); | ||
| SocketUtility.Select(null, null, errorList, timeout); | ||
| } | ||
| else | ||
| { | ||
| Socket.Select(readList, null, errorList, timeout); | ||
| } | ||
| #endif | ||
| } |
There was a problem hiding this comment.
This change fixes a macOS-specific hang, but the repo’s CI doesn’t run tests on macOS, and there’s no regression test that would fail if Socket.Select blocks forever in the poll loop. Consider adding a regression test that exercises Cleanup(block: false) under the same conditions as #1040 and asserts it completes within a bounded time (and/or add a macOS CI job so the test actually runs on the affected platform).
Summary
Socket.Selectworkaround inPoller.csthat split the call into two separate invocations (one forreadList, one forerrorList)Stopcommands, causingCleanup()to hang indefinitelyCleanuphang indicates the workaround never worked as intendedSee #1040 (comment) for a detailed root cause analysis.
Fixes #1040
Test plan
net8.0,net472,netstandard2.1)Cleanup(block: false)no longer hangs🤖 Generated with Claude Code