Commit c812e77

krisman-at-collabora authored and xdevs23 committed
futex: Implement mechanism to wait on any of several futexes
This is a new futex operation, called FUTEX_WAIT_MULTIPLE, which allows a thread to wait on several futexes at the same time and be awoken by any of them. In a sense, it implements one of the features that was supported by polling on the old FUTEX_FD interface.

The use case lies in the Wine implementation of the Windows NT interface WaitForMultipleObjects. This Windows API function allows a thread to sleep waiting on the first of a set of event sources (mutexes, timers, signals, console input, etc.) to signal. Considering this is a primitive synchronization operation for Windows applications, being able to quickly signal events on the producer side and quickly go to sleep on the consumer side is essential for good performance of those running over Wine.

Wine developers have an implementation that uses eventfd, but it suffers from FD exhaustion (there are applications that go to the order of multi-million FDs) and higher CPU utilization than this new operation.

The futex list is passed as an array of `struct futex_wait_block` (pointer, value, bitset) to the kernel, which will enqueue all of them and sleep if none was already triggered. It returns to userspace a hint of which futex caused the wakeup event, but the hint doesn't guarantee that it is the only futex triggered. Before calling the syscall again, userspace should traverse the list, trying to re-acquire any of the other futexes, to prevent an immediate -EWOULDBLOCK return code from the kernel.

This was tested using three mechanisms:

1) By reimplementing FUTEX_WAIT in terms of FUTEX_WAIT_MULTIPLE and running the unmodified tools/testing/selftests/futex and a full Linux distro on top of this kernel.

2) By an example code that exercises the FUTEX_WAIT_MULTIPLE path on a multi-threaded, event-handling setup.

3) By running the Wine fsync implementation and executing multi-threaded applications, in particular modern games, on top of this implementation.

Changes were tested for the following ABIs: x86_64, i386 and x32.
Support for x32 applications is not implemented, since it would take a major rework, adding a new entry point and splitting the current futex 64 entry point in two, and we can't change the current x32 syscall number without breaking user space compatibility.

Cc: Steven Rostedt <[email protected]>
Cc: Richard Yao <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Co-developed-by: Zebediah Figura <[email protected]>
Signed-off-by: Zebediah Figura <[email protected]>
Co-developed-by: Steven Noonan <[email protected]>
Signed-off-by: Steven Noonan <[email protected]>
Co-developed-by: Pierre-Loup A. Griffais <[email protected]>
Signed-off-by: Pierre-Loup A. Griffais <[email protected]>
Signed-off-by: Gabriel Krisman Bertazi <[email protected]>
[Added compatibility code]
Co-developed-by: André Almeida <[email protected]>
Signed-off-by: André Almeida <[email protected]>

Adjusted for v5.9: Removed `put_futex_key` calls.

Signed-off-by: Simão Gomes Viana <[email protected]>
1 parent ff9472f commit c812e77
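
The scan-then-wait flow described in the commit message can be sketched from the userspace side roughly as follows. This is an illustration, not code from the patch: the `sk_` names are hypothetical, the struct is a local mirror of `struct futex_wait_block`, and the argument order of the `syscall()` call (block array in the `uaddr` slot, block count in the `val` slot) is an assumption, since the kernel/futex.c side is not shown on this page. The op value 13 only means FUTEX_WAIT_MULTIPLE on a kernel carrying this patch; mainline later assigned 13 to FUTEX_LOCK_PI2, so the call should not be issued on an unpatched kernel.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Op values copied from this commit's futex.h hunks; they are only
 * meaningful on a kernel that carries this patch. */
#define SK_FUTEX_WAIT_MULTIPLE         13
#define SK_FUTEX_PRIVATE_FLAG          128
#define SK_FUTEX_WAIT_MULTIPLE_PRIVATE \
    (SK_FUTEX_WAIT_MULTIPLE | SK_FUTEX_PRIVATE_FLAG)

/* Hypothetical userspace mirror of struct futex_wait_block. */
struct sk_wait_block {
    uint32_t *uaddr;   /* address of the futex word */
    uint32_t  val;     /* value the caller expects to find there */
    uint32_t  bitset;  /* bitmask for the optional bitmasked wakeup */
};

/* Return the index of the first futex whose word no longer holds the
 * expected value, or -1 if every word still matches. */
static int sk_first_changed(const struct sk_wait_block *blocks, size_t count)
{
    assert(blocks != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t now = __atomic_load_n(blocks[i].uaddr, __ATOMIC_ACQUIRE);
        if (now != blocks[i].val)
            return (int)i;
    }
    return -1;
}

/* Traverse the list first and enter the kernel only when nothing has
 * fired yet, so the call does not bounce straight back to userspace.
 * ASSUMPTION: the array is passed as uaddr and the count as val.
 * On failure syscall() returns -1 and sets errno. */
static long sk_wait_any(struct sk_wait_block *blocks, uint32_t count,
                        const struct timespec *timeout)
{
    int ready = sk_first_changed(blocks, count);
    if (ready >= 0)
        return ready;
    return syscall(SYS_futex, blocks, SK_FUTEX_WAIT_MULTIPLE_PRIVATE,
                   count, timeout, NULL, 0);
}
```

Because the kernel's return value is only a hint, a caller would typically run `sk_first_changed()` again after every wakeup rather than trusting the returned index alone.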

2 files changed

Lines changed: 370 additions & 2 deletions

include/uapi/linux/futex.h

Lines changed: 20 additions & 0 deletions
@@ -21,6 +21,7 @@
 #define FUTEX_WAKE_BITSET 10
 #define FUTEX_WAIT_REQUEUE_PI 11
 #define FUTEX_CMP_REQUEUE_PI 12
+#define FUTEX_WAIT_MULTIPLE 13

 #define FUTEX_PRIVATE_FLAG 128
 #define FUTEX_CLOCK_REALTIME 256
@@ -40,6 +41,8 @@
      FUTEX_PRIVATE_FLAG)
 #define FUTEX_CMP_REQUEUE_PI_PRIVATE (FUTEX_CMP_REQUEUE_PI | \
      FUTEX_PRIVATE_FLAG)
+#define FUTEX_WAIT_MULTIPLE_PRIVATE (FUTEX_WAIT_MULTIPLE | \
+     FUTEX_PRIVATE_FLAG)

 /*
  * Support for robust futexes: the kernel cleans up held futexes at
@@ -150,4 +153,21 @@ struct robust_list_head {
   (((op & 0xf) << 28) | ((cmp & 0xf) << 24) \
    | ((oparg & 0xfff) << 12) | (cmparg & 0xfff))

+/*
+ * Maximum number of multiple futexes to wait for
+ */
+#define FUTEX_MULTIPLE_MAX_COUNT 128
+
+/**
+ * struct futex_wait_block - Block of futexes to be waited for
+ * @uaddr: User address of the futex
+ * @val: Futex value expected by userspace
+ * @bitset: Bitset for the optional bitmasked wakeup
+ */
+struct futex_wait_block {
+	__u32 __user *uaddr;
+	__u32 val;
+	__u32 bitset;
+};
+
 #endif /* _UAPI_LINUX_FUTEX_H */
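
To make the header additions concrete, here is a small sketch of what they imply for a caller: the private op code is just the base op ORed with the private flag, and each block is a pointer followed by two 32-bit fields. The `sk_` names and the `sk_count_ok()` helper are hypothetical and not part of the patch. The fact that the struct embeds a pointer means its size differs between 32-bit and 64-bit ABIs, which is presumably what the "[Added compatibility code]" note in the commit message addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the hunks above. */
#define SK_FUTEX_WAIT_MULTIPLE         13
#define SK_FUTEX_PRIVATE_FLAG          128
#define SK_FUTEX_WAIT_MULTIPLE_PRIVATE \
    (SK_FUTEX_WAIT_MULTIPLE | SK_FUTEX_PRIVATE_FLAG)
#define SK_FUTEX_MULTIPLE_MAX_COUNT    128

/* Hypothetical userspace mirror of struct futex_wait_block. */
struct sk_wait_block {
    uint32_t *uaddr;
    uint32_t  val;
    uint32_t  bitset;
};

/* The two 32-bit fields sit back to back right after the pointer. */
_Static_assert(offsetof(struct sk_wait_block, val) == sizeof(uint32_t *),
               "val follows the pointer");
_Static_assert(offsetof(struct sk_wait_block, bitset) ==
                   offsetof(struct sk_wait_block, val) + sizeof(uint32_t),
               "bitset follows val");

/* Hypothetical pre-check: refuse lists longer than the documented maximum
 * before handing them to the kernel. */
static int sk_count_ok(size_t count)
{
    return count <= SK_FUTEX_MULTIPLE_MAX_COUNT;
}
```

With these values, `SK_FUTEX_WAIT_MULTIPLE_PRIVATE` evaluates to 13 | 128 = 141.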
