Commit b894479
libretro-common: add retro_spsc.h portable lock-free SPSC byte queue
A lock-free single-producer / single-consumer byte queue built on
retro_atomic.h. It is wired into the standard libretro-common build
(Makefile.common, griffin) so it ships with every RetroArch
configuration, but it has no production callers yet -- this commit
lands the primitive only. Subsequent commits can convert hand-rolled
SPSC patterns (audio/drivers/coreaudio.c's atomic_size_t ring,
audio/drivers/opensl.c's __sync_*-guarded buffered_blocks counter)
to use it.
Design
------
Two-cursor (head + tail) Lamport / Disruptor style, NOT the single
shared count. Each cursor is written by exactly one thread and
acquire-loaded by the other. The producer publishes new data with
a release-store on head; the consumer publishes consumed bytes
with a release-store on tail. Neither thread issues atomic RMW
operations on the hot path -- only retro_atomic_load_acquire_size
and retro_atomic_store_release_size, which retro_atomic.h provides
across all 7 backends.
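The two-cursor protocol described above can be sketched as follows. This
is an illustration only, not the code in retro_spsc.c: it uses C11
<stdatomic.h> directly instead of the retro_atomic.h wrappers, and every
sketch_* name and the 8-byte capacity are assumptions made for the
example. A byte loop stands in for the real copy to keep it short.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_CAP 8u  /* hypothetical capacity; must be a power of two */

typedef struct
{
   atomic_size_t head;            /* stored only by the producer */
   atomic_size_t tail;            /* stored only by the consumer */
   unsigned char buf[SKETCH_CAP];
} sketch_spsc_t;

static void sketch_init(sketch_spsc_t *q)
{
   assert(q != NULL);
   atomic_init(&q->head, 0);
   atomic_init(&q->tail, 0);
}

/* Producer side: all-or-nothing push, false when there is no room. */
static bool sketch_push(sketch_spsc_t *q, const void *src, size_t len)
{
   size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
   /* Acquire pairs with the consumer's release-store on tail. */
   size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
   size_t i;

   if (len > SKETCH_CAP - (head - tail))
      return false;

   for (i = 0; i < len; i++)
      q->buf[(head + i) & (SKETCH_CAP - 1)] = ((const unsigned char*)src)[i];

   /* Publish: the bytes written above become visible before the new head. */
   atomic_store_explicit(&q->head, head + len, memory_order_release);
   return true;
}

/* Consumer side: all-or-nothing pop, false when too few bytes are queued. */
static bool sketch_pop(sketch_spsc_t *q, void *dst, size_t len)
{
   size_t tail = atomic_load_explicit(&q->tail, memory_order_acquire);
   /* Acquire pairs with the producer's release-store on head. */
   size_t head = atomic_load_explicit(&q->head, memory_order_acquire);
   size_t i;

   if (len > head - tail)
      return false;

   for (i = 0; i < len; i++)
      ((unsigned char*)dst)[i] = q->buf[(tail + i) & (SKETCH_CAP - 1)];

   /* Release the consumed space back to the producer. */
   atomic_store_explicit(&q->tail, tail + len, memory_order_release);
   return true;
}
```

Note that neither function performs a read-modify-write: each thread
only loads both cursors and stores its own.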
Capacity is power-of-2 (rounded up at init), masked with
capacity - 1. Wraparound uses the standard two-memcpy pattern.
size_t modular arithmetic on (head - tail) is well-defined as long
as the difference stays within capacity, which the producer
enforces by checking write_avail before each push.
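A minimal sketch of the three pieces of arithmetic mentioned above:
rounding the capacity up to a power of two, the two-memcpy wraparound
write, and the unsigned (head - tail) difference. The function names are
hypothetical and overflow of the round-up is deliberately not handled;
the real implementation may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Round n up to the next power of two (n >= 1; overflow not handled). */
static size_t sketch_round_up_pow2(size_t n)
{
   size_t p = 1;
   assert(n >= 1);
   while (p < n)
      p <<= 1;
   return p;
}

/* Bytes in flight. Cursors are free-running counters, so the unsigned
 * subtraction stays correct even after head has wrapped past SIZE_MAX. */
static size_t sketch_used(size_t head, size_t tail)
{
   return head - tail;
}

/* Copy len bytes into a ring of cap bytes (cap a power of two) starting
 * at free-running cursor pos. Only pos & (cap - 1) indexes the buffer. */
static void sketch_ring_write(unsigned char *ring, size_t cap,
      size_t pos, const unsigned char *src, size_t len)
{
   size_t off   = pos & (cap - 1);
   size_t first = cap - off;      /* bytes left before the buffer end */

   assert(len <= cap);
   if (first > len)
      first = len;

   memcpy(ring + off, src, first);          /* up to the buffer end */
   memcpy(ring, src + first, len - first);  /* remainder; 0 bytes if no wrap */
}
```

For example, writing 4 bytes at cursor 6 into an 8-byte ring places the
first two bytes at offsets 6 and 7 and the last two at offsets 0 and 1.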
head and tail are placed on separate cache lines via explicit
char-array padding (RETRO_SPSC_CACHE_LINE, default 64). Without
this, the producer's release-store on head would invalidate the
consumer's tail cache line and vice versa, halving throughput on
contended SMP. Padding is a performance hint; correctness does
not depend on it. The padding macros are guarded against
underflow if RETRO_SPSC_CACHE_LINE is misconfigured to be smaller
than the prefix fields.
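One way such padding could be laid out is sketched below. The SKETCH_*
names and the fall-back-to-one-byte guard are assumptions for
illustration and are not taken from retro_spsc.h; the sketch also relies
on C11 static_assert, which the real header may not use.

```c
#include <assert.h>
#include <stddef.h>

#ifndef SKETCH_CACHE_LINE
#define SKETCH_CACHE_LINE 64  /* hypothetical stand-in for RETRO_SPSC_CACHE_LINE */
#endif

/* Pad a field of `used` bytes out to one cache line. If the configured
 * line size is not larger than the field, fall back to 1 byte so the
 * array size can never underflow. */
#define SKETCH_PAD(used) \
   ((SKETCH_CACHE_LINE > (used)) ? (SKETCH_CACHE_LINE - (used)) : 1)

typedef struct
{
   size_t head;                              /* producer-owned cursor */
   char   pad0[SKETCH_PAD(sizeof(size_t))];
   size_t tail;                              /* consumer-owned cursor */
   char   pad1[SKETCH_PAD(sizeof(size_t))];
} sketch_cursors_t;

/* Cursors at least one line apart can never share a cache line. */
static_assert(offsetof(sketch_cursors_t, tail)
      - offsetof(sketch_cursors_t, head) >= SKETCH_CACHE_LINE,
      "head and tail must be at least one cache line apart");
```

With a misconfigured line size the guard only costs the performance
benefit; the struct still compiles and the queue logic is unaffected.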
Comparison with the existing fifo_queue_t:
- fifo_queue takes an slock_t internally; it's MPMC-safe but
every push/pop costs a mutex.
- retro_spsc is lock-free but limited to one producer / one
consumer. Use it in code paths where (a) producer/consumer
counts are fixed at one each (audio render thread <-> audio
submission, video thread <-> task queue, etc.) and (b) the
fifo_queue lock contention measurably matters.
Comparison with audio/drivers/coreaudio.c's hand-rolled ring:
- coreaudio uses a single shared atomic_size_t `filled` count
rather than two cursors; producer fetch_add's it, consumer
fetch_sub's it. Correct, but optimises for "give me
write_avail" being a single atomic load (audio drivers do
that query every fill) at the cost of RMW on every push and
pop.
- retro_spsc inverts that trade-off: write_avail / read_avail
each cost two atomic loads, but push and pop only do
load_acquire + store_release. Better for the general case
where push/pop count vastly exceeds availability checks.
- retro_spsc also pads head and tail; coreaudio doesn't (its
single shared atomic doesn't suffer false sharing).
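For contrast, the single-shared-count style can be sketched as below.
This is NOT the code from audio/drivers/coreaudio.c; it is a
hypothetical reduction of the pattern, written with C11 <stdatomic.h>,
and all sketch_count_* names are invented for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_CAP 8u  /* hypothetical capacity; power of two */

typedef struct
{
   atomic_size_t filled;          /* shared: both threads RMW it */
   size_t        write_pos;       /* touched only by the producer */
   size_t        read_pos;        /* touched only by the consumer */
   unsigned char buf[SKETCH_CAP];
} sketch_count_ring_t;

static void sketch_count_init(sketch_count_ring_t *r)
{
   assert(r != NULL);
   atomic_init(&r->filled, 0);
   r->write_pos = 0;
   r->read_pos  = 0;
}

/* The cheap query: a single atomic load. */
static size_t sketch_count_write_avail(sketch_count_ring_t *r)
{
   return SKETCH_CAP - atomic_load_explicit(&r->filled, memory_order_acquire);
}

/* Producer: copy up to len bytes, then one RMW to publish them. */
static size_t sketch_count_write(sketch_count_ring_t *r,
      const unsigned char *src, size_t len)
{
   size_t avail = sketch_count_write_avail(r);
   size_t i;

   if (len > avail)
      len = avail;
   for (i = 0; i < len; i++)
      r->buf[(r->write_pos + i) & (SKETCH_CAP - 1)] = src[i];
   r->write_pos += len;
   atomic_fetch_add_explicit(&r->filled, len, memory_order_release);
   return len;
}

/* Consumer: copy up to len bytes out, then one RMW to free the space. */
static size_t sketch_count_read(sketch_count_ring_t *r,
      unsigned char *dst, size_t len)
{
   size_t filled = atomic_load_explicit(&r->filled, memory_order_acquire);
   size_t i;

   if (len > filled)
      len = filled;
   for (i = 0; i < len; i++)
      dst[i] = r->buf[(r->read_pos + i) & (SKETCH_CAP - 1)];
   r->read_pos += len;
   atomic_fetch_sub_explicit(&r->filled, len, memory_order_release);
   return len;
}
```

Here the availability query is one load, but every write and read ends
in a fetch_add or fetch_sub on the same shared word, which is the cost
the two-cursor design avoids.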
Build wiring
------------
Build requirements:
- retro_atomic.h (header-only, always available)
- <stddef.h>, <stdint.h>, <stdbool.h>, <stdlib.h>, <string.h>
- That's it. No HAVE_THREADS gate is needed for compilation:
retro_spsc.c builds on every target retro_atomic.h supports,
including pre-thread console targets (PSP, original Wii).
On a single-threaded build it just sits there as dead code.
Wired in through:
- Makefile.common adds retro_spsc.o next to fifo_queue.o under
the unconditional libretro-common object list.
- griffin.c #includes retro_spsc.c next to fifo_queue.c (Apple
builds and console unity builds pick it up via griffin).
Test
----
libretro-common/samples/queues/retro_spsc_test/ exercises the
queue under producer/consumer concurrency:
- Property checks (single-threaded): power-of-2 round-up at init,
fresh-queue avails, push/pop content round-trip, peek does not
advance, wraparound across the buffer end.
- SPSC stress (HAVE_THREADS): producer pushes 10M sequential
32-bit tokens through a 4 KB buffer; consumer reads and
verifies each token matches the expected sequence. The
small buffer relative to the message volume forces heavy
interleaving between producer and consumer, exercising the
wraparound path on every iteration.
The stress harness has no per-iteration handshake -- both threads
spin on avail() in tight loops -- so the test is sensitive to
real synchronisation bugs. Verified: deliberately moving the
producer's release-store on head BEFORE the buffer memcpys (so a
consumer observing the new head can read uninitialised bytes)
makes the stress fail with hundreds of mismatched tokens out of
10M, even on x86 TSO without TSan. The same bug under TSan with
halt_on_error=1 exits 66 with "data race in retro_spsc_read_avail".
CI
--
.github/workflows/Linux-libretro-common-samples.yml:
- retro_spsc_test added to RUN_TARGETS so the auto-discovery
job builds and runs it under SANITIZER=address,undefined
(the workflow default).
- A new step builds and runs the test under Clang +
SANITIZER=thread with TSAN_OPTIONS=halt_on_error=1, mirroring
the retro_atomic_test TSan lane. TSan is the strict
validator for this primitive specifically: missing-barrier
regressions show up as data races even on x86 TSO, where the
hardware would otherwise hide them at runtime.
Verified locally:
- gcc -O0/-O2/-O3 -Wall -Werror, x86_64
- clang -O2, x86_64
- g++ -xc++ -std=c++11 (CXX_BUILD-style)
- aarch64-linux-gnu-gcc + qemu-user, 10M tokens clean,
objdump shows real ldar/stlr at the cursor accesses
- arm-linux-gnueabihf-gcc compile-clean
- Forced backends: C11 stdatomic, GCC __sync_*, volatile
fallback (volatile fallback correct only on x86 TSO, same
contract as every other retro_atomic.h caller)
- TSan halt_on_error=1: clean run, exit 0
- TSan halt_on_error=1 on bug-injected build: exit 66
- ASan/UBSan: clean
- Bug-injected without sanitizer: 815 mismatches / 10M tokens
Not verified on real hardware:
- MSVC ARM64 (inherits retro_atomic.h's untested-on-hardware
state for that backend)
- Real PowerPC SMP (Wii U, Xbox 360)
1 parent b84cdac commit b894479
7 files changed
Lines changed: 758 additions & 0 deletions
File tree
- .github/workflows
- griffin
- libretro-common
- include
- queues
- samples/queues/retro_spsc_test