Commit 75ea1e4

libretro-common: add retro_atomic.h portable atomics primitive
A header-only API exposing acquire/release atomic loads, stores and acq_rel fetch_add/fetch_sub for int and size_t, with a backend cascade that picks the best primitive each toolchain offers:

  1. C11 <stdatomic.h> - modern GCC/Clang/MSVC at -std=c11
  2. C++11 <atomic> - C++ TUs at -std=c++11 or _MSVC_LANG >= 201103L
  3. GCC __atomic_* - GCC 4.7+ / Clang 3.1+ (Clang impersonates GCC 4.2 in __GNUC__, so the gate uses defined(__clang__) || version check to avoid falling through to __sync)
  4. MSVC Win32 Interlocked* - VS2003+, OG Xbox, Xbox 360 XDK; on ARM/ARM64 the plain forms lack barriers (PostgreSQL hit this on Win11/ARM64 in 2025), so RMWs are bracketed with __dmb on those targets
  5. Apple OSAtomic* - PPC / pre-10.7 fallback
  6. GCC __sync_* - GCC 4.1-4.6
  7. volatile fallback - last resort, single-core / x86 TSO only; emits a #warning unless suppressed

Capability flags exposed to callers:

  HAVE_RETRO_ATOMIC - always defined after include
  RETRO_ATOMIC_LOCK_FREE - defined iff a real backend selected (NOT for the volatile fallback)
  RETRO_ATOMIC_BACKEND_NAME - string literal, for diagnostics
  RETRO_ATOMIC_REQUIRE_LOCK_FREE - caller-side opt-in: setting this before include turns the volatile fallback into a #error

No active TU includes the header yet; it is the foundation for a future SPSC fifo primitive and consolidates the hand-rolled atomic shims currently scattered across coreaudio*.c/m, xaudio.c, mmdevice_common.c, opensl.c, and gfx_thumbnail.c.

Sample: libretro-common/samples/atomic/retro_atomic_test/

Single-threaded property checks of every macro plus a 1M-iteration SPSC stress test (when HAVE_THREADS) using rthreads sthread_create. Compile-time #error checks assert that every named real backend implies RETRO_ATOMIC_LOCK_FREE and that the volatile fallback never sets it.

CI: Linux-libretro-common-samples.yml

  - retro_atomic_test added to the native run allowlist (gcc, with the workflow's default ASan/UBSan)
  - new step: C++ smoke test compiled with both g++ and clang++ at -std=c++11/14/17 against the in-tree header
  - new step: retro_atomic_test built with clang -fsanitize=thread and run with TSAN_OPTIONS=halt_on_error=1; TSan instruments every atomic load/store and would flag a missing barrier in the SPSC stress that x86 TSO would otherwise hide
  - new job: retro-atomic-cross, matrix [aarch64, armv7], cross-compiles with gcc-aarch64-linux-gnu / gcc-arm-linux-gnueabihf, runs the binary under qemu-user-static, and grep-inspects the emitted asm for ldar/stlr/ldadd*_acq_rel (aarch64) or dmb/ldrex/strex (armv7); the inspect step exits 1 if no barrier mnemonics are found, which catches a silent regression to the volatile fallback

Verified locally:

  - x86_64 native (gcc, clang) + ASan/UBSan + TSan
  - AArch64 cross-compile + qemu, asm shows ldar/stlr/ldadd*_acq_rel
  - ARMv7 cross-compile + qemu, asm shows dmb/ldrex/strex
  - MIPSel cross-compile + qemu, asm shows ll/sc/sync
  - C++11/14/17/20 native (g++ and clang++)
  - C++98 (g++ and clang++) correctly falls through to GCC __atomic_*
  - All 9 forced-backend shape tests (C11, GCC __atomic_*, __sync_*, volatile, MSVC x86/x64/ARM64 mocked, Apple 32/64 mocked) plus forced C++11

Not verified on real hardware:

  - MSVC ARM64 (correct by construction from MS docs and PostgreSQL precedent; awaits Windows-on-ARM CI)
  - Real PowerPC SMP (Wii U, Xbox 360); reasoned from devkitPPC GCC and Microsoft's Xbox 360 lockless guide
  - __sync_* and Apple OSAtomic backends (dead code on every current target; selection requires GCC < 4.7 or pre-10.7 Apple)
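
The backend cascade and capability flags described in the commit message can be pictured as a chain of preprocessor tests. What follows is a minimal sketch under stated assumptions, not the contents of retro_atomic.h: every SKETCH_* macro and both functions are hypothetical names, only three of the seven tiers plus the fallback are modelled (the C++11, MSVC and Apple tiers are omitted), and the real gates may differ. It mainly illustrates why the Clang test has to sit next to the GCC version comparison.

```c
/* Hypothetical sketch of a backend cascade; this is NOT retro_atomic.h. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L \
    && !defined(__STDC_NO_ATOMICS__)
#  define SKETCH_BACKEND_NAME "C11 stdatomic"
#  define SKETCH_LOCK_FREE 1
#elif defined(__clang__) \
    || (defined(__GNUC__) \
        && (__GNUC__ > 4 || (__GNUC__ == 4 && __GNUC_MINOR__ >= 7)))
/* Clang reports __GNUC__ == 4 and __GNUC_MINOR__ == 2, so the version
 * comparison alone would reject it and fall through to the next tier. */
#  define SKETCH_BACKEND_NAME "GCC __atomic"
#  define SKETCH_LOCK_FREE 1
#elif defined(__GNUC__)
#  define SKETCH_BACKEND_NAME "GCC __sync"
#  define SKETCH_LOCK_FREE 1
#else
/* Last resort: no real barriers, so the lock-free flag stays undefined. */
#  define SKETCH_BACKEND_NAME "volatile fallback"
#endif

/* Caller-side opt-in (hypothetical name): refuse to build on the fallback. */
#if defined(SKETCH_REQUIRE_LOCK_FREE) && !defined(SKETCH_LOCK_FREE)
#  error "no lock-free atomic backend available"
#endif

/* Name of the tier selected at compile time, for diagnostics. */
static const char *sketch_backend_name(void)
{
   return SKETCH_BACKEND_NAME;
}

/* 1 if a real backend was selected, 0 on the volatile fallback. */
static int sketch_is_lock_free(void)
{
#ifdef SKETCH_LOCK_FREE
   return 1;
#else
   return 0;
#endif
}
```

Defining the hypothetical SKETCH_REQUIRE_LOCK_FREE before this block mirrors the opt-in described in the message: the fallback branch then fails the build instead of compiling silently.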
1 parent 677454f commit 75ea1e4

4 files changed

Lines changed: 1300 additions & 1 deletion

.github/workflows/Linux-libretro-common-samples.yml

Lines changed: 193 additions & 1 deletion
@@ -25,7 +25,7 @@ jobs:
       - name: Install dependencies
         run: |
           sudo apt-get update -y
-          sudo apt-get install -y build-essential zlib1g-dev
+          sudo apt-get install -y build-essential zlib1g-dev clang

       - name: Checkout
         uses: actions/checkout@v3
@@ -79,6 +79,7 @@ jobs:
             word_wrap_overflow_test
             task_queue_title_error_test
             tpool_wait_test
+            retro_atomic_test
           )

           # Per-binary run command (overrides ./<binary> if present).
@@ -196,3 +197,194 @@ jobs:
           if [[ $fails -gt 0 ]]; then
             exit 1
           fi
+
+      - name: Compile-test retro_atomic.h from a C++11 TU
+        shell: bash
+        working-directory: libretro-common
+        run: |
+          # The C++11 backend in retro_atomic.h is a fresh code path that
+          # none of the C samples above exercise. Compile a tiny inline
+          # C++11 TU against the in-tree header to catch regressions like
+          # accidentally re-introducing an extern "C" wrapper around the
+          # std::atomic include, or breaking the __cplusplus / _MSVC_LANG
+          # gate. This step is build-and-run, single-threaded only -- the
+          # behavioural SPSC stress is already covered by the C test
+          # binary above on this same host, and the C++11 backend bottoms
+          # out through the same libstdc++ __atomic_* builtins.
+          set -u
+          set -o pipefail
+
+          tmpdir=$(mktemp -d)
+          cat > "$tmpdir/cxx_smoke.cpp" <<'EOF'
+          #include <cstdio>
+          #include <cstddef>
+          #include <retro_atomic.h>
+
+          #if !defined(HAVE_RETRO_ATOMIC) || !defined(RETRO_ATOMIC_LOCK_FREE)
+          # error "retro_atomic.h: capability flags not set on a C++11 host"
+          #endif
+
+          int main(void) {
+            retro_atomic_int_t ai; retro_atomic_int_init(&ai, 0);
+            retro_atomic_size_t as; retro_atomic_size_init(&as, 0);
+
+            retro_atomic_store_release_int(&ai, 42);
+            retro_atomic_store_release_size(&as, (std::size_t)42);
+
+            int li = retro_atomic_load_acquire_int(&ai);
+            int ls = (int)retro_atomic_load_acquire_size(&as);
+
+            int pi = retro_atomic_fetch_add_int(&ai, 1);
+            int ps = (int)retro_atomic_fetch_add_size(&as, 1);
+
+            retro_atomic_inc_int(&ai);
+            retro_atomic_dec_size(&as);
+
+            int qi = retro_atomic_load_acquire_int(&ai);
+            int qs = (int)retro_atomic_load_acquire_size(&as);
+
+            std::printf("backend: %s\n", RETRO_ATOMIC_BACKEND_NAME);
+
+            bool ok = (li == 42) && (ls == 42)
+                   && (pi == 42) && (ps == 42)
+                   && (qi == 44) && (qs == 42);
+            std::puts(ok ? "ALL OK" : "FAIL");
+            return ok ? 0 : 1;
+          }
+          EOF
+
+          for cxx in g++ clang++; do
+            for std in c++11 c++14 c++17; do
+              echo "==> compile-test with $cxx -std=$std"
+              $cxx -std=$std -Wall -Wextra -pedantic -O2 \
+                -I include \
+                "$tmpdir/cxx_smoke.cpp" \
+                -o "$tmpdir/cxx_smoke" \
+                || { echo "::error title=C++ compile failed::$cxx -std=$std"; exit 1; }
+              "$tmpdir/cxx_smoke" \
+                || { echo "::error title=C++ smoke failed::$cxx -std=$std"; exit 1; }
+            done
+          done
+
+          rm -rf "$tmpdir"
+
+      - name: Run retro_atomic_test under Clang + ThreadSanitizer
+        shell: bash
+        working-directory: libretro-common/samples/atomic/retro_atomic_test
+        run: |
+          # The native samples job above runs with GCC and ASan/UBSan.
+          # Clang is the toolchain on every Apple platform, Android NDK
+          # (since r18), Emscripten, and PS4-ORBIS, so a Clang lane is
+          # not optional coverage. ThreadSanitizer is the strict
+          # validator for this test in particular: it instruments every
+          # atomic load and store and would flag a missing acquire /
+          # release barrier as a race in the 1M-iteration SPSC stress
+          # (a class of bug that x86 TSO would otherwise hide on the
+          # native runner).
+          set -u
+          set -o pipefail
+
+          make clean
+          CC=clang make all SANITIZER=thread
+
+          TSAN_OPTIONS=halt_on_error=1 ./retro_atomic_test
+
+  # Cross-architecture validation lane for retro_atomic_test.
+  #
+  # The samples job above runs on x86_64, which is a strongly-ordered
+  # (TSO) architecture. retro_atomic.h's contract is that acquire-load
+  # / release-store / acq_rel-RMW emit real barriers on weakly-ordered
+  # SMP targets (ARM, AArch64, PowerPC, MIPS). An x86_64 host run
+  # cannot exercise that property, because TSO masks reordering bugs
+  # at the hardware level even when the macros emit no barriers at all.
+  #
+  # This job cross-compiles retro_atomic_test for AArch64 and ARMv7 and
+  # runs the binary under qemu-user-static. qemu-user emulates the
+  # weak memory model faithfully enough to expose missing-barrier bugs
+  # in the SPSC stress test, and is cheap enough to run on every push.
+  #
+  # We deliberately do NOT run the full samples sweep here -- the rest
+  # of the samples don't have architecture-dependent codegen that
+  # warrants the extra CI time. retro_atomic_test is the one that
+  # benefits from cross-arch coverage.
+  #
+  # Real ARM hardware still beats qemu (see e.g. PostgreSQL's 2025
+  # Win11/ARM64 atomic ordering bug, found only on real silicon),
+  # but qemu catches most categorical errors and is much cheaper than
+  # provisioning ARM runners.
+  retro-atomic-cross:
+    name: Cross-arch retro_atomic_test (${{ matrix.arch }})
+    runs-on: ubuntu-latest
+    timeout-minutes: 10
+    strategy:
+      fail-fast: false
+      matrix:
+        include:
+          - arch: aarch64
+            cc: aarch64-linux-gnu-gcc
+            apt_pkgs: gcc-aarch64-linux-gnu
+            qemu: qemu-aarch64-static
+            sysroot: /usr/aarch64-linux-gnu
+          - arch: armv7
+            cc: arm-linux-gnueabihf-gcc
+            apt_pkgs: gcc-arm-linux-gnueabihf
+            qemu: qemu-arm-static
+            sysroot: /usr/arm-linux-gnueabihf
+
+    steps:
+      - name: Install dependencies
+        run: |
+          sudo apt-get update -y
+          sudo apt-get install -y build-essential ${{ matrix.apt_pkgs }} qemu-user-static
+
+      - name: Checkout
+        uses: actions/checkout@v3
+
+      - name: Build retro_atomic_test for ${{ matrix.arch }}
+        working-directory: libretro-common/samples/atomic/retro_atomic_test
+        run: |
+          set -u
+          set -o pipefail
+          make clean
+          CC=${{ matrix.cc }} make all
+
+      - name: Run retro_atomic_test under qemu-user
+        working-directory: libretro-common/samples/atomic/retro_atomic_test
+        run: |
+          set -u
+          set -o pipefail
+          ${{ matrix.qemu }} -L ${{ matrix.sysroot }} ./retro_atomic_test
+
+      - name: Inspect emitted atomic instructions
+        working-directory: libretro-common/samples/atomic/retro_atomic_test
+        run: |
+          set -u
+          set -o pipefail
+          # Spot-check the codegen. If retro_atomic.h were silently
+          # falling through to a no-barrier backend on this arch, the
+          # asm would be conspicuously missing acquire/release
+          # instructions. This is a cheap sanity check on top of the
+          # behavioural SPSC test above.
+          ${{ matrix.cc }} -O2 -S \
+            -I../../../include -DHAVE_THREADS \
+            retro_atomic_test.c -o /tmp/retro_atomic_test.s
+          echo
+          echo '== Unique barrier-emitting mnemonics =='
+          case "${{ matrix.arch }}" in
+            aarch64)
+              # Expect: ldar, stlr, and __aarch64_ldadd*_acq_rel libcalls
+              # (or inline ldaddal LSE on +lse builds).
+              pattern='\b(ldar|stlr|ldax|stlx|dmb|ldadd[a-z0-9_]*|swp[a-z0-9_]*|__aarch64_(ldadd|swp)[a-z0-9_]*acq_rel)\b'
+              ;;
+            armv7)
+              # Expect: dmb (data memory barrier) and ldrex/strex pairs.
+              pattern='\b(dmb|ldrex|strex|ldrexb|strexb|ldrexh|strexh)\b'
+              ;;
+          esac
+          mnemonics=$(grep -oE "$pattern" /tmp/retro_atomic_test.s | sort -u)
+          echo "$mnemonics"
+          if [[ -z "$mnemonics" ]]; then
+            echo
+            echo '::error title=No barrier instructions emitted::retro_atomic_test.s contains no acquire/release/barrier mnemonics for ${{ matrix.arch }}; retro_atomic.h may have fallen through to a no-barrier backend.'
+            exit 1
+          fi
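
The acquire-load / release-store contract that the cross-arch job is meant to protect is easiest to see in a single-producer, single-consumer ring. The source of retro_atomic_test.c is not part of this diff, so the following is only an illustrative sketch: it uses C11 <stdatomic.h> directly rather than the retro_atomic.h macros, and spsc_ring_t, spsc_init, spsc_push, spsc_pop and SPSC_CAP are all hypothetical names. On a weakly-ordered target, dropping either the release store or the matching acquire load would let the other side observe an index update before the slot contents it is supposed to publish.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define SPSC_CAP 8u /* hypothetical capacity; must be a power of two */

typedef struct
{
   int           slots[SPSC_CAP];
   atomic_size_t head; /* advanced only by the consumer */
   atomic_size_t tail; /* advanced only by the producer */
} spsc_ring_t;

static void spsc_init(spsc_ring_t *r)
{
   assert(r != NULL);
   atomic_init(&r->head, 0);
   atomic_init(&r->tail, 0);
}

/* Producer side: returns false when the ring is full. */
static bool spsc_push(spsc_ring_t *r, int v)
{
   size_t tail = atomic_load_explicit(&r->tail, memory_order_relaxed);
   /* Acquire pairs with the consumer's release store to head, so the
    * slot about to be overwritten has really been read already. */
   size_t head = atomic_load_explicit(&r->head, memory_order_acquire);
   if (tail - head == SPSC_CAP)
      return false;
   r->slots[tail & (SPSC_CAP - 1u)] = v;
   /* Release: the slot write above is visible before the new tail is. */
   atomic_store_explicit(&r->tail, tail + 1, memory_order_release);
   return true;
}

/* Consumer side: returns false when the ring is empty. */
static bool spsc_pop(spsc_ring_t *r, int *out)
{
   size_t head = atomic_load_explicit(&r->head, memory_order_relaxed);
   /* Acquire pairs with the producer's release store to tail. */
   size_t tail = atomic_load_explicit(&r->tail, memory_order_acquire);
   if (head == tail)
      return false;
   *out = r->slots[head & (SPSC_CAP - 1u)];
   /* Release: the slot read above completes before the slot is freed. */
   atomic_store_explicit(&r->head, head + 1, memory_order_release);
   return true;
}
```

A stress test along the lines the commit describes would call spsc_push from one thread and spsc_pop from another and check that values arrive in order; on x86_64 such a test can pass even with the orderings weakened to relaxed, which is the gap the ThreadSanitizer and qemu lanes are there to close.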
