Skip to content

fix: free table68k + branch_condition_table on shutdown (closes #125)#126

Merged
JoeMatt merged 8 commits intodevelopfrom
fix/asan-leaks-table68k-branch-cond
May 1, 2026
Merged

fix: free table68k + branch_condition_table on shutdown (closes #125)#126
JoeMatt merged 8 commits intodevelopfrom
fix/asan-leaks-table68k-branch-cond

Conversation

@JoeMatt
Copy link
Copy Markdown
Collaborator

@JoeMatt JoeMatt commented May 1, 2026

Summary

Closes #125 — the two ASAN leaks the new sanitizers CI job surfaced after PR #121 landed.

What leaked

Allocation Size Site Init by Now freed by
table68k ~1.5 MB src/m68000/readcpu.c:928 read_table68k() (lazy, on first m68k_pulse_reset) new m68k_done()
branch_condition_table 256 B src/tom/gpu.c:257 build_branch_condition_table() (called from GPUInit) new GPUDone()

Both were one-shot init-time allocations with no matching free. In a one-and-done test run that's process-lifetime noise, but in a libretro core repeatedly dlopen/dlclose'd (Provenance iOS, RetroArch core hot-swap), each cycle leaked another 1.5 MB.

Fix

  • GPUDone() (new): frees + NULLs branch_condition_table. Idempotent. Called from TOMDone() alongside the existing OPDone() / BlitterDone().
  • m68k_done() (new): frees + NULLs table68k, resets emulation_initialized so a subsequent m68k_pulse_reset re-builds the opcode table cleanly. Hoisted emulation_initialized from function-scope to file-scope to make this possible. Called from JaguarDone() after the other subsystems.
  • Both new functions are safe to call without a matching init (free(NULL) is a no-op).

Test plan

  • make -j4 builds clean
  • bash scripts/c89-lint.sh passes
  • Production exports unchanged: 46 _retro_*, 0 leaks via nm -gU
  • CI sanitizers job goes from advisory-fail (the 2 leaks above) to clean — that's the demonstration this PR is correct
  • After merge, candidate to flip sanitizers from continue-on-error: true to false so future leaks gate

Branch base

  • Base is develop (per GitFlow)

The sanitizers CI job (added in PR #121) surfaced two ASAN leaks
totaling ~1.5 MB on every test run after the m68000 UBSAN noise
was filtered out:

  1. table68k (src/m68000/readcpu.c:928) -- 65536 * sizeof(struct
     instr) ~= 1.5 MB.  Allocated lazily on first m68k_pulse_reset,
     never freed.
  2. branch_condition_table (src/tom/gpu.c:257) -- 256 bytes (32 * 8
     conditional-flag LUT).  Allocated by GPUInit ->
     build_branch_condition_table, never freed.

Both are process-lifetime one-time inits.  In a libretro core that
gets dlopen'd / dlclose'd repeatedly (e.g., the iOS Provenance host),
these leaked on every load cycle.

Fix:
  - Add GPUDone() -- frees branch_condition_table, NULLs the pointer
    so a subsequent GPUInit re-allocates cleanly.  Wired into
    TOMDone alongside OPDone() / BlitterDone().
  - Add m68k_done() -- frees table68k, NULLs the pointer, resets the
    file-scope `emulation_initialized` flag (was function-static in
    m68k_pulse_reset; hoisted to file scope so the new done can
    reset it).  Wired into JaguarDone after the existing subsystem
    Done() calls.

Both done functions are idempotent and safe to call without a prior
init (free(NULL) is a no-op).

Closes #125.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Copilot AI review requested due to automatic review settings May 1, 2026 07:19
@JoeMatt JoeMatt added bug 🐛 performance 🚀 gpu TOM GPU (graphics RISC) m68k Motorola 68000 (UAE-derived main CPU) tests test harnesses, regression baselines labels May 1, 2026
@JoeMatt JoeMatt self-assigned this May 1, 2026
@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Regression: macos-arm64

Regression Test Results

ROM Status Details Diff
jagniccc ✅ PASS 0 pixels differ -
yarc ✅ PASS 0 pixels differ -
jagniccc (determinism) ✅ PASS identical across runs -
yarc (determinism) ✅ PASS identical across runs -
jagniccc (frameskip) ✅ PASS skip=0 matches skip=3 -
yarc (frameskip) ✅ PASS skip=0 matches skip=3 -
jagniccc (save state) ✅ PASS round-trip matches -
yarc (save state) ✅ PASS round-trip matches -
jagniccc (rewind) ✅ PASS rewind matches -
yarc (rewind) ✅ PASS rewind matches -

Platform: Darwin arm64

Updated by CI at 2026-05-01T23:09:00.353Z

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Regression: linux-arm64

Regression Test Results

ROM Status Details Diff
jagniccc ✅ PASS 0 pixels differ -
yarc ✅ PASS 0 pixels differ -
jagniccc (determinism) ✅ PASS identical across runs -
yarc (determinism) ✅ PASS identical across runs -
jagniccc (frameskip) ✅ PASS skip=0 matches skip=3 -
yarc (frameskip) ✅ PASS skip=0 matches skip=3 -
jagniccc (save state) ✅ PASS round-trip matches -
yarc (save state) ✅ PASS round-trip matches -
jagniccc (rewind) ✅ PASS rewind matches -
yarc (rewind) ✅ PASS rewind matches -

Platform: Linux aarch64

Updated by CI at 2026-05-01T23:09:10.606Z

@github-actions
Copy link
Copy Markdown

github-actions Bot commented May 1, 2026

Regression: linux-x64

Regression Test Results

ROM Status Details Diff
jagniccc ✅ PASS 0 pixels differ -
yarc ✅ PASS 0 pixels differ -
jagniccc (determinism) ✅ PASS identical across runs -
yarc (determinism) ✅ PASS identical across runs -
jagniccc (frameskip) ✅ PASS skip=0 matches skip=3 -
yarc (frameskip) ✅ PASS skip=0 matches skip=3 -
jagniccc (save state) ✅ PASS round-trip matches -
yarc (save state) ✅ PASS round-trip matches -
jagniccc (rewind) ✅ PASS rewind matches -
yarc (rewind) ✅ PASS rewind matches -

Platform: Linux x86_64

Updated by CI at 2026-05-01T23:09:06.418Z

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes two shutdown-time heap leaks (GPU branch-condition LUT and UAE 68K opcode table) so ASAN/LSan runs are clean when the core is repeatedly loaded/unloaded (e.g., dlopen/dlclose scenarios), closing #125.

Changes:

  • Add GPUDone() to free/NULL branch_condition_table, and call it from TOMDone().
  • Add m68k_done() to free/NULL table68k and reset initialization state, and call it from JaguarDone().
  • Expose the new shutdown hooks via gpu.h and m68kinterface.h.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/tom/tom.c Calls GPUDone() during TOM shutdown to release GPU LUT allocation.
src/tom/gpu.h Exposes GPUDone() prototype.
src/tom/gpu.c Implements GPUDone() to free/NULL branch_condition_table.
src/m68000/m68kinterface.h Exposes m68k_done() prototype.
src/m68000/m68kinterface.c Implements m68k_done() and makes init flag file-scope so it can be reset on shutdown.
src/core/jaguar.c Calls m68k_done() during Jaguar shutdown after other subsystems.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/m68000/m68kinterface.c Outdated
@JoeMatt JoeMatt requested a review from Copilot May 1, 2026 07:42
CI on PR #126 (the m68k_done / GPUDone leak fix) surfaced two
issues:

1. ASAN sanitizers job still reported the same 1.5MB + 256B leak
   the PR set out to fix.  Root cause: test/test_hle_bios.c calls
   p_retro_load_game twice (NTSC then PAL) but only calls
   p_retro_unload_game ONCE (between the two loads).  The PAL load
   never gets unloaded -- so JaguarDone() (which now does
   m68k_done() + GPUDone()) is never called for the second load.

   Added the missing p_retro_unload_game() before the final
   p_retro_deinit() at the end of main.

2. clang-tidy fired NEW pre-existing findings on src/core/jaguar.c
   and test/test_hle_bios.c because both are in this PR's diff (the
   workflow runs clang-tidy on changed files, not changed lines):
   - bugprone-incorrect-roundings on USEC_TO_RISC_CYCLES /
     USEC_TO_M68K_CYCLES in src/core/event.h.  Idiom is
     `(double + 0.5)` integer cast; lround() is C99 and we're
     C89/GNU89.
   - bugprone-multi-level-implicit-pointer-conversion on
     test_hle_bios.c's dlsym() patterns.

   Both added to the disabled list in .clang-tidy with
   documented rationale.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes two shutdown-time memory leaks found by the new ASAN CI job by adding explicit teardown hooks for one-shot lookup tables, ensuring repeated dlopen/dlclose cycles don’t accumulate leaked memory (closes #125).

Changes:

  • Add GPUDone() to free branch_condition_table and call it from TOMDone().
  • Add m68k_done() to free table68k, reset init state, and call it from JaguarDone().
  • Update the HLE BIOS test harness to call p_retro_unload_game() so the core’s shutdown path runs under ASAN.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/test_hle_bios.c Ensures unload path executes so teardown runs and leaks don’t show in ASAN tests.
src/tom/tom.c Invokes new GPU teardown during TOM shutdown.
src/tom/gpu.h Exposes GPUDone() prototype for callers.
src/tom/gpu.c Implements GPU teardown freeing the branch-condition LUT.
src/m68000/m68kinterface.h Exposes m68k_done() prototype.
src/m68000/m68kinterface.c Implements m68k_done() and makes init flag file-scope so it can be reset.
src/core/jaguar.c Calls m68k_done() during emulator shutdown.
.clang-tidy Disables two additional bugprone checks with rationale comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/m68000/m68kinterface.c Outdated
Comment thread test/test_hle_bios.c Outdated
Three findings from Copilot's review of the leak fix:

1. src/m68000/m68kinterface.c had a redundant `extern struct instr *
   table68k;` -- already declared in readcpu.h.  Removed.
2. m68k_done() reaching directly into table68k (owned by readcpu.c)
   was a brittle cross-module dependency.  Moved the actual free into
   a new free_table68k() in readcpu.c (declared in readcpu.h
   alongside read_table68k).  m68k_done now just calls free_table68k()
   + resets emulation_initialized.  Ownership stays end-to-end inside
   the module that allocates.
3. test/test_hle_bios.c inline comment was too long for one line.
   Converted to a multi-line block above the call.

Build sanity: make -j4 clean, c89-lint passes, dylib still gates 46
retro_* exports.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

Fixes two shutdown-time ASAN leaks (#125) by adding symmetric “Done” cleanup paths for one-shot initialization allocations, and updating the test harness and teardown sequences to exercise them properly.

Changes:

  • Add GPUDone() and call it from TOMDone() to free branch_condition_table.
  • Add m68k_done() and call it from JaguarDone() to free table68k and reset lazy-init state.
  • Update the HLE BIOS test harness to unload the game before deinit so shutdown cleanup runs.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
test/test_hle_bios.c Ensures retro_unload_game() is invoked so core shutdown paths run during tests.
src/tom/tom.c Hooks GPU teardown into TOM shutdown.
src/tom/gpu.h Exposes GPUDone() for teardown wiring.
src/tom/gpu.c Implements GPUDone() to free the branch-condition LUT.
src/m68000/readcpu.h Exposes free_table68k() so the allocator module also owns teardown.
src/m68000/readcpu.c Implements free_table68k() to release the opcode table.
src/m68000/m68kinterface.h Exposes m68k_done() for core teardown wiring.
src/m68000/m68kinterface.c Implements m68k_done() and makes lazy-init state resettable.
src/core/jaguar.c Calls m68k_done() during shutdown.
.clang-tidy Disables two additional clang-tidy checks and documents rationale.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/m68000/m68kinterface.c Outdated
Comment thread src/tom/gpu.c Outdated
JoeMatt and others added 3 commits May 1, 2026 18:38
Two small follow-ups:

1. src/m68000/m68kinterface.c: convert the four new // comments I
   added (m68k_done + emulation_initialized) to /* */ form per
   CLAUDE.md ("// allowed but prefer /* */ for new code").
2. src/tom/gpu.c: drop the redundant `if (branch_condition_table)`
   guard around the free + NULL pair.  free(NULL) is a no-op; the
   comment already documented the property.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
ASAN on PR #126 surfaced a global-buffer-overflow read 1 byte past
tomRam8 in tom_render_16bpp_cry_scanline (src/tom/tom.c:625).
Same pattern in all five render variants: the per-pixel loop
walks current_line_buffer (= tomRam8[0x1800], 10240 bytes max)
based on tomWidth, which can be set by display registers to a
value larger than the line buffer holds.

Fix: new helper tom_clamp_line_buffer_width() that, given the
current cursor, the requested width, the per-iteration source-byte
cost (2 for 16bpp, 4 for 24bpp), and the pwidth_scale, returns a
clamped width that guarantees the loop won't read past the end of
tomRam8.

Applied to all five render entry points:
  - tom_render_16bpp_cry_rgb_mix_scanline
  - tom_render_16bpp_cry_scanline      (the one ASAN caught)
  - tom_render_24bpp_scanline
  - tom_render_16bpp_direct_scanline
  - tom_render_16bpp_rgb_scanline

Closes #127.  With this in, the sanitizers CI job should go fully
clean -- candidate to flip continue-on-error: true to false in a
follow-up.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
UBSAN on PR #126 found:

  src/tom/gpu.c:1674: shift exponent 32 too large for uint32_t

The classic rotate-right idiom

  (RN >> r1) | (RN << (32 - r1))

invokes UB when r1 == 0 (32-bit shift count of 32 is out of range).
Replace with the standard portable pattern:

  (RN >> r1) | (RN << ((-r1) & 31))

which yields RN | RN == RN for r1 == 0 and is identical for
r1 in [1, 31].  Modern GCC/clang pattern-match this idiom into a
single rotate instruction (rorl on x86, ror on ARM).

Six sites across two files:
  - src/tom/gpu.c       gpu_opcode_ror, gpu_opcode_rorq
  - src/jerry/dsp.c     dsp_opcode_ror, dsp_opcode_rorq, plus the
                        prefetched (PRN) variants

Co-Authored-By: Claude Opus 4.7 <[email protected]>
JoeMatt and others added 2 commits May 1, 2026 19:00
Round-2 sanitizer fix for PR #126:

UBSAN at src/tom/gpu.c:1684 -- "shift exponent 32 too large for
uint32_t".  In gpu_opcode_rorq / dsp_opcode_rorq / DSP_rorq the
rotation count comes from gpu_convert_zero[] / dsp_convert_zero[],
which maps a 0 IMM_1 to 32 (rotate-by-0 means rotate-by-full-word,
a no-op).  But `RN >> 32` is UB regardless of what the post-shift
result is, so the previous-commit fix (only masking the LHS) wasn't
enough.

Mask r1 to 0x1F before either shift -- maps 32 -> 0, preserving the
no-op semantic.  Three sites:
  - src/tom/gpu.c     gpu_opcode_rorq
  - src/jerry/dsp.c   dsp_opcode_rorq, DSP_rorq

Also add `-clang-analyzer-optin.performance.Padding` to the disabled
list in .clang-tidy -- it fires on inherited UAE/Virtual Jaguar
struct layouts where reordering for tighter packing risks silent
layout breaks in save-state / dlsym paths.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
The previous commit added the rationale comment but the disable
line itself didn't get written.  Add it.

Co-Authored-By: Claude Opus 4.7 <[email protected]>
@JoeMatt JoeMatt merged commit b73a11c into develop May 1, 2026
30 checks passed
@JoeMatt JoeMatt deleted the fix/asan-leaks-table68k-branch-cond branch May 3, 2026 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug 🐛 gpu TOM GPU (graphics RISC) m68k Motorola 68000 (UAE-derived main CPU) performance 🚀 tests test harnesses, regression baselines

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants