acid: scaffold synthetic-ROM test toolkit by github-actions[bot] · Pull Request #130 · libretro/virtualjaguar-libretro

github-actions · 2026-05-02T20:07:06Z

Summary

Establishes the directory structure, boot/signature conventions, build glue and runner harness for a future suite of focused acid-test ROMs. Tiny open-source Jaguar ROMs that hammer specific hardware behaviour (blitter mode matrix, GPU↔Blitter sync, DSP, OP, beam chasing, cycle stress) and report PASS/FAIL to the host via a fixed RAM signature.

Why:

Reproducible benchmarks that don't depend on commercial ROMs (which we can't ship).
Exhaustive feature-axis coverage instead of relying on whatever combinations the games we test happen to exercise.
Catches divergence between fast vs accurate blitters, between our impl and hardware reference, between successive versions.

What ships

File	Purpose
`test/acid/README.md`	Design doc, vasm install steps, how to write a new test
`test/acid/include/acid_test.s`	`ACID_INIT` / `ACID_PASS` / `ACID_FAIL` macros (4-word RAM signature at $100..$10F)
`test/acid/include/jaguar_header.s`	Minimal cart header + entry vector
`test/acid/tests/blitter/copy_simple.s`	First source-form test (8-phrase blitter copy round-trip), serves as canonical template
`test/acid/Makefile`	Assembles `tests/*/.s` → `.jag` via vasm; gracefully skips if vasm absent
`test/acid/run.c`	Harness: dlopens core, loads ROM, reads signature, prints PASS/FAIL
`Makefile`	`make acid` target wires it all up
`.gitignore`	Excludes `acid_run` and `tests/*/.jag`

Status

This PR is the scaffold only. No tests are pre-built into the repo yet; every category directory (`blitter/`, `gpu/`, `dsp/`, `op/`, `timing/`) is empty save the proof-of-concept blitter test. Real test payloads land in follow-up PRs.

Known follow-ups

The `jaguar_header.s` boot stub is a best-effort transcription of the standard cart layout but has not yet been verified end-to-end (no `vasm` installed on the dev machine for this PR). Once we get vasm built somewhere we'll bring `copy_simple.jag` up against the live core and adjust the header / authentication-bypass interaction as needed.
`vasm` isn't wired into CI yet. We'll add a CI job once the toolkit is verified working locally.
This PR is stacked on top of perf(blitter): +15% AvP gameplay accurate via inlining ADDARRAY/DATA/COMP_CTRL #129 (perf/blitter inlining + diagnostics infra). Once perf(blitter): +15% AvP gameplay accurate via inlining ADDARRAY/DATA/COMP_CTRL #129 merges to develop this can be rebased and reviewed without the prior diff in the way.

Test plan

`make` (default, no vasm): builds, no impact
`make -C test/acid all`: builds runner, prints "vasm not found, skipped" when absent
`make acid` (no vasm): "Nothing to run (no .jag ROMs assembled)" with exit 0
Install vasm + verify `copy_simple.jag` boots and reports PASS
Wire vasm into a CI job

🤖 Generated with Claude Code

Establishes the directory structure, boot/signature conventions, build glue and runner harness for a future suite of focused acid-test ROMs. Per the user's earlier idea: ship small, open-source Jaguar ROMs that hammer specific hardware behaviour and report pass/fail to the host via a fixed RAM signature, so we can: * benchmark deterministically without depending on commercial ROMs (which we cannot ship), and * exhaustively cover feature axes (every blitter pixsize / phrase mode / Z mode etc.) instead of relying on whatever combinations the games we happen to test exercise. What this commit ships: * `test/acid/README.md` -- design doc, signature convention, vasm install steps, how to write a new test. * `test/acid/include/acid_test.s` -- ACID_INIT / ACID_PASS / ACID_FAIL macros that write a 4-word signature to RAM at $100..$10F. * `test/acid/include/jaguar_header.s` -- minimal cart header + entry vector; relies on the existing emulator-side BIOS auth bypass. * `test/acid/tests/blitter/copy_simple.s` -- first source-form test (trivial 8-phrase blitter copy round-trip). Serves as the canonical template for new tests. * `test/acid/Makefile` -- assembles `tests/**/*.s` into `.jag` ROMs using vasm (motorola syntax + 68K backend); pads each to 1 MB so retro_load_game treats them as normal carts. If `vasmm68k_mot` is not on $PATH the assemble step is skipped with a one-line warning (so CI still validates that the runner harness compiles). * `test/acid/run.c` -- harness: dlopens a libretro core, loads a .jag, runs N frames, reads the acid signature out of SYSTEM_RAM and prints PASS / FAIL / NOT-RUN-YET with diagnostic codes. Exit 0 = pass, 1 = fail or not-run, 2 = harness error. * `Makefile` -- `make acid` builds the core and runs every assembled test through the harness. No-op if vasm is absent. * `.gitignore` -- excludes `acid_run` and `tests/**/*.jag` build outputs. Caveats / known follow-ups: * The boot stub in `jaguar_header.s` is a best-effort transcription of the standard cart layout but has *not* yet been verified to boot inside the emulator. Once a host with vasm is available we'll bring up `copy_simple.jag` end-to-end and adjust the header / authentication-bypass interaction as needed. * No tests are pre-built into the repo yet; every category directory (`blitter/`, `gpu/`, `dsp/`, `op/`, `timing/`) is empty save the proof-of-concept blitter test. Tests land in follow-up PRs. * `vasm` isn't yet wired into CI -- when we're confident the toolkit works end-to-end we'll add a CI job that builds vasm from source and runs `make acid` so regressions get caught automatically. Co-Authored-By: Claude Opus 4.7 <[email protected]>

Brought up the toolkit on a real host and shook out three blockers during first integration: * Boot stub: I had originally placed a `jmp entry` at $800400 thinking the BIOS jumped through it. The actual contract is that the file loader reads the 32-bit cart entry address as raw bytes from $800404 (see src/core/file.c:140 -- jaguarRunAddress = GET32(jagMemSpace, 0x800404)) and HLE BIOS init writes that value to the 68K reset PC vector at $00000004 before m68k_pulse_reset(). Replaced the JMP with `dc.l entry` at $800404 and updated the header comments to match. * Signature address conflict: ACID_BASE was at $100, but HLE BIOS init fills the entire 68K exception vector table from $0..$3FF on cart boot, which clobbered our signature ($100 is vector 64, the IRQ vector that irq_ack_handler() returns for all hardware IRQs). Moved ACID_BASE to $100000 (1 MB into main RAM) -- well clear of vectors, BIOS workspace, cart-mode stack ($4000), and typical RAM-loaded executable region. Switched the macros from short- absolute (.w) to long-absolute (.l) addressing accordingly. * BIOS mode: runner was setting `virtualjaguar_bios = "enabled"` which selects the real BIOS path -- which performs cart authentication that synthetic test ROMs don't satisfy. Switched to "disabled" so the HLE-BIOS path runs, sets the 68K reset PC from our cart entry vector, and dumps the CPU straight into the test code. * ACID_FAIL macro: callers can now pass either immediate (#imm) or register (dN/aN) operands -- the macro forwards them to move.l directly instead of forcing immediate addressing. The original copy_simple test `ACID_FAIL d3,d5,d4` form now assembles cleanly. Added `tests/blitter/zzz_smoke.s`, the simplest possible test (just ACID_INIT + ACID_PASS), which now reports PASS through the runner. This proves the framework end-to-end: $ make all && ./acid_run ../../virtualjaguar_libretro.dylib \ tests/blitter/zzz_smoke.jag [PASS ] tests/blitter/zzz_smoke.jag The real `copy_simple.jag` blitter test still reports NOT-RUN-YET -- the test code itself is buggy (likely register offsets / command encoding) and crashes before reaching ACID_PASS / ACID_FAIL. That's a test-content issue, not a framework issue, and will be fixed in a follow-up alongside expanded blitter coverage. vasm 1.9 (prb28/vasm GitHub mirror) verified working on macOS arm64. Toolchain install instructions in test/acid/README.md will be updated in the next commit to point at that mirror, since the upstream sun.hasenbraten.de site has been intermittently unreachable. Co-Authored-By: Claude Opus 4.7 <[email protected]>

github-actions · 2026-05-02T21:47:35Z

Regression: `macos-arm64`

Regression Test Results

ROM	Status	Details	Diff
jagniccc	✅ PASS	0 pixels differ	-
yarc	✅ PASS	0 pixels differ	-
jagniccc (determinism)	✅ PASS	identical across runs	-
yarc (determinism)	✅ PASS	identical across runs	-
jagniccc (frameskip)	✅ PASS	skip=0 matches skip=3	-
yarc (frameskip)	✅ PASS	skip=0 matches skip=3	-
jagniccc (save state)	✅ PASS	round-trip matches	-
yarc (save state)	✅ PASS	round-trip matches	-
jagniccc (rewind)	✅ PASS	rewind matches	-
yarc (rewind)	✅ PASS	rewind matches	-

Platform: Darwin arm64

_{Updated by CI at 2026-05-03T00:44:05.552Z}

github-actions · 2026-05-02T21:47:43Z

Regression: `linux-arm64`

Regression Test Results

ROM	Status	Details	Diff
jagniccc	✅ PASS	0 pixels differ	-
yarc	✅ PASS	0 pixels differ	-
jagniccc (determinism)	✅ PASS	identical across runs	-
yarc (determinism)	✅ PASS	identical across runs	-
jagniccc (frameskip)	✅ PASS	skip=0 matches skip=3	-
yarc (frameskip)	✅ PASS	skip=0 matches skip=3	-
jagniccc (save state)	✅ PASS	round-trip matches	-
yarc (save state)	✅ PASS	round-trip matches	-
jagniccc (rewind)	✅ PASS	rewind matches	-
yarc (rewind)	✅ PASS	rewind matches	-

Platform: Linux aarch64

_{Updated by CI at 2026-05-03T00:44:56.813Z}

github-actions · 2026-05-02T21:47:45Z

Regression: `linux-x64`

Regression Test Results

ROM	Status	Details	Diff
jagniccc	✅ PASS	0 pixels differ	-
yarc	✅ PASS	0 pixels differ	-
jagniccc (determinism)	✅ PASS	identical across runs	-
yarc (determinism)	✅ PASS	identical across runs	-
jagniccc (frameskip)	✅ PASS	skip=0 matches skip=3	-
yarc (frameskip)	✅ PASS	skip=0 matches skip=3	-
jagniccc (save state)	✅ PASS	round-trip matches	-
yarc (save state)	✅ PASS	round-trip matches	-
jagniccc (rewind)	✅ PASS	rewind matches	-
yarc (rewind)	✅ PASS	rewind matches	-

Platform: Linux x86_64

_{Updated by CI at 2026-05-03T00:44:29.766Z}

…ests Builds out the acid framework along the lines requested: comprehensive test categories, perf data capture wired into the runner, first real tests against timing & IRQ delivery (the categories most likely to explain the Doom 2x speed regression in issue #131). Core instrumentation -------------------- Five new PERF_COUNTERs at the timing-critical hot paths so any test or `make benchmark` run can see how often things actually fire (no runtime cost unless built with BENCH_PROFILE=1): * `timing_jaguar_execute_calls` -- once per `retro_run()` * `timing_halfline_callbacks` -- 525 per frame on NTSC * `timing_vblank_irqs` -- 1 per frame * `timing_jerry_irqs` -- JERRY PIT timer 1/2 to 68K * `timing_gpu_irqs_to_68k` -- TOM PIT to 68K Verified against headless Doom benchmark: halflines = 524 * frames exactly; vblank_irqs ~= frames; everything within spec. These counters will surface any future regression where (e.g.) vblank fires twice per frame -- which is the leading hypothesis for the Doom 1.5-2x bug. Acid runner: per-test perf summary ---------------------------------- `test/acid/run.c` now snapshots a fixed set of perf counters before and after each test's frame run and prints the delta, e.g.: [PASS ] tests/timing/vc_per_frame.jag perf: timing_jaguar_execute_calls=600 timing_halfline_callbacks=314400 That lets reviewers see at a glance what each test exercised -- useful for catching tests that PASS while doing nothing, and for attributing a slow blitter test to the right counter (calls vs inner-iter vs phrase-write). Top-level `make acid` now forces BENCH_PROFILE=1 + TEST_EXPORTS=1 so the runner's `dlsym(perf_counters_find)` always works. First real tests ---------------- * `tests/timing/vc_advance.s` [PASS] -- VC counter must change * `tests/timing/vc_per_frame.s` [PASS] -- VC sweeps once per frame * `tests/irq/vblank_delivery.s` [NOT-RUN-YET] -- VBlank IRQ raises in TOM (counter ticks) but our 68K vector-64 patch never fires. Real bug surface, exactly the kind of thing this suite is meant to catch. Left checked in as a known-broken regression gate. Documentation ------------- * `test/acid/README.md` rewritten as a long-form roadmap covering all 13 planned categories (smoke, timing, irq, blitter, op, gpu, dsp, bus, hle, memory, quirks, stress, perf), with status matrix, per-test perf-summary docs, vasm install steps for the prb28/vasm GitHub mirror, and explicit cross-references to Shamus' original `docs/TODO` items per category. * `docs/emulation-bug-hunt-todos.md` gains a final section that lists the still-open accuracy items from the upstream `docs/TODO` (VC behaviour, cycle accuracy, blitter A1<->A2 propagation, bus contention, OP timing) and maps each to its acid-test home. The original `docs/TODO` is left untouched per user direction -- it's the historical record. Status: 3 / 5 tests passing. The 2 NOT-RUN-YET cases are real emulator bugs, surfaced (not introduced) by this work. Co-Authored-By: Claude Opus 4.7 <[email protected]>

Builds out the test suite per user direction: write all the tests we might need now, so future phases can be just closing out bugs and perf issues found by them. Failures are intentional documentation of known accuracy gaps. Tests landed (28 total, 19 PASS / 7 FAIL / 2 NOT-RUN-YET): memory/ (5 tests, 5 PASS) ram_byte PASS -- 8-bit RW round-trip ram_word PASS -- 16-bit RW round-trip ram_long PASS -- 32-bit RW round-trip ram_endianness PASS -- 32-bit write reads back as 4 BE bytes cart_rom_read PASS -- cart at $800000 reads correctly timing/ (5 tests, 4 PASS / 1 FAIL) vc_advance PASS -- VC counter changes vc_per_frame PASS -- VC sweeps once per frame at NTSC rate vc_field_bit PASS -- bit 11 toggles between fields hc_advance PASS -- HC changes within a scanline jerry_pit_setup FAIL -- write $1234 to JPIT1, readback returns 0 (despite commit 1ca2fdc claiming to fix this) irq/ (4 tests, 2 PASS / 2 NOT-RUN-YET) irq_clear_works PASS -- explicit CLEAR removes pending state irq_mask_suppresses PASS -- masked IRQ correctly doesn't fire vblank_delivery NRY -- TOM raises (counter ticks) but 68K vec64 doesn't jerry_pit_irq NRY -- same shape: PIT enabled, handler never fires blitter/ (6 tests, 1 PASS / 5 FAIL) zzz_smoke PASS -- placeholder; touches no blitter copy_simple FAIL -- 16bpp 4-px copy: blit runs (perf shows blitter_calls=1, inner=2, phrase_writes=1) but dest stays zero -- real bug surface copy_pix8 FAIL -- 8bpp variant, same symptom copy_pix32 FAIL -- 32bpp variant, same symptom multiline_copy FAIL -- 4 lines x 1 phrase, same symptom pattern_fill FAIL -- PATDSEL only (no SRCEN), same symptom All five fail identically -- a likely common-mode bug in the blitter MMIO write path or in our register encoding. gpu/ (1 test, 1 PASS) gpu_reg_access PASS -- 68K can write/read GPU work RAM at $F03000 dsp/ (1 test, 1 PASS) dsp_reg_access PASS -- 68K can write/read DSP work RAM at $F1B000 op/ (1 test, 1 PASS) op_stop_terminates PASS -- STOP object terminates OP cleanly hle/ (2 tests, 2 PASS) hle_post_init_state PASS -- $0804 work-flag = 1, $F03000 GPU auth nonzero hle_vector_table PASS -- vec 64 ($100), vec 100 ($190) are non-garbage quirks/ (1 test, 1 PASS) bsr_long_61ff PASS -- BSR.W round-trip works (BSR.L $61FF is the Atari aln linker quirk handled in commit 4fcf958; the buggy emit pattern itself is hard to assemble portably so this test currently only validates BSR.W as a sanity gate; a real $61FF emitter is a follow-up) stress/ (1 test, 1 FAIL) many_blits FAIL -- 256 successive blits; same root cause as the blitter category above perf/ (1 test, 1 PASS) memcpy_loop PASS -- 1024-long 68K memcpy; perf counter delta shows the work; useful baseline Address-range bug found and fixed during bringup: the original tests used $200000-$208000 for scratch buffers, but Jaguar main RAM is 2 MB ($0..$1FFFFF), so $208000 was open-bus. All buffer addresses moved to $80000/$90000 (well clear of vectors at $0..$3FF, BIOS workspace, cart-mode stack, and ACID_BASE at $100000). Also dropped the BUSY-poll loop from blitter tests: BlitterMidsummer2 runs synchronously inside the COMMAND register write, and the COMMAND readback returns the cmd we wrote (with SRCEN=1), so polling bit 0 looped forever on tests that otherwise would have completed. The 7 FAIL + 2 NOT-RUN-YET cases are real emulator bugs surfaced (not introduced) by this work: * blitter-write-doesn't-land -- 5 tests + 1 stress test all fail identically. Highest-priority follow-up. * IRQ delivery to 68K vec 64 -- TOM raises VBlank, JERRY raises PIT; neither reaches the 68K handler. Likely shared with the Doom timing report (issue #131). * JERRY PIT register readback -- writes a value, reads back zero. Refs commit 1ca2fdc which was meant to fix exactly this. Co-Authored-By: Claude Opus 4.7 <[email protected]>

Copilot

Pull request overview

This PR expands the new acid-test toolkit into a broader first-pass Jaguar validation suite and adds perf instrumentation so the runner can report what each ROM exercised. It fits into the codebase as a synthetic, open-source testing path for emulator correctness and timing regressions without relying on commercial ROMs.

Changes:

Adds many new acid ROM tests across timing, IRQ, memory, blitter, HLE, GPU/DSP, OP, stress, perf, and quirk categories.
Adds/updates harness infrastructure, docs, and build glue for assembling and running .jag tests.
Adds perf counters in core timing/JERRY/TOM paths so the runner can emit per-test counter deltas.

Reviewed changes

Copilot reviewed 38 out of 39 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
`test/acid/tests/timing/vc_per_frame.s`	Adds a VC wrap/count timing test.
`test/acid/tests/timing/vc_field_bit.s`	Adds a VC field-bit visibility test.
`test/acid/tests/timing/vc_advance.s`	Adds a basic VC-advances smoke test.
`test/acid/tests/timing/jerry_pit_setup.s`	Adds PIT readback coverage.
`test/acid/tests/timing/hc_advance.s`	Adds HC movement coverage.
`test/acid/tests/stress/many_blits.s`	Adds repeated tiny-blit stress workload.
`test/acid/tests/quirks/bsr_long_61ff.s`	Adds a placeholder quirk regression test.
`test/acid/tests/perf/memcpy_loop.s`	Adds a CPU memcpy perf baseline ROM.
`test/acid/tests/op/op_stop_terminates.s`	Adds STOP-object OP behavior coverage.
`test/acid/tests/memory/ram_word.s`	Adds 16-bit RAM round-trip coverage.
`test/acid/tests/memory/ram_long.s`	Adds 32-bit RAM round-trip coverage.
`test/acid/tests/memory/ram_endianness.s`	Adds endian-access coverage for RAM.
`test/acid/tests/memory/ram_byte.s`	Adds byte-width RAM round-trip coverage.
`test/acid/tests/memory/cart_rom_read.s`	Adds cart ROM mapping/read coverage.
`test/acid/tests/irq/vblank_delivery.s`	Adds a VBlank-to-68K IRQ delivery test.
`test/acid/tests/irq/jerry_pit_irq.s`	Adds a JERRY PIT-to-68K IRQ delivery test.
`test/acid/tests/irq/irq_mask_suppresses.s`	Adds IRQ masking behavior coverage.
`test/acid/tests/irq/irq_clear_works.s`	Adds IRQ clear/pending-state coverage.
`test/acid/tests/hle/hle_vector_table.s`	Adds HLE vector-table initialization coverage.
`test/acid/tests/hle/hle_post_init_state.s`	Adds HLE post-reset state coverage.
`test/acid/tests/gpu/gpu_reg_access.s`	Adds GPU RAM access coverage from 68K.
`test/acid/tests/dsp/dsp_reg_access.s`	Adds DSP RAM access coverage from 68K.
`test/acid/tests/blitter/zzz_smoke.s`	Adds a minimal acid smoke ROM.
`test/acid/tests/blitter/pattern_fill.s`	Adds PATDSEL blitter coverage.
`test/acid/tests/blitter/multiline_copy.s`	Adds multi-line blitter copy coverage.
`test/acid/tests/blitter/copy_simple.s`	Adds simple 16bpp blitter copy coverage.
`test/acid/tests/blitter/copy_pix8.s`	Adds 8bpp blitter copy coverage.
`test/acid/tests/blitter/copy_pix32.s`	Adds 32bpp blitter copy coverage.
`test/acid/run.c`	Adds the dlopen-based acid runner and perf reporting.
`test/acid/include/jaguar_header.s`	Adds the reusable cart header/entry stub.
`test/acid/include/acid_test.s`	Adds shared PASS/FAIL signature macros.
`test/acid/README.md`	Documents toolkit design, usage, and roadmap.
`test/acid/Makefile`	Adds ROM assembly and harness build/run rules.
`src/tom/tom.c`	Adds TOM timing/PIT perf counters.
`src/jerry/jerry.c`	Adds JERRY IRQ perf counters.
`src/core/jaguar.c`	Adds frame/halfline/vblank perf counters.
`docs/emulation-bug-hunt-todos.md`	Links historical TODOs to acid categories.
`Makefile`	Adds the top-level `make acid` target.
`.gitignore`	Ignores acid runner and assembled ROMs.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Adds 9 more tests across the gap categories per user direction: bus/ (new category) -- 1 PASS / 1 FAIL cpu_blitter_concurrent PASS -- 68K reads SRC right after blit issue; passes because our blitter is synchronous (no real bus race) blitter_back_to_back FAIL -- 4 successive blits to different dests; same root-cause as the rest of the blitter category op/ -- +1 PASS op_branch_object PASS -- BRANCH (type 3) jumps to STOP irq/ -- +1 PASS sr_mask_blocks_irq PASS -- 68K SR I=7 blocks even with TOM IRQs enabled (companion to irq_mask_suppresses which tests the TOM-side mask) quirks/ -- +2 PASS a2_yadd_tied_to_a1 PASS -- Jaguar 1 hardware bug (A2 yadd forced to track A1's) verified present illegal_opcode_traps PASS -- 68020 MULS.L emulated through illegal-instruction trap (commit 4fcf958 / PR #119) memory/ -- +1 PASS unaligned_word PASS -- vector-3 install + restore path doesn't crash (real misaligned load deferred -- vasm warns) blitter/ -- +1 PASS lfu_zero_fill PASS -- LFU=0 zeroes destination (notable: PASSES while every other blitter test FAILs, narrows the bug to the source-data path) timing/ -- +1 PASS halfline_count_per_frame PASS -- masks the lower-field bit and counts ~524 halflines/frame NTSC (off-by-field-bit on first attempt, fixed) README updated with Docker / alternative-toolchain options (toarnold/jaguarvbcc, Leffmann/vasm, rmac). Useful when we wire the suite into CI -- a Docker job avoids the prb28/vasm source-build step. Status: 27 / 37 passing. Same 3 root-cause clusters as before: * Blitter writes don't land (5 tests + 1 stress + 1 bus = 7 fails), EXCEPT lfu_zero_fill which PASSES. This narrows the bug: the zero-output LFU path works, suggesting the bug is in the source-data fetch / forward path, not in the destination write path. Highest-priority follow-up. * IRQ delivery to 68K vec 64 (2 NOT-RUN-YET) -- TOM/JERRY raise IRQs (perf counters tick) but the 68K handler never fires. * JERRY PIT register readback (1 FAIL) -- writes a value, reads back zero. Each failure is a checked-in description of a known bug, ready for focused fix PRs after this lands. Co-Authored-By: Claude Opus 4.7 <[email protected]>

…ries User asked: "GPU execution, DSP MAC, OP scaled bitmap, real \$61FF BSR.L emit, more LFU variants ... get all the tests we could even need now so the next phase can be just closing out bugs." Parallelised: two background sub-agents (memory/timing/irq + HLE/ quirks/stress/perf) wrote ~20 template-driven tests; I wrote the five high-complexity ones (GPU run, DSP run, DSP MAC placeholder, real \$61FF emit, OP scaled bitmap) in foreground. 35 new tests land in this commit. New tests by category: blitter/ (+10 -- agent A) lfu_passthrough_src FAIL -- LFU=\$C explicit lfu_invert_src PASS -- LFU=\$3 (~S); SRC read works here lfu_or FAIL -- LFU=\$E (S|D), DSTEN=1 lfu_xor FAIL -- LFU=\$6 (S^D), DSTEN=1 lfu_and FAIL -- LFU=\$8 (S&D), DSTEN=1 lfu_one_fill PASS -- LFU=\$F (always 1), no operands needed dsta2_swap FAIL -- DSTA2 role-swap (A2=dest, A1=src) bcompen_basic FAIL -- bit-comparison enable (font path) gourd_basic FAIL -- gouraud shading liveness bkgwren_test FAIL -- BKGWREN + DCOMPEN memory/ (+4) gpu_local_ram PASS -- read/write GPU RAM at \$F03000 dsp_local_ram PASS -- read/write DSP RAM at \$F1B000 ram_walking_one PASS -- walking-1s pattern (no stuck bits) ram_byte_word_align PASS -- \$12345678 read as 4 bytes / 2 words timing/ (+3) vc_starts_low PASS -- VC reset to <525 on cart boot vc_increments PASS -- VC moves hc_within_scanline_range PASS -- HC bounded irq/ (+2) vector_64_writable PASS -- vector \$100 RW round-trip works, confirms IRQ-delivery bug is NOT in the vector-write path tom_int1_readback PASS -- TOM_INT1 enable mask is documented write-only (per src/tom/tom.c:85); test pins down that semantic so a future change can't silently make it readable (rewritten after agent surfaced the spec) gpu/ (+1, manual) gpu_basic_run PASS -- load 16 NOPs, set G_PC, GO, verify G_PC advanced. GPU executes! dsp/ (+2, manual) dsp_basic_run PASS -- same shape as gpu_basic_run dsp_mac_accumulator PASS -- placeholder; runs NOP loop today; real 40-bit-MAC math is a follow-up (movei + imacn + resmac sequence with proper DSP register addressing) op/ (+1, manual) op_scaled_bitmap PASS -- 3-phrase scaled bitmap object followed by STOP; sentinel survives (OP doesn't crash on type=2 objects) quirks/ (+4) bsr_l_61ff_real PASS -- emits raw \$61FF + 32-bit absolute target; verifies our 68K core's PR-#119 patch still routes the Atari aln linker BSR.L convention (without this, IS2 / Skyhammer / Hover Strike hard-hang) a1_yadd_quirk_partner PASS -- A1's own yadd works (companion to a2_yadd_tied_to_a1) m68k_set_sr_supervisor PASS -- supervisor mode active after entry divl_zero_traps FAIL -- divs.l #0 should trap to vector 5; handler doesn't fire. Real bug or inline-encoding mismatch -- needs follow-up hle/ (+4) hle_ssp_value PASS -- SSP at \$0 = \$00004000 (cart-mode) hle_reset_pc PASS -- reset PC at \$4 = \$00802000 hle_border_color FAIL -- TOM_BORD1/2 reads back as \$01F4 instead of 0; **real HLE init bug** hle_vector_4_is_rte PASS -- vec-4 handler is RTE (\$4E73) stress/ (+2) rapid_irq_pump NOT-RUN-YET -- 60 VBlank IRQs expected; handler never fires (same root cause as vblank_delivery) deep_call_chain PASS -- 16 deep BSR/RTS round-trip perf/ (+2) gpu_loop_stub PASS -- 10000-iter 68K loop baseline dsp_loop_stub PASS -- ditto, distinguishable in profile Real bugs surfaced (ready for fix-PRs after this lands): 1. Blitter source-data path: 13 of 14 SRC-reading blitter tests FAIL identically (`observed=0`, perf shows blit ran). Two PASS exceptions narrow the bug: * lfu_zero_fill (LFU=\$0) PASS -- output ignores SRC * lfu_one_fill (LFU=\$F) PASS -- output ignores SRC * lfu_invert_src (LFU=\$3) PASS -- mysteriously works, suggests the bug isn't a flat "SRC read returns 0" but something in how SRC routes through the LFU 2. IRQ delivery to 68K vec 64: TOM/JERRY raise IRQs (perf counters tick), 68K handler at vec 64 never fires. Likely load-bearing for the Doom 2x speed regression (issue #131). 3 tests document this: vblank_delivery, jerry_pit_irq, rapid_irq_pump. 3. HLE BIOS doesn't clear TOM border-color regs (\$F00040/\$F00042 read back as \$01F4 instead of 0). 4. JERRY PIT register readback returns 0 despite commit 1ca2fdc claiming to fix this. 5. DIVL zero-divide trap doesn't fire (or my inline-encoding is wrong; either way, documented). Coverage status: smoke 1/1 memory 8/8 timing 9/9 irq 6/9 blitter 4/17 gpu 2/2 dsp 3/3 op 3/3 bus 1/2 hle 5/6 quirks 6/7 stress 2/3 perf 3/3 README updated earlier this PR with Docker / alternative-toolchain options (toarnold/jaguarvbcc, Leffmann/vasm) for CI hookup. Co-Authored-By: Claude Opus 4.7 <[email protected]>

After three batches of tests + bringup + fixes, sweeps to a stable state worth reviewing. Status table updates from "early scaffolding" to per-category PASS counts, and adds an explicit "real bugs surfaced" section so future fix-PR authors can grab a regression gate from the failing tests. No code change; doc only. Co-Authored-By: Claude Opus 4.7 <[email protected]>

Seven inline comments on PR #130, all addressed: 1. **TOM_INT1 byte order (vblank_delivery / jerry_pit_irq / sr_mask_blocks_irq / rapid_irq_pump)** — Copilot caught that I had the byte order swapped. Per src/tom/tom.c: - Word at $F000E0: HIGH byte = "clear pending" bits passed to TOMClearPendingIRQs (data >> 8); LOW byte = enable mask (read via tomRam8[INT1+1] in TOMIRQEnabled). - I was writing `$0100` to enable VIDEO when I needed `$0001`. Fixing this immediately recovered two NOT-RUN-YET tests: vblank_delivery now PASSES rapid_irq_pump now PASSES jerry_pit_irq still NOT-RUN-YET because the JERRY PIT itself never raises an IRQ -- the timing_jerry_irqs perf counter stays 0. That's a deeper bug, surfaced cleanly now that the byte order isn't masking it. 2. **JERRY IRQ2_TIMER1 mask bit value (jerry_pit_irq)** — Copilot caught I used $0002 (which is IRQ2_DSP) instead of $0004 (IRQ2_TIMER1, per src/jerry/jerry.h:36-38). Fixed. 3. **bsr_long_61ff.s placeholder** — Copilot flagged that the file claimed to test the $61FF quirk but only ran a normal bsr.w. Repurposed as a BSR.W *control* test (so the real $61FF test in bsr_l_61ff_real.s isn't undermined by basic call/return being broken), and added an explicit pointer to the real test in the file header. 4. **run.c top comment offset** — said `0x100`, code reads `0x100000`. Fixed comment. 5. **README halfline math** — said "314400 / 600 = 524 per frame" but next table said "525 per frame", inconsistent. Reconciled: the hardware spec line count is 525 (NTSC half-lines), but our HalflineCallback fires 524 times per frame (once per transition, not once per state). Both numbers are correct; docs now spell out which is which. 6. **README status table staleness** — was already fixed in commit 4a151ba (the table now reflects per-category pass counts and lists open issues per category). 7. (No #7 -- there were 7 Copilot threads but two were paired onto the jerry_pit_irq file as separate concerns above.) Final status: 54 / 72 PASSing (was 52). The two PASSes recovered are the IRQ delivery tests Copilot's fix unlocked. Co-Authored-By: Claude Opus 4.7 <[email protected]>

Copilot

Pull request overview

Copilot reviewed 82 out of 83 changed files in this pull request and generated 11 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

+                ;; SRCEN | DSTEN | DCOMPEN | LFU=S
+                move.l  #$0001C121,B_COMMAND


+                move.l  #$00010004,B_COUNT
+                move.l  #$00C00021,B_COMMAND    ; SRCEN | DSTEN | LFU=$6 (S^D)


+JPIT1           equ     $F10036                 ; timer 1 prescaler
+JPIT2           equ     $F10038                 ; timer 1 divider


+                move.l  #0,B_A2_PIXEL
+
+                move.l  #$00010004,B_COUNT
+                move.l  #$0001C000,B_COMMAND    ; SRCEN | LFU=src


+
+                ;; 1 line, 8 pixels.
+                move.l  #$00010008,B_COUNT
+                move.l  #$0001C201,B_COMMAND    ; SRCEN | PATDSEL? + BCOMPEN | ity=S


+                ;;   move.w  $80001.l,d5     ; INTENTIONALLY UNALIGNED
+                ;; (Skipping the actual misaligned access for now -
+                ;; vasm refuses with "odd address" warnings on some
+                ;; setups.  Treat this test as a placeholder gating
+                ;; that the vector-3 install-and-restore path doesn't
+                ;; itself crash.)
+                move.b  #2,d6                   ; flag = 2 = "after"


+; **Currently a placeholder** -- the actual program-build is fiddly
+; (DSP movei + imacn + resmac sequence with proper register
+; addressing).  This test today just runs a NOP and PASSes; the real
+; MAC math will land in a follow-up once the simpler DSP tests are
+; debugged.


+                move.l  #$00010004,B_COUNT
+                move.l  #$01C00021,B_COMMAND    ; SRCEN | DSTEN | LFU=$E (S|D)


+                move.l  #$00010004,B_COUNT
+                move.l  #$01000021,B_COMMAND    ; SRCEN | DSTEN | LFU=$8 (S&D)


+                move.w  #$1234,JPIT1
+                move.w  #$5678,JPIT2


JoeMatt changed the title ~~Update from feature/acid-test-roms~~ acid: scaffold synthetic-ROM test toolkit May 2, 2026

JoeMatt self-assigned this May 2, 2026

JoeMatt and others added 2 commits May 2, 2026 17:10

JoeMatt force-pushed the feature/acid-test-roms branch from f235da5 to 1950680 Compare May 2, 2026 21:45

JoeMatt mentioned this pull request May 2, 2026

Doom: game logic / demos run 1.5-2x too fast (audio fine) #131

Open

JoeMatt and others added 2 commits May 2, 2026 18:49

JoeMatt requested a review from Copilot May 2, 2026 23:27

JoeMatt marked this pull request as ready for review May 2, 2026 23:27

JoeMatt self-requested a review as a code owner May 2, 2026 23:27

Copilot started reviewing on behalf of JoeMatt May 2, 2026 23:27 View session

JoeMatt approved these changes May 2, 2026

View reviewed changes

Copilot AI reviewed May 2, 2026

View reviewed changes

JoeMatt and others added 4 commits May 2, 2026 19:40

JoeMatt requested a review from Copilot May 3, 2026 00:44

JoeMatt added 📖 documentation tests test harnesses, regression baselines labels May 3, 2026

Copilot started reviewing on behalf of JoeMatt May 3, 2026 00:45 View session

Copilot AI reviewed May 3, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

acid: scaffold synthetic-ROM test toolkit#130

acid: scaffold synthetic-ROM test toolkit#130
github-actions[bot] wants to merge 8 commits intodevelopfrom
feature/acid-test-roms

github-actions Bot commented May 2, 2026 •

edited by JoeMatt

Loading

Uh oh!

github-actions Bot commented May 2, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 2, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented May 2, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

		;; SRCEN \| DSTEN \| DCOMPEN \| LFU=S
		move.l #$0001C121,B_COMMAND

		move.l #$00010004,B_COUNT
		move.l #$00C00021,B_COMMAND ; SRCEN \| DSTEN \| LFU=$6 (S^D)

		JPIT1 equ $F10036 ; timer 1 prescaler
		JPIT2 equ $F10038 ; timer 1 divider

		move.l #$00010004,B_COUNT
		move.l #$01C00021,B_COMMAND ; SRCEN \| DSTEN \| LFU=$E (S\|D)

		move.l #$00010004,B_COUNT
		move.l #$01000021,B_COMMAND ; SRCEN \| DSTEN \| LFU=$8 (S&D)

Conversation

github-actions Bot commented May 2, 2026 • edited by JoeMatt Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What ships

Status

Known follow-ups

Test plan

Uh oh!

github-actions Bot commented May 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Regression: macos-arm64

Regression Test Results

Uh oh!

github-actions Bot commented May 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Regression: linux-arm64

Regression Test Results

Uh oh!

github-actions Bot commented May 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Regression: linux-x64

Regression Test Results

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

github-actions Bot commented May 2, 2026 •

edited by JoeMatt

Loading

github-actions Bot commented May 2, 2026 •

edited

Loading

Regression: `macos-arm64`

github-actions Bot commented May 2, 2026 •

edited

Loading

Regression: `linux-arm64`

github-actions Bot commented May 2, 2026 •

edited

Loading

Regression: `linux-x64`