STOMP M6: load tests + lived-experience writeup #21
Merged
Adds benchmarks/stomp/ — Python STOMP harness (fanout/slow/queue/soak), shell driver with per-phase fresh broker + RSS sampling, fanout cell sweep, and pure-Gem microbenches for deep-copy fanout cost and gen_server.loop survival. Headline finding: the broker dies at exactly 175 frames per connection writer, due to the writer_loop ↔ handle_frame ↔ handle_<command> ↔ writer_loop mutual tail recursion (not direct self-recursion, so not TCO'd) overflowing the 256 KB minicoro stack. Bumps the non-tail-recursion-ceiling roadmap entry from P2 to P1 — first user hit. Working envelope: 25k MESSAGE deliveries at 50–55k msg/s, p100 first-msg latency 35–45 ms at 500 subs. Slow-consumer mailbox growth observed but masked by the recursion cliff. Queue round-robin fair to ratio 0.977 right up to the death point. Microbench shows per-send cost is ~2 µs flat across body sizes 64–1024 B at K up to 800 — fanout cost is in the table structure, not the body bytes. Reproduce: bash benchmarks/stomp/run.sh
Summary
- `benchmarks/stomp/`: Python STOMP load harness (`fanout`, `slow`, `queue`, `soak`), driver script with per-phase fresh broker and RSS sampling, a fanout cell sweep, and Gem-level microbenches isolating deep-copy fanout cost and `gen_server.loop` survival.
- `examples/stomp_broker/NOTES.md` with quantitative findings, a sketch of the backpressure design space, and notes on what the language pushed back on.
- Bumps the non-tail-recursion-ceiling entry in `docs/ROADMAP.md` from P2 → P1; this is the first user hit.

Headline finding
The STOMP broker dies at exactly 175 frames per connection writer. Cause: `connection.gem`'s `writer_loop ↔ handle_frame ↔ handle_<command> ↔ writer_loop` is mutual tail recursion across three functions. The compiler only applies TCO to direct self-recursive tail calls (per the existing `mark_process_tail` GFP pass), so each cycle eats ~1.4 KB of C stack and minicoro's 256 KB ceiling falls after ~175 cycles. A bare `gen_server` (microbench) handles 500+ casts cleanly, isolating the bug to the broker's writer loop pattern.

This dominates milestone 6: the deep-copy fanout cost and slow-consumer OOM that the tutorial flags are both masked by hitting the recursion cliff first.
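The failure mode can be illustrated outside Gem. The following is a minimal Python sketch, not the broker's code: the function names mirror the cycle named in the writeup, `handle_send` stands in for one `handle_<command>`, and CPython's recursion limit plays the role of the minicoro stack ceiling (CPython performs no tail-call optimization at all, so the analogy is loose). `run_flat` shows one possible shape that keeps the stack depth constant; it is an assumption for illustration, not the fix chosen in this PR. `cycles_before_overflow` just redoes the stack arithmetic from the figures quoted above.

```python
import sys

STACK_BYTES = 256 * 1024      # minicoro stack size quoted in the writeup
BYTES_PER_CYCLE = 1.4 * 1024  # approximate C stack per cycle, quoted in the writeup


def cycles_before_overflow(stack_bytes=STACK_BYTES, per_cycle=BYTES_PER_CYCLE):
    """Upper bound on cycles that fit if nothing else used the stack."""
    return int(stack_bytes // per_cycle)


def run_mutual(frames):
    """Three-function cycle: each frame handled leaves three stack frames live."""
    def writer_loop(i):
        if i == frames:
            return i
        return handle_frame(i)

    def handle_frame(i):
        return handle_send(i)

    def handle_send(i):
        # Tail position, but a call into a *different* function.
        return writer_loop(i + 1)

    try:
        return writer_loop(0)
    except RecursionError:
        return None  # the "cliff": the stack ran out before the frames did


def run_flat(frames):
    """Same work, but handlers return to a loop instead of calling back into it."""
    def handle_frame(i):
        return handle_send(i)

    def handle_send(i):
        return i + 1

    i = 0
    while i < frames:
        i = handle_frame(i)
    return i


if __name__ == "__main__":
    n = sys.getrecursionlimit()
    print("stack budget in cycles:", cycles_before_overflow())
    print("mutual recursion:", run_mutual(n))
    print("flat loop:", run_flat(n))
```

With the quoted figures the budget works out to about 182 cycles, slightly above the observed 175; the difference is presumably stack already in use below the writer loop, though the writeup does not say so.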
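Working envelope (below the cliff)
- 25k MESSAGE deliveries at 50–55k msg/s; p100 first-message latency 35–45 ms at 500 subscribers.
- Slow-consumer mailbox growth observed, but masked by the recursion cliff.
- Queue round-robin fair to ratio 0.977 right up to the death point.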
Microbench: deep-copy fanout cost
Per-`send` cost is ~2 µs flat across body sizes 64–1024 B at K up to 800 subscribers. The cost is the table structure, not the body bytes. Body size starts to show only at 4 KB (~3 µs), so small-message workloads pay full per-frame overhead.

Test plan
- `make test`: all examples + LSP smoke tests pass
- `bash benchmarks/stomp/run.sh` runs all four phases end-to-end
- `PHASES=fanout_small bash benchmarks/stomp/run.sh` smoke test
- `bash benchmarks/stomp/sweep.sh` reproduces the cliff
- `build/gem benchmarks/stomp/microbench_gen_server.gem` runs to completion (500 casts, no overflow)
- `build/gem benchmarks/stomp/microbench_fanout_one.gem 500 1024 200` produces clean numbers
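For readers without a Gem build, the shape of the deep-copy fanout microbench can be approximated in Python. This is a rough analogue only: the frame layout, `make_frame`, `fanout`, and `per_send_cost_us` are all hypothetical names, and the numbers it prints reflect CPython's `copy.deepcopy`, not Gem's send path, so they should not be compared with the ~2 µs figure above. The body is a `bytearray` so that the bytes are actually copied on each send.

```python
import copy
import time


def make_frame(body_bytes):
    """Hypothetical frame layout: a nested table plus a mutable body."""
    return {
        "command": "MESSAGE",
        "headers": {"destination": "/topic/bench", "subscription": "0"},
        "body": bytearray(b"x" * body_bytes),
    }


def fanout(frame, subscribers):
    """Deep-copy the frame once per subscriber, as a copying send would."""
    return [copy.deepcopy(frame) for _ in range(subscribers)]


def per_send_cost_us(subscribers, body_bytes, rounds=20):
    """Average microseconds per copied frame over `rounds` fanouts."""
    frame = make_frame(body_bytes)
    start = time.perf_counter()
    for _ in range(rounds):
        fanout(frame, subscribers)
    elapsed = time.perf_counter() - start
    return elapsed * 1e6 / (rounds * subscribers)


if __name__ == "__main__":
    # Sweep body size at a fixed subscriber count K=200.
    for size in (64, 256, 1024, 4096):
        print(f"{size:>5} B  {per_send_cost_us(200, size):.2f} us/send")
```

If the per-send figure stays roughly flat across the smaller body sizes, the copy is dominated by the fixed cost of the nested structure rather than the payload, which is the pattern the Gem microbench reports.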