fix: QPS schedule drift, handle reaping, stats window drift, sampling bias#146

Merged
brayniac merged 1 commit into
iopsystems:mainfrom
brayniac:fix/review-cleanup-batch
Jun 12, 2026
@brayniac

Summary

A batch of smaller correctness cleanups from the original review that weren't in the earlier groups.

  • QPS schedule drift (release: v0.1.5 #10). The QPS main loop slept a fresh inter-arrival delay each iteration, so per-iteration work accumulated as drift and the achieved rate fell below target over a long run. Now it advances an absolute next_send schedule and calls sleep_until on it (self-correcting: if the loop runs late, the next sleep is shorter). It also uses the scheduled time as the request's arrival, so the schedule-slip metric from the coordinated-omission work now captures scheduling drift too.
  • JoinHandle accumulation. The QPS and concurrent fixed-count loops pushed one JoinHandle per request into a Vec for the whole run (memory growth on long runs). Now finished handles are reaped periodically (retain(!is_finished) past a concurrency + 256 threshold) so the Vec stays bounded in steady state. The end-of-run grace drain is unchanged. Note: under sustained open-loop overload, queued-but-unfinished tasks still accumulate — that's inherent to QPS mode and would need a separate outstanding-request cap (flagged as a follow-up, not regressed by this change).
  • Windowed-stats interval drift. stats.rs divided window deltas by the nominal interval, so a delayed tick distorted the reported rate. Now it divides by the actual elapsed time since the previous window, sets MissedTickBehavior::Skip (no catch-up bursts), and uses saturating_sub for the deltas (the counters are read non-atomically across many statics).
  • Dataset sampling bias. Sampling did take(sample_size) before shuffling, biasing the sample to the first N entries of a possibly-sorted file. Now a tested sample_workloads helper shuffles the full set (seeded) then truncates, and the loader warns when sample_size exceeds the dataset.
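The absolute-schedule idea from the first bullet can be sketched with the standard library alone. This is a minimal illustration, not the code from this PR: the `Pacer` type and its methods are hypothetical names, and the real loop presumably passes the deadline to an async `sleep_until` rather than returning a duration.

```rust
use std::time::{Duration, Instant};

/// Hypothetical pacer that keeps an absolute send schedule.
/// (Name and shape are assumptions made for illustration.)
pub struct Pacer {
    interval: Duration,
    next_send: Instant,
}

impl Pacer {
    /// `qps` must be non-zero; the first request is scheduled at `start`.
    pub fn new(start: Instant, qps: u32) -> Self {
        assert!(qps > 0, "qps must be non-zero");
        Pacer {
            interval: Duration::from_secs(1) / qps,
            next_send: start,
        }
    }

    /// Returns the scheduled arrival time of the next request and how long
    /// to sleep from `now` to reach it, then advances the schedule by one
    /// fixed interval. Because the schedule is absolute, time spent on
    /// per-iteration work shortens the next sleep instead of adding drift.
    pub fn next(&mut self, now: Instant) -> (Instant, Duration) {
        let scheduled = self.next_send;
        self.next_send += self.interval;
        // Saturates to zero when the loop is already behind schedule.
        (scheduled, scheduled.saturating_duration_since(now))
    }
}
```

With a 100 QPS target the interval is 10 ms. If the loop reaches the second request 4 ms after the start, the sleep is 6 ms, and the scheduled time (not the actual send time) is what would be recorded as the request's arrival.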

Test plan

  • cargo test — 96 lib + 16 integration tests pass, 0 failures. New tests: sample_workloads (shuffle-before-truncate proven by asserting the sample isn't the first N; caps at len; deterministic for a seed).
  • cargo clippy --all-targets — clean
  • cargo fmt --check — clean
  • Independent review of the async-path changes (QPS schedule self-correction, handle reaping, stats elapsed tracking). It confirmed that the schedule and stats tracking are correct and that the burst-after-stall behavior is unchanged vs the old code. Its one substantive point, that the handle reap doesn't bound the Vec under sustained overload, is inherent to open-loop QPS and is now documented in-code + above.
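The handle-reaping behavior discussed above can be illustrated with a small sketch. It assumes `std::thread::JoinHandle` as a stand-in for the async runtime's handle type, and `reap_finished` with its `slack` parameter is a hypothetical helper (the PR describes a `concurrency + 256` threshold).

```rust
use std::thread::JoinHandle;

/// Drops handles whose work has finished, but only once the Vec has grown
/// past `concurrency + slack`. Returns how many handles were removed.
/// Hypothetical helper: results of reaped tasks are discarded here.
pub fn reap_finished(
    handles: &mut Vec<JoinHandle<()>>,
    concurrency: usize,
    slack: usize,
) -> usize {
    if handles.len() <= concurrency + slack {
        return 0; // below the threshold: skip the O(n) scan
    }
    let before = handles.len();
    // Keep only handles that are still running.
    handles.retain(|h| !h.is_finished());
    before - handles.len()
}
```

Unfinished handles are always retained, which is why this bounds the Vec only in steady state: if requests are offered faster than they complete, the unfinished set itself keeps growing.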

Generated with Claude Code

… bias

A batch of smaller correctness cleanups from the original review that weren't in
the earlier groups.

- benchmark (QPS): the main loop slept a fresh inter-arrival delay each iteration,
  letting per-iteration work accumulate as drift so the achieved rate fell below
  target over a long run. Now it advances an absolute `next_send` schedule and
  sleeps until it (self-correcting), and uses the scheduled time as the request's
  arrival — so the schedule-slip metric also captures scheduling drift.
- benchmark: the QPS and concurrent fixed-count loops pushed one JoinHandle per
  request into a Vec for the whole run. Now finished handles are reaped
  periodically so the Vec stays bounded in steady state. (Sustained open-loop
  overload still accumulates offered-but-unserved requests — inherent to QPS mode.)
- stats: windowed rates divided counter deltas by the nominal interval, so a
  delayed tick distorted the reported rate. Now they divide by the actual elapsed
  time since the previous window, the interval timer uses MissedTickBehavior::Skip
  (no catch-up bursts), and the deltas use saturating_sub (the counters are read
  non-atomically across many statics).
- benchmark (dataset): sampling did `take(sample_size)` BEFORE shuffling, biasing
  the sample to the first N entries of a possibly-sorted file. Now a tested
  `sample_workloads` helper shuffles the full set (seeded) then truncates, and the
  loader warns when sample_size exceeds the dataset.
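As a rough illustration of the stats change in the list above, the sketch below divides a counter delta by the measured elapsed time rather than a nominal interval. `WindowState` and its fields are hypothetical names, and the timer configuration (`MissedTickBehavior::Skip`) is outside the scope of this sketch.

```rust
use std::time::Instant;

/// Hypothetical per-counter window state (names are assumptions).
pub struct WindowState {
    prev_count: u64,
    prev_at: Instant,
}

impl WindowState {
    pub fn new(count: u64, at: Instant) -> Self {
        WindowState { prev_count: count, prev_at: at }
    }

    /// Returns the per-second rate since the previous window.
    pub fn rate(&mut self, count: u64, now: Instant) -> f64 {
        // saturating_sub guards against a reading that appears to go
        // backwards when counters are sampled without a consistent snapshot.
        let delta = count.saturating_sub(self.prev_count);
        // Divide by the time that actually passed, so a late tick does not
        // inflate the reported rate.
        let elapsed = now.saturating_duration_since(self.prev_at).as_secs_f64();
        self.prev_count = count;
        self.prev_at = now;
        if elapsed > 0.0 { delta as f64 / elapsed } else { 0.0 }
    }
}
```

For example, if a nominal 1 s tick fires after 2 s and the counter advanced by 200, this reports 100 per second, whereas dividing by the nominal interval would report 200.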

New unit tests cover sample_workloads (shuffle-before-truncate, caps at len). The
async loop changes (QPS schedule, handle reaping, stats elapsed) were verified by
building and by reasoning about the code.
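A shuffle-then-truncate sampler in the spirit of `sample_workloads` might look like the sketch below. The signature is an assumption, and the SplitMix64 generator is swapped in only to keep the example free of external crates; the actual helper may use a different RNG and signature.

```rust
/// SplitMix64 step, used here only as a small seeded generator.
fn splitmix64(state: &mut u64) -> u64 {
    *state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
    let mut z = *state;
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Shuffles the full set with a seeded Fisher-Yates pass, then truncates.
/// Hypothetical signature. The modulo reduction has a tiny bias that is
/// acceptable for a sketch.
pub fn sample_workloads<T>(mut items: Vec<T>, sample_size: usize, seed: u64) -> Vec<T> {
    let mut state = seed;
    for i in (1..items.len()).rev() {
        let j = (splitmix64(&mut state) % (i as u64 + 1)) as usize;
        items.swap(i, j);
    }
    // Truncating after the shuffle draws the sample from the whole dataset;
    // truncate is a no-op when sample_size exceeds the length.
    items.truncate(sample_size);
    items
}
```

The same seed always yields the same sample, and a `sample_size` larger than the dataset simply returns every entry in shuffled order.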

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
@brayniac brayniac merged commit a91bf05 into iopsystems:main Jun 12, 2026
7 checks passed