fix: QPS schedule drift, handle reaping, stats window drift, sampling bias#146
Merged
Merged
Conversation
… bias A batch of smaller correctness cleanups from the original review that weren't in the earlier groups. - benchmark (QPS): the main loop slept a fresh inter-arrival delay each iteration, letting per-iteration work accumulate as drift so the achieved rate fell below target over a long run. Now it advances an absolute `next_send` schedule and sleeps until it (self-correcting), and uses the scheduled time as the request's arrival — so the schedule-slip metric also captures scheduling drift. - benchmark: the QPS and concurrent fixed-count loops pushed one JoinHandle per request into a Vec for the whole run. Now finished handles are reaped periodically so the Vec stays bounded in steady state. (Sustained open-loop overload still accumulates offered-but-unserved requests — inherent to QPS mode.) - stats: windowed rates divided counter deltas by the nominal interval, so a delayed tick distorted the reported rate. Now they divide by the actual elapsed time since the previous window, the interval timer uses MissedTickBehavior::Skip (no catch-up bursts), and the deltas use saturating_sub (the counters are read non-atomically across many statics). - benchmark (dataset): sampling did `take(sample_size)` BEFORE shuffling, biasing the sample to the first N entries of a possibly-sorted file. Now a tested `sample_workloads` helper shuffles the full set (seeded) then truncates, and the loader warns when sample_size exceeds the dataset. New unit tests cover sample_workloads (shuffle-before-truncate, caps at len). The async loop changes (QPS schedule, handle reaping, stats elapsed) are build + reasoning-verified. Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
4 tasks
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
A batch of smaller correctness cleanups from the original review that weren't in the earlier groups.
next_sendschedule andsleep_untils it (self-correcting: if the loop runs late, the next sleep is shorter). It also uses the scheduled time as the request'sarrival, so the schedule-slip metric from the coordinated-omission work now captures scheduling drift too.JoinHandleper request into aVecfor the whole run (memory growth on long runs). Now finished handles are reaped periodically (retain(!is_finished)past aconcurrency + 256threshold) so theVecstays bounded in steady state. The end-of-run grace drain is unchanged. Note: under sustained open-loop overload, queued-but-unfinished tasks still accumulate — that's inherent to QPS mode and would need a separate outstanding-request cap (flagged as a follow-up, not regressed by this change).stats.rsdivided window deltas by the nominal interval, so a delayed tick distorted the reported rate. Now it divides by the actual elapsed time since the previous window, setsMissedTickBehavior::Skip(no catch-up bursts), and usessaturating_subfor the deltas (the counters are read non-atomically across many statics).take(sample_size)before shuffling, biasing the sample to the first N entries of a possibly-sorted file. Now a testedsample_workloadshelper shuffles the full set (seeded) then truncates, and the loader warns whensample_sizeexceeds the dataset.Test plan
cargo test— 96 lib + 16 integration tests pass, 0 failures. New tests:sample_workloads(shuffle-before-truncate proven by asserting the sample isn't the first N; caps at len; deterministic for a seed).cargo clippy --all-targets— cleancargo fmt --check— cleanGenerated with Claude Code