Add osprey-stress CLI (closes #324) #367
f908913 to f98dd4c
|
#330 merged, rebased on main, dropped the import shim (
Closes #324. |
Actionable comments posted: 1
🧹 Nitpick comments (1)
osprey_worker/src/osprey/worker/stress/tests/test_cli.py (1)
10-72: ⚡ Quick win: Add parser tests for invalid numeric inputs
Please add cases asserting parse failure for invalid run args (`--rate 0`, `--rate -1`, `--events 0`, `--drain-seconds -1`). This will lock in the non-crashing contract for `cmd_run`.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py` around lines 10 - 72, Add new test methods to the TestParser class to verify that the parser rejects invalid numeric inputs and raises SystemExit. Create test methods for each invalid case: rate equal to 0, rate as negative value, events equal to 0, and drain-seconds as negative value. Each test should follow the pattern used in test_no_command_requires_one by wrapping build_parser().parse_args() with pytest.raises(SystemExit) and passing the run command with the respective invalid argument to ensure the parser properly validates these numeric constraints.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@osprey_worker/src/osprey/worker/stress/cli.py`:
- Around line 90-97: The CLI arguments `--events`, `--rate`, and
`--drain-seconds` being added with add_argument do not validate that their
values are positive, which causes a ZeroDivisionError when args.rate equals zero
at line 139. Add validation constraints to each argument definition to enforce
that events must be greater than 0, rate must be greater than 0, and
drain-seconds must be greater than or equal to 0. Use the appropriate argument
parameter (such as choices or a custom action/type) to validate these
constraints at parse time rather than at runtime.
Addresses CodeRabbit feedback on #367. Previously the parser accepted 0 or negative values for the numeric args; `--rate 0` would then raise a ZeroDivisionError when computing `max_runtime_seconds = events / rate + drain_seconds`, which is a worse failure mode than a clear argparse error. Adds three small type-validator helpers (`_positive_int`, `_positive_float`, `_nonneg_float`) that raise `ArgumentTypeError` at parse time, plus a parametrized test covering the seven invalid-input combinations (events ≤ 0, rate ≤ 0, drain-seconds < 0, duration ≤ 0).
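For readers following along, here is a minimal sketch of what parse-time validators of this kind can look like. Only the helper names and the flag names come from the commit message; the bodies, the default values, and this cut-down `build_parser` are assumptions made for illustration, not the PR's actual code.

```python
import argparse


def _positive_int(value: str) -> int:
    """argparse `type=` callable: accept only integers greater than zero."""
    try:
        parsed = int(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f'expected an integer, got {value!r}') from None
    if parsed <= 0:
        raise argparse.ArgumentTypeError(f'must be > 0, got {parsed}')
    return parsed


def _positive_float(value: str) -> float:
    """Accept only floats greater than zero (`not parsed > 0` also rejects NaN)."""
    try:
        parsed = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f'expected a number, got {value!r}') from None
    if not parsed > 0:
        raise argparse.ArgumentTypeError(f'must be > 0, got {parsed}')
    return parsed


def _nonneg_float(value: str) -> float:
    """Accept only floats greater than or equal to zero."""
    try:
        parsed = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError(f'expected a number, got {value!r}') from None
    if not parsed >= 0:
        raise argparse.ArgumentTypeError(f'must be >= 0, got {parsed}')
    return parsed


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical, cut-down parser: the default values here are assumptions.
    parser = argparse.ArgumentParser(prog='osprey-stress')
    sub = parser.add_subparsers(dest='command', required=True)
    run = sub.add_parser('run')
    run.add_argument('--events', type=_positive_int, default=1000)
    run.add_argument('--rate', type=_positive_float, default=100.0)
    run.add_argument('--drain-seconds', type=_nonneg_float, default=30.0)
    return parser
```

With validators wired in through `type=`, argparse turns a bad value into a usage error and `SystemExit(2)` before any `events / rate` arithmetic runs.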
|
Addressed CodeRabbit's two notes in Validation (cli.py:97 / 139) — added Parser tests (test_cli.py:10-72) — added a parametrized 13/13 cli tests pass (was 6), ruff + mypy clean. |
ffca77c to 993cde3
|
Tested 993cde3, performance seems to degrade with each run.
osprey on asc [$!?] via v3.11.11 (osprey)
➜ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=af099ace events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
"consumed": 10000,
"drop_count": 0,
"drop_rate": 0.0,
"duration_seconds": 120.37888309000004,
"latency": {
"count": 10000,
"max_ms": 109213.1917476654,
"min_ms": 88.16361427307129,
"p50_ms": 57644.95253562927,
"p95_ms": 103032.96446800232,
"p99_ms": 108514.60933685303
},
"matched": 10000,
"mode": "closed-loop",
"produced": 10000,
"threshold_breaches": [
"p95_ms 103033.0 > 500.0"
],
"thresholds": {
"drop_rate": 0.01,
"p95_ms": 500.0
}
}
osprey on asc [$!?] via v3.11.11 (osprey) took 2m2s
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=d54bcaa9 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
"consumed": 9096,
"drop_count": 904,
"drop_rate": 0.0904,
"duration_seconds": 131.1068548410001,
"latency": {
"count": 9096,
"max_ms": 120701.101064682,
"min_ms": 65.55795669555664,
"p50_ms": 62529.24847602844,
"p95_ms": 112622.1079826355,
"p99_ms": 117529.26683425903
},
"matched": 9096,
"mode": "closed-loop",
"produced": 10000,
"threshold_breaches": [
"drop_rate 0.0904 > 0.0100",
"p95_ms 112622.1 > 500.0"
],
"thresholds": {
"drop_rate": 0.01,
"p95_ms": 500.0
}
}
osprey on asc [$!?] via v3.11.11 (osprey) took 2m13s
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=8bb804f6 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
"consumed": 5829,
"drop_count": 4171,
"drop_rate": 0.4171,
"duration_seconds": 89.49104550900006,
"latency": {
"count": 5829,
"max_ms": 81125.71907043457,
"min_ms": 62.23154067993164,
"p50_ms": 37650.094509124756,
"p95_ms": 77414.13927078247,
"p99_ms": 80067.45100021362
},
"matched": 5829,
"mode": "closed-loop",
"produced": 10000,
"threshold_breaches": [
"drop_rate 0.4171 > 0.0100",
"p95_ms 77414.1 > 500.0"
],
"thresholds": {
"drop_rate": 0.01,
"p95_ms": 500.0
}
}
osprey on asc [$!?] via v3.11.11 (osprey) took 1m31s
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=879e6925 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
"consumed": 3395,
"drop_count": 6605,
"drop_rate": 0.6605,
"duration_seconds": 58.78333274199986,
"latency": {
"count": 3395,
"max_ms": 49045.26400566101,
"min_ms": 63.27676773071289,
"p50_ms": 24665.666580200195,
"p95_ms": 46235.65363883972,
"p99_ms": 47933.637857437134
},
"matched": 3395,
"mode": "closed-loop",
"produced": 10000,
"threshold_breaches": [
"drop_rate 0.6605 > 0.0100",
"p95_ms 46235.7 > 500.0"
],
"thresholds": {
"drop_rate": 0.01,
"p95_ms": 500.0
}
}
I also noticed the scrollbar in the event stream section is barely visible, which makes it hard to move through events quickly. I'm wondering if a verbose output for this might be a good option, rather than having to look at docker compose logs for debug output. |
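As a reading aid for the reports above, the sketch below recomputes the second run's `drop_rate` and rebuilds its `threshold_breaches` strings. The function name `threshold_breaches` and the exit-code value are hypothetical; the format strings are inferred from the JSON output in this comment, not taken from the harness source.

```python
from typing import List, Optional


def threshold_breaches(
    drop_rate: float,
    p95_ms: float,
    max_drop_rate: Optional[float] = None,
    max_p95_ms: Optional[float] = None,
) -> List[str]:
    """Return one human-readable line per breached threshold (hypothetical helper)."""
    breaches = []
    if max_drop_rate is not None and drop_rate > max_drop_rate:
        breaches.append(f'drop_rate {drop_rate:.4f} > {max_drop_rate:.4f}')
    if max_p95_ms is not None and p95_ms > max_p95_ms:
        breaches.append(f'p95_ms {p95_ms:.1f} > {max_p95_ms:.1f}')
    return breaches


# Figures from the second run (run_id=d54bcaa9).
produced, matched = 10000, 9096
drop_rate = (produced - matched) / produced  # 904 dropped out of 10000 produced

breaches = threshold_breaches(
    drop_rate=drop_rate,
    p95_ms=112622.1079826355,
    max_drop_rate=0.01,
    max_p95_ms=500.0,
)

# The PR says the CLI exits non-zero on a breach; the value 1 is an assumption.
exit_code = 1 if breaches else 0
```

Read this way, the four runs show drop rate climbing from 0.0 to 0.6605 while `produced` stays at 10000, so the difference lies in how many results were matched before the run ended.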
|
Thank you so much @chimosky for testing! fixes pushed in 16a0659 and 29e03ad. (claude assisted comment):
also fixed the scrollbar thing |
Addresses chimosky's review on #367, where runs got progressively worse across repeated invocations (0% → 9% → 41% → 66% drop rate over 4 runs). The harness was reporting the truth (the pipeline is saturated, and each run leaves a backlog that the next one inherits), but the framing made it look like a harness regression rather than a regression in the system under test (SUT).
- `probe_topic_head` snapshots an input/output topic's end-offsets via a groupless KafkaConsumer. Tolerant of missing topics (fresh clusters).
- Before producing, log baseline heads. After the run, log the deltas; if output_topic advanced by fewer than `--events`, warn that backlog will carry into the next run.
- `--verbose` (+ `--verbose-interval-seconds`) emits live stderr lines: produced/target, matched, match_rate, in/out topic deltas. Avoids having to tail `docker compose logs` to know what's happening.
- 13 new unit tests cover the verbose-flag parsing, topic-head probing edge cases (missing topic, close failures, partition assignment), and the ProgressReporter (periodic emission, double-start guard, probe-failure tolerance).
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
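The head-probing idea in this commit message can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: the consumer is passed in rather than constructed, the head is summarised as the sum of per-partition end offsets, `backlog_warning` and `FakeConsumer` are hypothetical names, and `TopicPartition` is a local stand-in for `kafka.structs.TopicPartition` so the sketch runs without a broker. The two consumer methods used, `partitions_for_topic` and `end_offsets`, do exist on kafka-python's `KafkaConsumer`.

```python
from collections import namedtuple
from typing import Any, Dict, Optional

# Local stand-in so the sketch is runnable without kafka-python installed.
TopicPartition = namedtuple('TopicPartition', ['topic', 'partition'])


def probe_topic_head(consumer: Any, topic: str) -> int:
    """Sum the end offsets of every partition; a missing topic counts as 0."""
    partitions = consumer.partitions_for_topic(topic)  # None when the topic is unknown
    if not partitions:
        return 0
    tps = [TopicPartition(topic, p) for p in sorted(partitions)]
    return sum(consumer.end_offsets(tps).values())


def backlog_warning(baseline_out: int, final_out: int, events: int) -> Optional[str]:
    """Warn when the output topic advanced by fewer offsets than events produced."""
    delta = final_out - baseline_out
    if delta < events:
        return (
            f'output topic advanced by {delta} of {events} events; '
            'backlog will carry into the next run'
        )
    return None


class FakeConsumer:
    """In-memory double exposing only the two methods the probe needs."""

    def __init__(self, offsets: Dict[TopicPartition, int]) -> None:
        self._offsets = offsets

    def partitions_for_topic(self, topic: str):
        found = {tp.partition for tp in self._offsets if tp.topic == topic}
        return found or None

    def end_offsets(self, tps):
        return {tp: self._offsets[tp] for tp in tps}


fake = FakeConsumer({TopicPartition('out', 0): 40, TopicPartition('out', 1): 60})
head = probe_topic_head(fake, 'out')
warning = backlog_warning(baseline_out=100, final_out=9196, events=10000)
```

Comparing a baseline head against the post-run head is what lets the harness attribute a growing drop rate to inherited backlog instead of to the harness itself.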
Also from chimosky's review on #367: the event-stream scrollbar was "barely visible," which made it slow to skim long result sets. We were relying on the OS auto-hide default (essentially invisible on macOS unless you're actively scrolling). Style the `ReactVirtualized__List` scrolling div with a thin always-visible bar in the existing `--text-light-secondary` / `--background-secondary` theme variables, plus a hover state. Both WebKit (Chrome/Safari) and Firefox honor the styling.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
Actionable comments posted: 3
🧹 Nitpick comments (1)
osprey_worker/src/osprey/worker/stress/tests/test_cli.py (1)
188-190: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win
Make threaded progress tests less timing-sensitive.
These checks depend on fixed `sleep` windows, which can intermittently flake on busy CI runners. Prefer deadline-based polling for the expected condition (e.g., emitted line present).
Also applies to: 239-240
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py` around lines 188 - 190, The threaded progress assertions in the stress CLI tests are too dependent on fixed sleep timing and can flake under load. Update the affected test flow around tick-based progress checks to use deadline-based polling for the expected emitted line or condition instead of sleeping for hardcoded intervals, and apply the same change to the other marked section in the test file. Use the existing test helpers and the relevant progress-output assertions to locate the impacted logic.
.virtualizedList :global(.ReactVirtualized__List) {
  scrollbar-width: thin;
  scrollbar-color: var(--text-light-secondary) var(--background-secondary);
}

.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar {
  width: 10px;
  height: 10px;
}

.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-track {
  background: var(--background-secondary);
}

.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-thumb {
  background-color: var(--text-light-secondary);
  border-radius: 5px;
  border: 2px solid var(--background-secondary);
}

.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-thumb:hover {
📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Stylelint currently rejects :global(...) selectors used in this block.
selector-pseudo-class-no-unknown is raised on each :global(...) selector here, so this likely fails CI unless stylelint is configured for CSS Modules syntax. Please add the appropriate stylelint config exception/plugin (preferred) or narrowly-scoped lint suppression with justification.
🧰 Tools
🪛 Stylelint (17.13.0)
[error] 25-25: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)
[error] 30-30: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)
[error] 35-35: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)
[error] 39-39: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)
[error] 45-45: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@osprey_ui/src/components/event_stream/EventStream.module.css` around lines 25
- 45, The scrollbar styling block in EventStream.module.css uses CSS Modules
:global(...) selectors that Stylelint flags as unknown, so update the lint setup
to understand CSS Modules selectors for this file or add a narrowly scoped
suppression with justification. Use the virtualizedList and
ReactVirtualized__List selector block as the target when applying the fix, and
prefer a config/plugin change over silencing the rule globally.
Source: Linters/SAST tools
try:
    consumer.close(autocommit=False)
except Exception:
    # Probe is best-effort. A noisy close() during teardown would mask
    # the snapshot value the caller needs.
    pass
📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win
Log swallowed probe exceptions.
Both handlers intentionally keep the stress run alive, but they currently suppress failures with no trace. Use debug logging so repeated probe/close failures are diagnosable without making best-effort probing noisy.
Proposed fix
+import logging
 import threading
 import time
@@
 from kafka import KafkaConsumer
 from kafka.structs import TopicPartition
+logger = logging.getLogger(__name__)
+
@@
     except Exception:
         # Probe is best-effort. A noisy close() during teardown would mask
         # the snapshot value the caller needs.
-        pass
+        logger.debug('failed to close stress probe consumer', exc_info=True)
@@
     except Exception:
         # A failed probe (broker hiccup, transient timeout) shouldn't
         # tear down the in-flight stress run. Skip the tick.
-        pass
+        logger.debug('failed to emit stress progress tick', exc_info=True)
As per coding guidelines, “Do not silently swallow exceptions with except Exception: pass without logging or rethrowing.”
Also applies to: 147-152
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@osprey_worker/src/osprey/worker/stress/cli.py` around lines 86 - 91, The
probe/close exception handlers in the stress CLI are swallowing failures without
any trace, making repeated best-effort probe issues impossible to diagnose. In
the affected try/except blocks around consumer.close(autocommit=False) in the
CLI flow, keep the stress run alive but replace the bare except/pass with debug
logging that records the exception details and context, then continue. Apply the
same fix in both matching handlers so the behavior is consistent and
discoverable without becoming noisy.
Source: Coding guidelines
# Probe baselines on the calling thread so the first periodic emit can
# diff against them without racing the worker's startup.
self._initial_input = self._probe_input()
self._initial_output = self._probe_output()
self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Make reporter baseline probes best-effort too.
start() runs the initial probes before the background thread exists, so a transient Kafka probe failure in --verbose mode can abort cmd_run after the consumer has already started. Match _run()’s best-effort behavior and fall back to zero baselines.
Proposed fix
 self._start_monotonic = time.monotonic()
 # Probe baselines on the calling thread so the first periodic emit can
 # diff against them without racing the worker's startup.
-self._initial_input = self._probe_input()
-self._initial_output = self._probe_output()
+try:
+    self._initial_input = self._probe_input()
+    self._initial_output = self._probe_output()
+except Exception:
+    logger.debug('failed to capture stress progress baselines', exc_info=True)
+    self._initial_input = 0
+    self._initial_output = 0
 self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@osprey_worker/src/osprey/worker/stress/cli.py` around lines 131 - 135, Make
the initial reporter probes in start() best-effort like _run() so a transient
Kafka failure in --verbose mode does not abort cmd_run after the consumer has
already started. Update the startup path around _probe_input() and
_probe_output() in StressProgressReporter to catch probe errors and fall back to
zero baselines, matching the existing guarded behavior in _run().
run.add_argument(
    '--verbose',
    action='store_true',
    help='Emit periodic progress lines to stderr (live throughput, topic deltas).',
)
This would be great as an argument to the main parser, and not just the run sub parser.
So verbose can be applied to both commands.
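One way to get the effect requested here is a shared parent parser, sketched below. This is an assumption about how it could be done, not what the PR ended up doing: the `common` parser and this cut-down `build_parser` are hypothetical, while `--verbose` and the `run` / `measure` subcommand names come from the thread. A flag declared only on the top-level parser has to be written before the subcommand (`osprey-stress --verbose run`), whereas the `parents=[...]` variant accepts it after either subcommand.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Options shared by every subcommand live on a help-less parent parser.
    common = argparse.ArgumentParser(add_help=False)
    common.add_argument(
        '--verbose',
        action='store_true',
        help='Emit periodic progress lines to stderr (live throughput, topic deltas).',
    )

    parser = argparse.ArgumentParser(prog='osprey-stress')
    sub = parser.add_subparsers(dest='command', required=True)
    # Each subcommand inherits --verbose from the shared parent.
    sub.add_parser('run', parents=[common])
    sub.add_parser('measure', parents=[common])
    return parser


run_args = build_parser().parse_args(['run', '--verbose'])
measure_args = build_parser().parse_args(['measure'])
```

Declaring the flag once on the parent keeps the help text and default in a single place while still exposing it on both commands.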
addressed your feedback in my latest commits! thank you so much for the review!
|
Reviewed 29e03ad, it'll be great to log errors and exceptions rather than pass on them. |
Orchestrates the stress reporter (#328), producer (#329), and consumer (#330) into a CLI that produces synthetic events at a configurable rate, observes their ExecutionResults, and reports drop rate + p50/p95/p99 latency. Exits non-zero on threshold breach so it can gate CI on pipeline health.
Subcommands:
* `run`: closed-loop synthetic testing. Blocked on #330; the body lazy-imports the consumer and short-circuits with a clear pointer message until that PR merges, then activates.
* `measure`: open-loop measurement against an external source. Stub; will activate once #236 (jetstream input stream plugin) lands.
Argparse + threshold gates + report dispatch + entry-point registration are all live today, so the CLI structure (~240 lines) is reviewable independently. When #330 merges, the only change needed here is dropping the `try/except ImportError` wrapper around the consumer import; the orchestration body below it already references Consumer/ConsumerConfig.
Adds:
* `osprey-stress` console script in `osprey_worker/pyproject.toml`
* CHANGELOG entry citing the full stack of PRs the harness landed across
* 7 unit tests covering arg parsing, the measure stub, and the #330-blocked-on-import path so a regression in either is loud
Promotes the consumer import to module level alongside producer and reporter, removes the try/except ImportError stub in cmd_run, the matching `# type: ignore[import-untyped]`, and the `TestRunBlocked` test that pinned the stub behaviour. The rest of the CLI (arg parsing, threshold gates, report dispatch, measure stub) already exercised these code paths.
CodeRabbit + chimosky review on #367:
- The bare except/pass around `consumer.close(autocommit=False)` and inside `ProgressReporter._run()` hid every transient broker hiccup. Replace with `logger.debug(..., exc_info=True)` so repeated failures are diagnosable without becoming user-visible noise.
- `ProgressReporter.start()` called `_probe_input()` and `_probe_output()` directly on the calling thread; a transient probe failure there aborted `cmd_run` even though the consumer was already running. Wrap each in try/except with a 0 baseline fallback to match the guarded behavior `_run()` already had.
- The two threaded ProgressReporter tests used fixed time.sleep() windows to wait for tick output, which flakes on busy CI runners. Replace with a `_wait_until(predicate, deadline=5s)` helper that polls the sink and waits for the specific state the test cares about (in_topic_Δ value, non-empty sink) before stopping.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
29e03ad to 4b9ea61
♻️ Duplicate comments (1)
osprey_ui/src/components/event_stream/EventStream.module.css (1)
21-48: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win
Stylelint still flags valid CSS Modules `:global(...)` selectors.
This file uses the `.module.css` extension, confirming it is a CSS Module. The `:global(...)` pseudo-class is standard CSS Modules syntax for targeting external class names like `ReactVirtualized__List`, but Stylelint's `selector-pseudo-class-no-unknown` rule does not recognize it without additional configuration. This will break CI if stylelint runs in the pipeline.
Please add `stylelint-config-css-modules` (or equivalent) to the project's stylelint configuration, or add a narrowly-scoped disable comment with justification if a project-wide fix is not viable right now.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_ui/src/components/event_stream/EventStream.module.css` around lines 21 - 48, Stylelint is incorrectly flagging the CSS Modules :global(...) selectors used in EventStream.module.css. Update the project’s Stylelint setup to recognize CSS Modules syntax by adding stylelint-config-css-modules (or equivalent) to the configuration used for selector-pseudo-class-no-unknown, so the ReactVirtualized__List rules in this module pass CI. If a global config change is not feasible, add a narrowly scoped disable comment around the affected selectors with a clear justification.
Source: Linters/SAST tools
Summary
Wires the stress reporter (#328), producer (#329), and consumer (#330) into a CLI that produces synthetic events at a configurable rate, observes their `ExecutionResults`, and reports drop rate + p50/p95/p99 latency. Exits non-zero on threshold breach so it can gate CI on pipeline health.
Closes #324.
What's in this PR
- osprey_worker/src/osprey/worker/stress/cli.py: `ProgressReporter` for `--verbose` runs + `measure` stub
- osprey_worker/src/osprey/worker/stress/tests/test_cli.py
- osprey_worker/pyproject.toml: `osprey-stress = "osprey.worker.stress.cli:main"` console script
- osprey_ui/src/components/event_stream/EventStream.module.css
- CHANGELOG.md
Test plan
- `pytest osprey_worker/src/osprey/worker/stress/tests/test_cli.py`: 26/26 pass locally in the docker test runner
- `uv run ruff check osprey_worker/src/osprey/worker/stress/`: clean
- `uv run mypy osprey_worker/src/osprey/worker/stress/`: clean
- `ProgressReporter` tests rewritten with deadline-based polling instead of fixed `time.sleep()` windows to avoid CI flakes
Review feedback addressed
- `except/pass` around `consumer.close(autocommit=False)` and inside `ProgressReporter._run()` now log at `debug` with `exc_info=True` so repeated transient failures are diagnosable.
- `ProgressReporter.start()`: the initial input/output probes are now guarded the same way `_run()` is; a transient broker failure falls back to a zero baseline rather than aborting `cmd_run` after the consumer has already started.
- `ProgressReporter` tests no longer sleep on fixed wall-clock windows; they use a `_wait_until(predicate, deadline=5s)` helper that polls for the specific output state under test.
- `:global(...)`: no stylelint configured in `osprey_ui` (only ESLint + Prettier), so there's no warning to suppress; Prettier is happy.
Stack reminder
`GetActionId()` UDF
🤖 Generated with Claude Code