Add osprey-stress CLI (closes #324) by julietshen · Pull Request #367 · roostorg/osprey

julietshen · 2026-06-17T13:28:41Z

Summary

Wires the stress reporter (#328), producer (#329), and consumer (#330) into a CLI that produces synthetic events at a configurable rate, observes their ExecutionResults, and reports drop rate + p50/p95/p99 latency. Exits non-zero on threshold breach so it can gate CI on pipeline health.

osprey-stress run \
  --events 10000 --rate 1000 \
  --threshold-drop-rate 0.01 --threshold-p95-ms 500 \
  --report json

Closes #324.

What's in this PR


`osprey_worker/src/osprey/worker/stress/cli.py`	argparse + orchestration + `ProgressReporter` for `--verbose` runs + `measure` stub
`osprey_worker/src/osprey/worker/stress/tests/test_cli.py`	26 tests: arg parsing, topic-head probes, ProgressReporter behavior under failure
`osprey_worker/pyproject.toml`	adds `osprey-stress = "osprey.worker.stress.cli:main"` console script
`osprey_ui/src/components/event_stream/EventStream.module.css`	forces a visible scrollbar on the virtualized list (surfaced during dogfooding)
`CHANGELOG.md`	new-features entry citing the four parent PRs

Test plan

pytest osprey_worker/src/osprey/worker/stress/tests/test_cli.py — 26/26 pass locally in the docker test runner
uv run ruff check osprey_worker/src/osprey/worker/stress/ — clean
uv run mypy osprey_worker/src/osprey/worker/stress/ — clean
Threaded ProgressReporter tests rewritten with deadline-based polling instead of fixed time.sleep() windows to avoid CI flakes
CI green (pending push)

Review feedback addressed

CodeRabbit + @chimosky on probe close: bare except/pass around consumer.close(autocommit=False) and inside ProgressReporter._run() now log at debug with exc_info=True so repeated transient failures are diagnosable.
CodeRabbit on ProgressReporter.start(): the initial input/output probes are now guarded the same way _run() is — a transient broker failure falls back to a zero baseline rather than aborting cmd_run after the consumer has already started.
CodeRabbit on test flakiness: the two threaded ProgressReporter tests no longer sleep on fixed wall-clock windows; they use a _wait_until(predicate, deadline=5s) helper that polls for the specific output state under test.
CodeRabbit on CSS Modules :global(...): no stylelint configured in osprey_ui (only ESLint + Prettier), so there's no warning to suppress; Prettier is happy.

Stack reminder

✅ Add GetActionId() stdlib UDF + ExecutionContext.get_action_id() #327 — GetActionId() UDF
✅ Add stress.reporter — pure stats for the upcoming stress harness #328 — reporter (pure stats)
✅ Add stress.producer — synthetic Kafka producer for the stress harness #329 — producer
✅ Add stress.consumer — Kafka result consumer for the stress harness #330 — consumer
🟢 this PR — CLI + entry point + CHANGELOG

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Added a new stress-testing CLI that can generate synthetic events at a configurable rate, track delivery progress, report latency percentiles and drop rate, and return a non-zero exit code when thresholds are exceeded.
- Added optional verbose progress updates during longer runs.
Bug Fixes
- Improved the event stream list so the scrollbar is always visible, making scrolling more discoverable and consistent across systems.

coderabbitai · 2026-06-17T13:28:51Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds the osprey-stress CLI with a run subcommand that probes Kafka topic heads, runs a filtered consumer and producer flow, reports latency and drop metrics, and exits non-zero on threshold failure. The UI event-stream list also now forces a visible scrollbar.

Changes

osprey-stress CLI

Layer / File(s)	Summary
CLI contracts and parser `osprey_worker/src/osprey/worker/stress/cli.py`	Defines the stress CLI constants, topic snapshot type, topic probing helper, progress reporter, argument validators, Kafka and threshold option wiring, parser subcommands, and bootstrap server parsing.
Run command and dispatch `osprey_worker/src/osprey/worker/stress/cli.py`	Implements `cmd_run` orchestration with pre/post topic-head probing, consumer start and drain, optional progress reporting, report computation and threshold-based exit codes, keeps `cmd_measure` as a stub, updates `main()` dispatch, and adds the module entry guard.
Tests, script entry point, and changelog `osprey_worker/src/osprey/worker/stress/tests/test_cli.py`, `osprey_worker/pyproject.toml`, `CHANGELOG.md`	Adds parser and helper tests for the stress CLI, registers the `osprey-stress` console script, and documents the new CLI in the changelog.

Event-stream scrollbar

Layer / File(s)	Summary
Scrollbar styling `osprey_ui/src/components/event_stream/EventStream.module.css`, `CHANGELOG.md`	Adds Firefox and WebKit scrollbar styling for the event-stream virtualized list and records the visible-scrollbar behavior in the changelog.

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (3 warnings)

Check name	Status	Explanation	Resolution
Linked Issues check	⚠️ Warning	The CLI covers installability, configurable rate/duration, thresholds, and latency reporting, but it does not show the full per-stage drop breakdown requested by `#324`.	Add burst-size support and per-stage accounting for ack, drain, effects, and storage, then emit the requested structured rate-of-success report.
Out of Scope Changes check	⚠️ Warning	The PR includes an unrelated UI scrollbar visibility fix in `osprey_ui`, which is outside the stress-CLI scope.	Split the scrollbar/CSS change into a separate PR unless it is explicitly required for the stress-CLI work.
Docstring Coverage	⚠️ Warning	Docstring coverage is 5.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title is concise and accurately describes the main addition, the new `osprey-stress` CLI.

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch add-stress-cli

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands.}

julietshen · 2026-06-17T13:58:22Z

#330 merged, rebased on main, dropped the import shim (try/except ImportError + # type: ignore[import-untyped] + TestRunBlocked). Net -32 lines.

Consumer / ConsumerConfig import is now at module level alongside Producer and Report.
cmd_run is fully wired end-to-end and unblocked.
osprey-stress --help works after uv sync --dev.
6/6 cli tests pass, ruff + mypy clean.
Marked ready-for-review.

Closes #324.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

osprey_worker/src/osprey/worker/stress/tests/test_cli.py (1)

10-72: ⚡ Quick win

Add parser tests for invalid numeric inputs

Please add cases asserting parse failure for invalid run args (--rate 0, --rate -1, --events 0, --drain-seconds -1). This will lock in the non-crashing contract for cmd_run.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py` around lines 10 -
72, Add new test methods to the TestParser class to verify that the parser
rejects invalid numeric inputs and raises SystemExit. Create test methods for
each invalid case: rate equal to 0, rate as negative value, events equal to 0,
and drain-seconds as negative value. Each test should follow the pattern used in
test_no_command_requires_one by wrapping build_parser().parse_args() with
pytest.raises(SystemExit) and passing the run command with the respective
invalid argument to ensure the parser properly validates these numeric
constraints.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@osprey_worker/src/osprey/worker/stress/cli.py`:
- Around line 90-97: The CLI arguments `--events`, `--rate`, and
`--drain-seconds` being added with add_argument do not validate that their
values are positive, which causes a ZeroDivisionError when args.rate equals zero
at line 139. Add validation constraints to each argument definition to enforce
that events must be greater than 0, rate must be greater than 0, and
drain-seconds must be greater than or equal to 0. Use the appropriate argument
parameter (such as choices or a custom action/type) to validate these
constraints at parse time rather than at runtime.

---

Nitpick comments:
In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py`:
- Around line 10-72: Add new test methods to the TestParser class to verify that
the parser rejects invalid numeric inputs and raises SystemExit. Create test
methods for each invalid case: rate equal to 0, rate as negative value, events
equal to 0, and drain-seconds as negative value. Each test should follow the
pattern used in test_no_command_requires_one by wrapping
build_parser().parse_args() with pytest.raises(SystemExit) and passing the run
command with the respective invalid argument to ensure the parser properly
validates these numeric constraints.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 22bb7c6d-c7f0-4ca8-bece-83d660f341fd

📥 Commits

Reviewing files that changed from the base of the PR and between 6eba08e and f98dd4c.

📒 Files selected for processing (4)

CHANGELOG.md
osprey_worker/pyproject.toml
osprey_worker/src/osprey/worker/stress/cli.py
osprey_worker/src/osprey/worker/stress/tests/test_cli.py

Addresses CodeRabbit feedback on #367. Previously the parser accepted 0 or negatives for the numeric args; a `--rate 0` would then ZeroDivision when computing `max_runtime_seconds = events / rate + drain_seconds`, which is a worse failure mode than a clear argparse error. Adds three small type-validator helpers (`_positive_int`, `_positive_float`, `_nonneg_float`) that raise `ArgumentTypeError` at parse time, plus a parametrized test covering the seven invalid-input combinations (events ≤ 0, rate ≤ 0, drain-seconds < 0, duration ≤ 0).

julietshen · 2026-06-17T16:35:11Z

Addressed CodeRabbit's two notes in ffca77c:

Validation (cli.py:97 / 139) — added _positive_int, _positive_float, _nonneg_float helpers and wired them as the type= for --events, --rate, --drain-seconds, and the measure subcommand's --duration. They raise ArgumentTypeError at parse time, so users get a clean argparse error message instead of a ZeroDivisionError later in max_runtime_seconds = events / rate + drain_seconds.

Parser tests (test_cli.py:10-72) — added a parametrized test_rejects_non_positive_numeric_args covering the seven invalid-input combinations (events ≤ 0, rate ≤ 0, drain-seconds < 0, duration ≤ 0). Each asserts SystemExit (argparse's behavior on type-validation failure).

13/13 cli tests pass (was 6), ruff + mypy clean.

Addresses CodeRabbit feedback on #367. Previously the parser accepted 0 or negatives for the numeric args; a `--rate 0` would then ZeroDivision when computing `max_runtime_seconds = events / rate + drain_seconds`, which is a worse failure mode than a clear argparse error. Adds three small type-validator helpers (`_positive_int`, `_positive_float`, `_nonneg_float`) that raise `ArgumentTypeError` at parse time, plus a parametrized test covering the seven invalid-input combinations (events ≤ 0, rate ≤ 0, drain-seconds < 0, duration ≤ 0).

chimosky · 2026-06-18T21:13:45Z

Tested 993cde3, performance seems to degrade with each run.

osprey on  asc [$!?] via  v3.11.11 (osprey) 
 ➜   osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=af099ace events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
  "consumed": 10000,
  "drop_count": 0,
  "drop_rate": 0.0,
  "duration_seconds": 120.37888309000004,
  "latency": {
    "count": 10000,
    "max_ms": 109213.1917476654,
    "min_ms": 88.16361427307129,
    "p50_ms": 57644.95253562927,
    "p95_ms": 103032.96446800232,
    "p99_ms": 108514.60933685303
  },
  "matched": 10000,
  "mode": "closed-loop",
  "produced": 10000,
  "threshold_breaches": [
    "p95_ms 103033.0 > 500.0"
  ],
  "thresholds": {
    "drop_rate": 0.01,
    "p95_ms": 500.0
  }
}

osprey on  asc [$!?] via  v3.11.11 (osprey) took 2m2s 
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=d54bcaa9 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
  "consumed": 9096,
  "drop_count": 904,
  "drop_rate": 0.0904,
  "duration_seconds": 131.1068548410001,
  "latency": {
    "count": 9096,
    "max_ms": 120701.101064682,
    "min_ms": 65.55795669555664,
    "p50_ms": 62529.24847602844,
    "p95_ms": 112622.1079826355,
    "p99_ms": 117529.26683425903
  },
  "matched": 9096,
  "mode": "closed-loop",
  "produced": 10000,
  "threshold_breaches": [
    "drop_rate 0.0904 > 0.0100",
    "p95_ms 112622.1 > 500.0"
  ],
  "thresholds": {
    "drop_rate": 0.01,
    "p95_ms": 500.0
  }
}

osprey on  asc [$!?] via  v3.11.11 (osprey) took 2m13s 
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=8bb804f6 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
  "consumed": 5829,
  "drop_count": 4171,
  "drop_rate": 0.4171,
  "duration_seconds": 89.49104550900006,
  "latency": {
    "count": 5829,
    "max_ms": 81125.71907043457,
    "min_ms": 62.23154067993164,
    "p50_ms": 37650.094509124756,
    "p95_ms": 77414.13927078247,
    "p99_ms": 80067.45100021362
  },
  "matched": 5829,
  "mode": "closed-loop",
  "produced": 10000,
  "threshold_breaches": [
    "drop_rate 0.4171 > 0.0100",
    "p95_ms 77414.1 > 500.0"
  ],
  "thresholds": {
    "drop_rate": 0.01,
    "p95_ms": 500.0
  }
}

osprey on  asc [$!?] via  v3.11.11 (osprey) took 1m31s 
✗ osprey-stress run --events 10000 --rate 1000 --threshold-drop-rate 0.01 --threshold-p95-ms 500 --report json
[osprey-stress] run_id=879e6925 events=10000 rate=1000/s bootstrap=localhost:9092 input=osprey.actions_input output=osprey.execution_results
[osprey-stress] producer done, draining consumer...
{
  "consumed": 3395,
  "drop_count": 6605,
  "drop_rate": 0.6605,
  "duration_seconds": 58.78333274199986,
  "latency": {
    "count": 3395,
    "max_ms": 49045.26400566101,
    "min_ms": 63.27676773071289,
    "p50_ms": 24665.666580200195,
    "p95_ms": 46235.65363883972,
    "p99_ms": 47933.637857437134
  },
  "matched": 3395,
  "mode": "closed-loop",
  "produced": 10000,
  "threshold_breaches": [
    "drop_rate 0.6605 > 0.0100",
    "p95_ms 46235.7 > 500.0"
  ],
  "thresholds": {
    "drop_rate": 0.01,
    "p95_ms": 500.0
  }
}

I also noticed the scrollbar in the event stream section is barely visible, which makes it impossible to move through events a lot faster.

I'm wondering if a verbose output for this might be a good option, rather than having to look at docker compose logs for debug output.

julietshen · 2026-06-24T16:17:52Z

Thank you so much @chimosky for testing! fixes pushed in 16a0659 and 29e03ad.

(claude assisted comment):
for performance degradation, the harness was reporting the truth about a saturated pipeline, but the output didn't make that clear, so it read like a harness regression. Reading your numbers, p95 latencies are tens of seconds and the worker is processing ~83 events/sec vs the 1000/sec we're asking for; each run leaves ~7-9k events backed up in the inputz topic that the next run inherits, so things compound.

added a pre-run baseline + post-run delta on the input/output topic heads. If output_topic advanced by fewer events than we sent, the CLI now logs a warning that the next run will start with backlog.
added a --verbose flag that emits periodic stderr lines: t=Xs produced=Y/N matched=M match_rate=R/s in_topic_Δ=A out_topic_Δ=B. The new ProgressReporter is decoupled from Producer/Consumer and factory-injected so it's unit-testable without a Kafka cluster.

also fixed the scrollbar thing

Addresses chimosky's review on #367, where runs got progressively worse across repeated invocations (0% → 9% → 41% → 66% drop rate over 4 runs). The harness was reporting the truth — pipeline saturated, each run leaves backlog the next inherits — but the framing made it look like a harness regression rather than the SUT one. - `probe_topic_head` snapshots an input/output topic's end-offsets via a groupless KafkaConsumer. Tolerant of missing topics (fresh clusters). - Before producing, log baseline heads. After the run, log the deltas; if output_topic advanced by fewer than `--events`, warn that backlog will carry into the next run. - `--verbose` (+ `--verbose-interval-seconds`) emits live stderr lines: produced/target, matched, match_rate, in/out topic deltas. Avoids having to tail `docker compose logs` to know what's happening. - 13 new unit tests cover the verbose-flag parsing, topic-head probing edge cases (missing topic, close failures, partition assignment), and the ProgressReporter (periodic emission, double-start guard, probe-failure tolerance). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Also from chimosky's review on #367: the event-stream scrollbar was "barely visible," which made it slow to skim long result sets. We were relying on the OS auto-hide default (essentially invisible on macOS unless you're actively scrolling). Style the `ReactVirtualized__List` scrolling div with a thin always-visible bar in the existing `--text-light-secondary` / `--background-secondary` theme variables, plus a hover state. WebKit (Chrome/Safari) and Firefox both honored. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

coderabbitai

Actionable comments posted: 3

🧹 Nitpick comments (1)

osprey_worker/src/osprey/worker/stress/tests/test_cli.py (1)
188-190: 🩺 Stability & Availability | 🔵 Trivial | ⚡ Quick win

Make threaded progress tests less timing-sensitive.

These checks depend on fixed sleep windows, which can intermittently flake on busy CI runners. Prefer deadline-based polling for the expected condition (e.g., emitted line present).

Also applies to: 239-240
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py` around lines 188 -
190, The threaded progress assertions in the stress CLI tests are too dependent
on fixed sleep timing and can flake under load. Update the affected test flow
around tick-based progress checks to use deadline-based polling for the expected
emitted line or condition instead of sleeping for hardcoded intervals, and apply
the same change to the other marked section in the test file. Use the existing
test helpers and the relevant progress-output assertions to locate the impacted
logic.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@osprey_ui/src/components/event_stream/EventStream.module.css`:
- Around line 25-45: The scrollbar styling block in EventStream.module.css uses
CSS Modules :global(...) selectors that Stylelint flags as unknown, so update
the lint setup to understand CSS Modules selectors for this file or add a
narrowly scoped suppression with justification. Use the virtualizedList and
ReactVirtualized__List selector block as the target when applying the fix, and
prefer a config/plugin change over silencing the rule globally.

In `@osprey_worker/src/osprey/worker/stress/cli.py`:
- Around line 86-91: The probe/close exception handlers in the stress CLI are
swallowing failures without any trace, making repeated best-effort probe issues
impossible to diagnose. In the affected try/except blocks around
consumer.close(autocommit=False) in the CLI flow, keep the stress run alive but
replace the bare except/pass with debug logging that records the exception
details and context, then continue. Apply the same fix in both matching handlers
so the behavior is consistent and discoverable without becoming noisy.
- Around line 131-135: Make the initial reporter probes in start() best-effort
like _run() so a transient Kafka failure in --verbose mode does not abort
cmd_run after the consumer has already started. Update the startup path around
_probe_input() and _probe_output() in StressProgressReporter to catch probe
errors and fall back to zero baselines, matching the existing guarded behavior
in _run().

---

Nitpick comments:
In `@osprey_worker/src/osprey/worker/stress/tests/test_cli.py`:
- Around line 188-190: The threaded progress assertions in the stress CLI tests
are too dependent on fixed sleep timing and can flake under load. Update the
affected test flow around tick-based progress checks to use deadline-based
polling for the expected emitted line or condition instead of sleeping for
hardcoded intervals, and apply the same change to the other marked section in
the test file. Use the existing test helpers and the relevant progress-output
assertions to locate the impacted logic.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 7371df68-ba7d-4561-865e-b69d9cfbab75

📥 Commits

Reviewing files that changed from the base of the PR and between 993cde3 and 29e03ad.

📒 Files selected for processing (4)

CHANGELOG.md
osprey_ui/src/components/event_stream/EventStream.module.css
osprey_worker/src/osprey/worker/stress/cli.py
osprey_worker/src/osprey/worker/stress/tests/test_cli.py

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

coderabbitai · 2026-06-24T16:25:25Z

+.virtualizedList :global(.ReactVirtualized__List) {
+  scrollbar-width: thin;
+  scrollbar-color: var(--text-light-secondary) var(--background-secondary);
+}
+
+.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar {
+  width: 10px;
+  height: 10px;
+}
+
+.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-track {
+  background: var(--background-secondary);
+}
+
+.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-thumb {
+  background-color: var(--text-light-secondary);
+  border-radius: 5px;
+  border: 2px solid var(--background-secondary);
+}
+
+.virtualizedList :global(.ReactVirtualized__List)::-webkit-scrollbar-thumb:hover {


📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Stylelint currently rejects :global(...) selectors used in this block.

selector-pseudo-class-no-unknown is raised on each :global(...) selector here, so this likely fails CI unless stylelint is configured for CSS Modules syntax. Please add the appropriate stylelint config exception/plugin (preferred) or narrowly-scoped lint suppression with justification.

🧰 Tools

🪛 Stylelint (17.13.0)

[error] 25-25: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)

(selector-pseudo-class-no-unknown)

[error] 30-30: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)

(selector-pseudo-class-no-unknown)

[error] 35-35: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)

(selector-pseudo-class-no-unknown)

[error] 39-39: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)

(selector-pseudo-class-no-unknown)

[error] 45-45: Unknown pseudo-class selector ":global" (selector-pseudo-class-no-unknown)

(selector-pseudo-class-no-unknown)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_ui/src/components/event_stream/EventStream.module.css` around lines 25 - 45, The scrollbar styling block in EventStream.module.css uses CSS Modules :global(...) selectors that Stylelint flags as unknown, so update the lint setup to understand CSS Modules selectors for this file or add a narrowly scoped suppression with justification. Use the virtualizedList and ReactVirtualized__List selector block as the target when applying the fix, and prefer a config/plugin change over silencing the rule globally.

Source: Linters/SAST tools

coderabbitai · 2026-06-24T16:25:25Z

+        try:
+            consumer.close(autocommit=False)
+        except Exception:
+            # Probe is best-effort. A noisy close() during teardown would mask
+            # the snapshot value the caller needs.
+            pass


📐 Maintainability & Code Quality | 🟡 Minor | ⚡ Quick win

Log swallowed probe exceptions.

Both handlers intentionally keep the stress run alive, but they currently suppress failures with no trace. Use debug logging so repeated probe/close failures are diagnosable without making best-effort probing noisy.

Proposed fix

+import logging import threading import time @@ from kafka import KafkaConsumer from kafka.structs import TopicPartition +logger = logging.getLogger(__name__) + @@ except Exception: # Probe is best-effort. A noisy close() during teardown would mask # the snapshot value the caller needs. - pass + logger.debug('failed to close stress probe consumer', exc_info=True) @@ except Exception: # A failed probe (broker hiccup, transient timeout) shouldn't # tear down the in-flight stress run. Skip the tick. - pass + logger.debug('failed to emit stress progress tick', exc_info=True)

As per coding guidelines, “Do not silently swallow exceptions with except Exception: pass without logging or rethrowing.”

Also applies to: 147-152

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_worker/src/osprey/worker/stress/cli.py` around lines 86 - 91, The probe/close exception handlers in the stress CLI are swallowing failures without any trace, making repeated best-effort probe issues impossible to diagnose. In the affected try/except blocks around consumer.close(autocommit=False) in the CLI flow, keep the stress run alive but replace the bare except/pass with debug logging that records the exception details and context, then continue. Apply the same fix in both matching handlers so the behavior is consistent and discoverable without becoming noisy.

Source: Coding guidelines

coderabbitai · 2026-06-24T16:25:25Z

+        # Probe baselines on the calling thread so the first periodic emit can
+        # diff against them without racing the worker's startup.
+        self._initial_input = self._probe_input()
+        self._initial_output = self._probe_output()
+        self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)


🩺 Stability & Availability | 🟠 Major | ⚡ Quick win

Make reporter baseline probes best-effort too.

start() runs the initial probes before the background thread exists, so a transient Kafka probe failure in --verbose mode can abort cmd_run after the consumer has already started. Match _run()’s best-effort behavior and fall back to zero baselines.

Proposed fix

self._start_monotonic = time.monotonic() # Probe baselines on the calling thread so the first periodic emit can # diff against them without racing the worker's startup. - self._initial_input = self._probe_input() - self._initial_output = self._probe_output() + try: + self._initial_input = self._probe_input() + self._initial_output = self._probe_output() + except Exception: + logger.debug('failed to capture stress progress baselines', exc_info=True) + self._initial_input = 0 + self._initial_output = 0 self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

# Probe baselines on the calling thread so the first periodic emit can

# diff against them without racing the worker's startup.

self._initial_input = self._probe_input()

self._initial_output = self._probe_output()

self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)

# Probe baselines on the calling thread so the first periodic emit can

# diff against them without racing the worker's startup.

try:

self._initial_input = self._probe_input()

self._initial_output = self._probe_output()

except Exception:

logger.debug('failed to capture stress progress baselines', exc_info=True)

self._initial_input = 0

self._initial_output = 0

self._thread = threading.Thread(target=self._run, name='stress-progress', daemon=True)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@osprey_worker/src/osprey/worker/stress/cli.py` around lines 131 - 135, Make the initial reporter probes in start() best-effort like _run() so a transient Kafka failure in --verbose mode does not abort cmd_run after the consumer has already started. Update the startup path around _probe_input() and _probe_output() in StressProgressReporter to catch probe errors and fall back to zero baselines, matching the existing guarded behavior in _run().

chimosky · 2026-06-25T20:11:48Z

+    run.add_argument(
+        '--verbose',
+        action='store_true',
+        help='Emit periodic progress lines to stderr (live throughput, topic deltas).',
+    )


This would be great as an argument to the main parser, and not just the run sub parser.

So verbose can be applied to both commands.

addressed your feedback in my latest commits! thank you so much for the review!

chimosky · 2026-06-25T21:00:19Z

Reviewed 29e03ad, it'll be great to log errors and exceptions rather than pass on them.

Orchestrates the stress reporter (#328), producer (#329), and consumer (#330) into a CLI that produces synthetic events at a configurable rate, observes their ExecutionResults, and reports drop rate + p50/p95/p99 latency. Exits non-zero on threshold breach so it can gate CI on pipeline health. Subcommands: * `run` — closed-loop synthetic testing. Blocked on #330; the body lazy-imports the consumer and short-circuits with a clear pointer message until that PR merges, then activates. * `measure` — open-loop measurement against an external source. Stub; will activate once #236 (jetstream input stream plugin) lands. Argparse + threshold gates + report dispatch + entry-point registration are all live today, so the CLI structure (~240 lines) is reviewable independently. When #330 merges, the only change needed here is dropping the `try/except ImportError` wrapper around the consumer import — the orchestration body below it already references Consumer/ConsumerConfig. Adds: * `osprey-stress` console script in `osprey_worker/pyproject.toml` * CHANGELOG entry citing the full stack of PRs the harness landed across * 7 unit tests covering arg parsing, the measure stub, and the #330-blocked-on-import path so a regression in either is loud

Promotes the consumer import to module level alongside producer and reporter, removes the try/except ImportError stub in cmd_run, the matching `# type: ignore[import-untyped]`, and the `TestRunBlocked` test that pinned the stub behaviour. The rest of the CLI (arg parsing, threshold gates, report dispatch, measure stub) already exercised these code paths.

Addresses CodeRabbit feedback on #367. Previously the parser accepted 0 or negatives for the numeric args; a `--rate 0` would then ZeroDivision when computing `max_runtime_seconds = events / rate + drain_seconds`, which is a worse failure mode than a clear argparse error. Adds three small type-validator helpers (`_positive_int`, `_positive_float`, `_nonneg_float`) that raise `ArgumentTypeError` at parse time, plus a parametrized test covering the seven invalid-input combinations (events ≤ 0, rate ≤ 0, drain-seconds < 0, duration ≤ 0).

Addresses chimosky's review on #367, where runs got progressively worse across repeated invocations (0% → 9% → 41% → 66% drop rate over 4 runs). The harness was reporting the truth — pipeline saturated, each run leaves backlog the next inherits — but the framing made it look like a harness regression rather than the SUT one. - `probe_topic_head` snapshots an input/output topic's end-offsets via a groupless KafkaConsumer. Tolerant of missing topics (fresh clusters). - Before producing, log baseline heads. After the run, log the deltas; if output_topic advanced by fewer than `--events`, warn that backlog will carry into the next run. - `--verbose` (+ `--verbose-interval-seconds`) emits live stderr lines: produced/target, matched, match_rate, in/out topic deltas. Avoids having to tail `docker compose logs` to know what's happening. - 13 new unit tests cover the verbose-flag parsing, topic-head probing edge cases (missing topic, close failures, partition assignment), and the ProgressReporter (periodic emission, double-start guard, probe-failure tolerance). Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

Also from chimosky's review on #367: the event-stream scrollbar was "barely visible," which made it slow to skim long result sets. We were relying on the OS auto-hide default (essentially invisible on macOS unless you're actively scrolling). Style the `ReactVirtualized__List` scrolling div with a thin always-visible bar in the existing `--text-light-secondary` / `--background-secondary` theme variables, plus a hover state. WebKit (Chrome/Safari) and Firefox both honored. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

CodeRabbit + chimosky review on #367: - The bare except/pass around `consumer.close(autocommit=False)` and inside `ProgressReporter._run()` hid every transient broker hiccup. Replace with `logger.debug(..., exc_info=True)` so repeated failures are diagnosable without becoming user-visible noise. - `ProgressReporter.start()` called `_probe_input()` and `_probe_output()` directly on the calling thread; a transient probe failure there aborted `cmd_run` even though the consumer was already running. Wrap each in try/except with a 0 baseline fallback to match the guarded behavior `_run()` already had. - The two threaded ProgressReporter tests used fixed time.sleep() windows to wait for tick output, which flakes on busy CI runners. Replace with a `_wait_until(predicate, deadline=5s)` helper that polls the sink and waits for the specific state the test cares about (in_topic_Δ value, non-empty sink) before stopping. Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>

coderabbitai

♻️ Duplicate comments (1)

osprey_ui/src/components/event_stream/EventStream.module.css (1)
21-48: 📐 Maintainability & Code Quality | 🟠 Major | ⚡ Quick win

Stylelint still flags valid CSS Modules :global(...) selectors.

This file uses the .module.css extension, confirming it is a CSS Module. The :global(...) pseudo-class is standard CSS Modules syntax for targeting external class names like ReactVirtualized__List, but Stylelint's selector-pseudo-class-no-unknown rule does not recognize it without additional configuration. This will break CI if stylelint runs in the pipeline.

Please add stylelint-config-css-modules (or equivalent) to the project's stylelint configuration, or add a narrowly-scoped disable comment with justification if a project-wide fix is not viable right now.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@osprey_ui/src/components/event_stream/EventStream.module.css` around lines 21
- 48, Stylelint is incorrectly flagging the CSS Modules :global(...) selectors
used in EventStream.module.css. Update the project’s Stylelint setup to
recognize CSS Modules syntax by adding stylelint-config-css-modules (or
equivalent) to the configuration used for selector-pseudo-class-no-unknown, so
the ReactVirtualized__List rules in this module pass CI. If a global config
change is not feasible, add a narrowly scoped disable comment around the
affected selectors with a clear justification.
Source: Linters/SAST tools

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@osprey_ui/src/components/event_stream/EventStream.module.css`:
- Around line 21-48: Stylelint is incorrectly flagging the CSS Modules
:global(...) selectors used in EventStream.module.css. Update the project’s
Stylelint setup to recognize CSS Modules syntax by adding
stylelint-config-css-modules (or equivalent) to the configuration used for
selector-pseudo-class-no-unknown, so the ReactVirtualized__List rules in this
module pass CI. If a global config change is not feasible, add a narrowly scoped
disable comment around the affected selectors with a clear justification.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 656ad55b-2a08-44a1-b00f-d85c35cbd7f7

📥 Commits

Reviewing files that changed from the base of the PR and between 29e03ad and 4b9ea61.

📒 Files selected for processing (3)

CHANGELOG.md
osprey_ui/src/components/event_stream/EventStream.module.css
osprey_worker/pyproject.toml

✅ Files skipped from review due to trivial changes (1)

CHANGELOG.md

🚧 Files skipped from review as they are similar to previous changes (1)

osprey_worker/pyproject.toml

julietshen force-pushed the add-stress-cli branch from f908913 to f98dd4c Compare June 17, 2026 13:58

julietshen marked this pull request as ready for review June 17, 2026 13:58

julietshen requested review from a team, EXBreder, ayubun, haileyok and vinaysrao1 as code owners June 17, 2026 13:58

julietshen mentioned this pull request Jun 17, 2026

Add stress.consumer — Kafka result consumer for the stress harness #330

Merged

3 tasks

coderabbitai Bot reviewed Jun 17, 2026

View reviewed changes

Comment thread osprey_worker/src/osprey/worker/stress/cli.py Outdated

haileyok mentioned this pull request Jun 17, 2026

[worker] Force-exit gevent test process after run to fix integration CI hang #369

Closed

4 tasks

julietshen force-pushed the add-stress-cli branch from ffca77c to 993cde3 Compare June 17, 2026 19:59

coderabbitai Bot reviewed Jun 24, 2026

View reviewed changes

chimosky reviewed Jun 25, 2026

View reviewed changes

julietshen and others added 6 commits June 30, 2026 11:42

julietshen force-pushed the add-stress-cli branch from 29e03ad to 4b9ea61 Compare June 30, 2026 15:50

coderabbitai Bot reviewed Jun 30, 2026

View reviewed changes

Uh oh!

Conversation

julietshen commented Jun 17, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What's in this PR

Test plan

Review feedback addressed

Stack reminder

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented Jun 17, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

❌ Failed checks (3 warnings)

Uh oh!

julietshen commented Jun 17, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

julietshen commented Jun 17, 2026

Uh oh!

chimosky commented Jun 18, 2026

Uh oh!

julietshen commented Jun 24, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 24, 2026

Choose a reason for hiding this comment

Uh oh!

chimosky Jun 25, 2026

Choose a reason for hiding this comment

Uh oh!

julietshen Jun 30, 2026

Choose a reason for hiding this comment

Uh oh!

chimosky commented Jun 25, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

julietshen commented Jun 17, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 17, 2026 •

edited

Loading