feat(oracle): unified tier-driven corpus runner + oracle.yml (C3) by skakri · Pull Request #177 · cq27-dev/rag-rat

skakri · 2026-06-15T20:32:22Z

Part of the multi-language SCIP-oracle runner epic (#164). Replaces the two per-language demo workflows with one declarative, tier-driven runner over tools/oracle-corpora.toml — the C3 piece that ties the corpus profiles (#171), the report command (#175), and the scip-python backend (#176) into a CI-runnable whole.

What

tools/oracle-run.sh — run the oracle for one corpus end to end: read its profile → shallow-clone the repo at its pinned rev → run its prepare steps → render the checkout's rag-rat.toml from its bindings → rag-rat index --full → rag-rat oracle report. The health gate lives in oracle report; the script propagates its exit code (non-zero on a violation) and always leaves the report JSON behind so a Δ glue script can consume even a failing run.
tools/oracle-corpus.py — stdlib (tomllib) reader the bash runner shells out to for the pre-index fields (repo/rev/prepare/bindings) and --list-tier for the CI matrix. oracle report reads tool/bindings/health from the same file itself — the helper is just bash's TOML eyes.
tools/oracle-report-bmf.py — report JSON → Bencher BMF glue for the heavy tier (rag-rat emits JSON only; Bencher/markdown shaping is a glue concern).
.github/workflows/oracle.yml — tier-driven:
- small (PRs + push to main): GitHub-hosted matrix from --list-tier small, installs each corpus's SCIP tool by its tool field, runs the runner, uploads the report artifact. The health gate is the PR gate.
- heavy (release / manual dispatch only): self-hosted big-memory box, serial (max-parallel: 1), bench image, pushed to Bencher as the headline resolution series.
Deletes the superseded oracle-rust.yml / oracle-kernel.yml workflows and their rust-scip-oracle.sh / kernel-c-oracle.sh scripts (full migration, no shims); updates docs/benchmarks.md.

Verification

End-to-end locally on the small tier (with the #175 oracle report binary):

corpus	tool	edges	resolved before→after	result
rust-semver	rust-analyzer	1056	412 → 936	healthy, exit 0
c-cjson	cmake compdb + scip-clang	3941	2742 → 3408	healthy, exit 0

Both clone → prepare → index → report → gate clean; the corpus_profile_hash matched the golden. The health gate was separately confirmed to fail non-zero on a violated threshold (in #175). bash -n, py_compile, and YAML parse all pass.
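
For readers who prefer rates to raw counts, the table can be restated as resolution percentages. This is a small sketch that assumes the "resolved" counts are taken over the "edges" column; the empty-denominator convention follows the BMF helper quoted later in this thread.

```python
def pct(numerator: int, total: int) -> float:
    """Percentage, treating an empty denominator as a vacuous 100%."""
    return 100.0 if total == 0 else 100.0 * numerator / total


# (corpus, edges, resolved_before, resolved_after) copied from the table above.
RUNS = [
    ("rust-semver", 1056, 412, 936),
    ("c-cjson", 3941, 2742, 3408),
]

if __name__ == "__main__":
    for corpus, edges, before, after in RUNS:
        print(f"{corpus}: {pct(before, edges):.1f}% -> {pct(after, edges):.1f}%")
```

Under that assumption, rust-semver moves from about 39.0% to 88.6% resolved and c-cjson from about 69.6% to 86.5%.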

Stacking

oracle.yml's small tier needs the oracle report command (#175) and, for py-requests, the scip-python backend (#176). The script + helpers are independent of those at the file level (disjoint paths), but the workflow's py-requests leg goes green only once #176 lands and rust-*/c-* legs only once #175 lands. Merge order: #175 + #176, then this.

codecov · 2026-06-15T20:39:23Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ All tests successful. No failed tests found.

github-actions · 2026-06-15T20:42:37Z

Bencher Report

Branch	feat/oracle-unified-runner
Testbed	ubuntu-latest

⚠️ WARNING: No Threshold found!
Without a Threshold, no Alerts will ever be generated.
Estimated Cycles (cycles)
LL Hits (hits)
L1 Hits (hits)
RAM Hits (hits)
Total read+write (reads/writes)
To only post results if a Threshold exists, set the --ci-only-thresholds flag.

🚨 2 Alerts

Benchmark	Measure (Units)	Benchmark Result (Result Δ%)	Baseline	Upper Boundary (Limit %)
rag_pipeline::pipeline::index cargo_resolver:resolver_config()	Instructions (instructions x 1e6)	800.17 x 1e6 (+39.29%)	574.45 x 1e6	585.94 x 1e6 (136.56%)
rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config()	Instructions (instructions x 1e6)	154.66 x 1e6 (+15.59%)	133.80 x 1e6	136.47 x 1e6 (113.33%)

Benchmark	Estimated Cycles (cycles x 1e6)	Instructions (instructions x 1e6) (Result Δ%)	Baseline (instructions x 1e6)	Upper Boundary (instructions x 1e6) (Limit %)	L1 Hits (hits x 1e6)	LL Hits (hits x 1e6)	RAM Hits (hits x 1e3)	Total read+write (reads/writes x 1e6)
rag_pipeline::pipeline::index cargo_resolver:resolver_config()	1,213.35	800.17 (+39.29%) 🚨 alert	574.45	585.94 (136.56%)	1,115.38	17.40	312.91	1,133.10
rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config()	241.18	154.66 (+15.59%) 🚨 alert	133.80	136.47 (113.33%)	230.99	1.96	11.51	232.96
rag_pipeline::pipeline::query_warm cargo_resolver:resolver_index()	232.82	149.15 (-8.98%)	163.86	167.13 (89.24%)	223.20	1.85	10.11	225.06

github-actions · 2026-06-15T20:42:39Z

Bencher Report

Branch	feat/oracle-unified-runner
Testbed	ubuntu-latest

Benchmark	Latency: Benchmark Result seconds (s) (Result Δ%)	Baseline	Upper Boundary seconds (s) (Limit %)
index_time/full_rebuild_cargo	4.96 s (+33.39%)	3.72 s	5.77 s (85.99%)

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7d99df400c

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+    timeout "${TIMEOUT_MINUTES}m" \
+    "$RAG_RAT_BIN" --json oracle report --corpus "$CORPUS" --corpora "$CORPORA" ) > "$REPORT_OUT"


Apply the corpus timeout to the whole run

The profile timeout is documented as the wall-clock budget for the whole corpus run, but this timeout only starts after cloning, all prepare steps, and rag-rat index --full have already completed. If cargo fetch, the kernel make -j, or indexing hangs or runs far past the corpus budget, the small PR jobs can sit until the default Actions timeout and the heavy job until its 360-minute job timeout instead of failing at the configured 8/45/120 minutes.
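
To make the distinction concrete, here is a minimal sketch of one wall-clock budget shared by every stage, rather than a timeout wrapped around the last stage only. It is an illustration in Python, not the runner's code: `run_with_budget` and the stand-in commands are hypothetical.

```python
import subprocess
import sys
import time


def run_with_budget(steps: list[list[str]], budget_s: float) -> None:
    """Run each step in order, sharing ONE wall-clock budget across all of them.

    Raises subprocess.TimeoutExpired as soon as the shared budget runs out,
    and subprocess.CalledProcessError if a step exits non-zero.
    """
    deadline = time.monotonic() + budget_s
    for argv in steps:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            raise subprocess.TimeoutExpired(argv, budget_s)
        # Each step only gets whatever is left of the overall budget.
        subprocess.run(argv, check=True, timeout=remaining)


if __name__ == "__main__":
    # Stand-ins for clone / prepare / index / report (hypothetical commands).
    quick = [sys.executable, "-c", "pass"]
    run_with_budget([quick, quick, quick], budget_s=30.0)
    print("all steps finished inside the budget")
```

With this shape, a prepare step that hangs consumes the corpus budget and fails the run at the configured limit instead of waiting for the job-level timeout.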

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+        # Mirror the engine's vacuous-1.0 convention for an empty denominator.
+        return 100.0 if total == 0 else 100.0 * numerator / total
+
+    name = f"{report['corpus_id']}/oracle"


Include profile identity in the Bencher series

This benchmark name is only keyed by corpus_id, so when a profile changes while keeping the same id (for example bumping rust-cargo/linux-kernel rev, bindings, or prepare steps), Bencher will append the new measurements to the old series even though tools/oracle-corpora.toml explicitly treats those reports as incomparable via corpus_profile_hash. The deleted scripts at least carried the tag in the benchmark name; this should include the rev/profile hash (and ideally tool version) or otherwise force a new series on corpus changes.
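
One possible shape for such a name, using the `{corpus_id}@{profile_hash[:12]}+{tool_version}/oracle` scheme the author adopts later in this thread, is sketched below. The report keys `corpus_profile_hash` and `tool_version`, the measure name `resolved_pct`, and the sample values are assumptions for illustration only.

```python
import json


def series_name(report: dict) -> str:
    """Benchmark name that changes whenever the profile or tool version does."""
    # `corpus_profile_hash` and `tool_version` are assumed report keys.
    short_hash = report["corpus_profile_hash"][:12]
    return f"{report['corpus_id']}@{short_hash}+{report['tool_version']}/oracle"


def to_bmf(report: dict, resolved_pct: float) -> dict:
    """Shape one metric as Bencher Metric Format: name -> measure -> value."""
    return {series_name(report): {"resolved_pct": {"value": resolved_pct}}}


if __name__ == "__main__":
    # Hypothetical report values, not taken from a real run.
    report = {
        "corpus_id": "rust-semver",
        "corpus_profile_hash": "0123456789abcdef0123",
        "tool_version": "1.0.0",
    }
    print(json.dumps(to_bmf(report, 88.6), indent=2))
```

Because the hash prefix and tool version are part of the key, any change that makes reports incomparable lands in a new series rather than extending the old one.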

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+while IFS= read -r prepare_cmd; do
+    [ -n "$prepare_cmd" ] || continue
+    echo "oracle-run: prepare> $prepare_cmd" >&2
+    ( cd "$CHECKOUT" && bash -c "$prepare_cmd" )


Bound kernel make parallelism

For the linux-kernel profile this executes the declared make -j literally; I checked GNU make --help, and -j [N] means infinite jobs when no argument is supplied. The old kernel oracle script bounded the build with -j$(nproc), so the heavy self-hosted run can now oversubscribe CPU/RAM with unbounded kernel compile jobs before it ever reaches the oracle report.
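
A profile check could catch this class of mistake before a heavy run starts. The following is only a sketch of such a lint; `has_unbounded_make_jobs` is a hypothetical helper, not something in the PR, and it inspects the unexpanded command text.

```python
import shlex


def has_unbounded_make_jobs(command: str) -> bool:
    """True if a make invocation passes -j/--jobs without a job count."""
    tokens = shlex.split(command)
    if not tokens or tokens[0] != "make":
        return False
    for i, tok in enumerate(tokens[1:], start=1):
        if tok in ("-j", "--jobs"):
            # GNU make only consumes the next argument if it is a number.
            nxt = tokens[i + 1] if i + 1 < len(tokens) else ""
            if not nxt.isdigit():
                return True
    return False


if __name__ == "__main__":
    for cmd in ("make -j", "make -j$(nproc)", "make -j 8", "cargo fetch"):
        print(f"{cmd!r}: unbounded={has_unbounded_make_jobs(cmd)}")
```

Forms such as `-j8`, `-j$(nproc)`, and `--jobs=4` carry their count in the same token, so the sketch treats them as bounded.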

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+while IFS= read -r prepare_cmd; do
+    [ -n "$prepare_cmd" ] || continue
+    echo "oracle-run: prepare> $prepare_cmd" >&2
+    ( cd "$CHECKOUT" && bash -c "$prepare_cmd" )


Preserve Python virtualenv for scip-python

The py-requests profile installs the package into .venv, but each prepare command runs in a child shell and the later oracle subprocess inherits none of that environment. Sourcegraph's scip-python usage notes say to activate the virtualenv before running scip-python index (https://github.com/sourcegraph/scip-python#usage), so this small-tier leg runs against the global Python environment and can emit few or no dependency monikers despite the install step.
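
Activation amounts to a couple of environment variables, so a runner can carry it across process boundaries explicitly. A minimal sketch, assuming a POSIX `.venv/bin` layout; `venv_env` is a hypothetical helper and the runner itself is a bash script:

```python
import os
from pathlib import Path


def venv_env(checkout: Path, base_env: dict[str, str]) -> dict[str, str]:
    """Return a copy of base_env with <checkout>/.venv activated, if present."""
    env = dict(base_env)
    venv = checkout / ".venv"
    bin_dir = venv / "bin"  # POSIX layout; Windows uses Scripts/ instead
    if bin_dir.is_dir():
        env["VIRTUAL_ENV"] = str(venv)
        env["PATH"] = str(bin_dir) + os.pathsep + env.get("PATH", "")
        env.pop("PYTHONHOME", None)  # activate scripts unset this as well
    return env


if __name__ == "__main__":
    env = venv_env(Path.cwd(), dict(os.environ))
    print(env.get("VIRTUAL_ENV", "no .venv found in the current directory"))
```

Passing the returned mapping as `env=` to the later index and report subprocesses gives them the same interpreter resolution an interactive `activate` would.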

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+on:
+  pull_request:
+    paths:
+      - 'crates/**'


Trigger the oracle on workspace dependency changes

This PR gate is meant to run when the rag-rat binary can change, but the path filter only covers crates/** and the oracle tool files. Root Cargo.toml defines workspace dependencies and Cargo.lock pins the actual dependency versions, so a dependency/profile update that touches only those root files skips the small oracle matrix entirely on pull requests and can merge parser/oracle behavior changes without this health gate.

chatgpt-codex-connector · 2026-06-15T20:44:18Z

+  # produced by the same indexer build the heavy/Bencher tier uses.
+  SCIP_CLANG_VERSION: v0.4.0
+  SCIP_PYTHON_VERSION: 0.6.6
+  RUST_ANALYZER_URL: https://github.com/rust-lang/rust-analyzer/releases/latest/download/rust-analyzer-x86_64-unknown-linux-gnu.gz


Pin the rust-analyzer download

The small tier is described as a pinned toolchain, but this downloads releases/latest on every run. When rust-analyzer publishes a new build, the rust-semver PR leg can start using a different SCIP emitter than the thresholds and heavy image were validated with, producing failures or report changes unrelated to the PR while the other tools remain version-pinned.

Six P2s:
- Whole-run timeout (oracle-run.sh): the corpus wall-clock budget wrapped only `oracle report`, so a hung clone/prepare/index sat until the Actions/job timeout. The runner now re-execs itself once under `timeout -k 60s <budget>m`, covering clone + prepare + index + report; an EXIT trap still removes the checkout on a timeout/gate-fail.
- Preserve the virtualenv (oracle-run.sh): activate a prepare-created `.venv` (VIRTUAL_ENV + PATH) before index/report so scip-python (pyright) resolves against the project's installed deps, not the global interpreter.
- Bencher series identity (oracle-report-bmf.py): the benchmark name was keyed only by corpus_id, so a profile/tool-version change (which makes reports incomparable) appended to the old series. Now keyed by corpus_id@<profile_hash12>+<tool_version> so an incomparable change starts a fresh series.
- Bound kernel make (oracle-corpora.toml): `make -j` is unlimited jobs; pinned to `make -j$(nproc)` so the heavy run doesn't oversubscribe the box. Recomputed the golden linux-kernel profile hash.
- Trigger on dep changes (oracle.yml): added root Cargo.toml/Cargo.lock to the PR + push path filters so a workspace-dep bump can't skip the gate.
- Pin rust-analyzer (oracle.yml): install it as a rustup component (pinned to the stable toolchain) instead of downloading releases/latest each run.

skakri · 2026-06-15T21:10:20Z

Addressed in 202cbf0 — all six.

P2 — Apply the corpus timeout to the whole run ✅ The runner now re-execs itself once under timeout -k 60s ${timeout_minutes}m, so the budget covers clone + every prepare step + index + report, not just the report. An EXIT trap still removes the checkout on a timeout/gate-fail. Verified e2e (rust-semver: single re-exec, "budget 8m", completes healthy).

P2 — Preserve Python virtualenv for scip-python ✅ After the prepare steps, the runner activates a .venv it finds in the checkout (VIRTUAL_ENV + PATH) before indexing/reporting, so scip-python (pyright) resolves against the project's installed deps rather than the global interpreter.

P2 — Include profile identity in the Bencher series ✅ oracle-report-bmf.py now names the benchmark {corpus_id}@{profile_hash[:12]}+{tool_version}/oracle (was {corpus_id}/oracle), so a profile change (rev/bindings/prepare/threshold/tool) or a SCIP-indexer bump — both of which make reports incomparable — starts a fresh series instead of appending to the old one.

P2 — Bound kernel make parallelism ✅ tools/oracle-corpora.toml now uses make -j$(nproc) (was make -j, i.e. unlimited jobs). Recomputed the golden linux-kernel profile hash to match.

P2 — Trigger the oracle on workspace dependency changes ✅ Added root Cargo.toml and Cargo.lock to the pull_request + push path filters, so a workspace-dep/lock bump that changes the built binary can't skip the small-tier gate.

P2 — Pin the rust-analyzer download ✅ rust-analyzer is now installed as a rustup component (dtolnay/rust-toolchain@stable with components: rust-analyzer, resolved via rustup which), pinning the SCIP emitter to the stable toolchain (~6-week cadence) instead of releases/latest (weekly).

bash -n, py_compile, YAML parse, cargo +nightly fmt --check, clippy, and the core-oracle suite (incl. the recomputed golden) are all green.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 202cbf0a73

chatgpt-codex-connector · 2026-06-15T21:20:42Z

+echo "oracle-run: oracle report --corpus $CORPUS" >&2
+set +e
+( cd "$CHECKOUT" && RAG_RAT_COMMIT="$RAG_RAT_COMMIT" \
+    "$RAG_RAT_BIN" --json oracle report --corpus "$CORPUS" --corpora "$CORPORA" ) > "$REPORT_OUT"


Add the missing oracle report CLI before invoking it

In this tree the oracle CLI only exposes run and status (crates/rag-rat-cli/src/cli.rs), so every small and heavy corpus reaches this line and then exits with an unknown report subcommand after cloning/indexing. The new workflow depends on this runner for all corpora, so the PR gate and Bencher path cannot produce a valid report until the report subcommand is included or this calls an existing command.

chatgpt-codex-connector · 2026-06-15T21:20:42Z

+            scip-python)
+              npm install -g "@sourcegraph/scip-python@${SCIP_PYTHON_VERSION}"
+              scip-python --version ;;


Add scip-python support before running py-requests

For the py-requests small-matrix leg, this branch installs scip-python, but the checked-in oracle registry still only accepts rust-analyzer and scip-clang (OracleTool::ALL/from_db_str), so once the report command exists that leg cannot map the profile's tool = "scip-python" to a runnable backend and will fail the required PR matrix. Either exclude the Python corpus here or land the scip-python backend with the workflow that starts scheduling it.

skakri · 2026-06-15T21:33:35Z

The two new P1s ("add the oracle report CLI before invoking it" and "add scip-python support before running py-requests") are both the same PR-stack ordering point, and they're correct: this PR is the top of a 3-PR stack and isn't meant to merge standalone.

oracle report is added by feat(oracle): oracle report --corpus <id> — run a corpus + emit its C2 resolution report (C2-CLI) #175 (C2-CLI).
the scip-python backend (OracleTool::from_db_str / ALL) is added by feat(oracle): scip-python backend — Python compiler-grade resolution (B6) #176 (B6).
this PR (feat(oracle): unified tier-driven corpus runner + oracle.yml (C3) #177) is the runner + oracle.yml that drives them.

Merge order is #175 → #176 → #177: once the first two land on main, I rebase #177 onto main and its tree has both the report subcommand and the scip-python backend, so the small matrix (incl. the py-requests leg) and the heavy/Bencher path resolve. Until then #177's own CI is red purely because its dependencies aren't in this branch — the stacking artifact, not a defect in the runner. I'd rather not duplicate #175/#176 into this PR or temporarily exclude the Python corpus (scaffolding I'd just have to revert).

The earlier P2 round (pin rust-analyzer, trigger on Cargo.toml/Cargo.lock, preserve the venv, bound kernel make -j$(nproc), profile-identity in the Bencher series) is addressed in 202cbf0 — those threads can be resolved.

… C2 resolution report (C2-CLI) (#175)

* feat(oracle): `oracle report --corpus <id>` - run a corpus + emit its C2 resolution report (C2-CLI)

Adds the CLI surface over the C2 report contract: load a corpus profile from tools/oracle-corpora.toml, run the oracle (produce a .scip with the corpus's tool, or consume a pre-built --scip), assemble the typed OracleResolutionReport, and emit it as JSON/TOON. Then apply the per-corpus health gate: a violated threshold exits non-zero even when the oracle command itself succeeded.

- The report is printed to stdout unconditionally, before the gate, so a Δ glue script can consume it even for a failing run; violations go to stderr.
- Unlike `oracle run`, a missing/unrunnable tool is a hard error here (not the exit-0 Blocked UX): this is a measurement runner over a corpus whose tool CI is expected to have installed, so a silent skip must not pass green.
- run + resolution_report run under one write lock; .scip production stays outside it (the #82 P3 lock-free-production posture).
- rag_rat_commit provenance reads $RAG_RAT_COMMIT (CI's git SHA), falling back to the crate version off CI.

* fix(oracle): address Codex review on the report command (#175)

- Roll back an unhealthy run (P2): a corpus that fails its health gate had already committed edge_oracle/monikers/oracle_runs, so it became the authoritative latest run and surfaced untrustworthy Compiler verdicts in later status/query despite the non-zero exit. `oracle report` now rolls the run back (delete verdicts + monikers + the oracle_runs row) atomically inside the same write lock when the gate fails (new oracle::rollback_run + store::delete_oracle_run + IndexDatabase::rollback_oracle_run). Verified e2e: after a failed gate, edge_oracle/oracle_runs/monikers are empty and status reports no verdicts.
- Validate the checkout matches the corpus before stamping its profile (P2): fail closed unless the active checkout's target bindings (language -> dirs) exactly equal the corpus profile's bindings, so `oracle report --corpus X` can't stamp X's corpus_profile_hash onto a different population. New ensure_checkout_matches_corpus + unit test.

The third comment (honor timeout_minutes in the report run) is already handled one layer up: timeout_minutes is the corpus wall-clock budget the runner owns (documented on CorpusHealth), and tools/oracle-run.sh (C3, #177) wraps the `oracle report` invocation in `timeout ${timeout_minutes}m`.

* fix(oracle): reject custom target filters in oracle report's corpus check (#175)

Codex follow-up: ensure_checkout_matches_corpus compared only language -> directory set, so a [[target]] with the same language+dirs but custom include/exclude filters slipped through; the indexer applies those filters, so the report could stamp the corpus_profile_hash onto a filtered subset/superset. Now also require each target to carry the simple [target_bindings] form's default filters (include = ["**/*.<ext>"], empty exclude); any custom filter fails closed. Extended the unit test with custom-exclude and narrowed-include cases.

* fix(oracle): make oracle report's run provisional, not rollback-after-commit (#175)

Codex round 2 found my post-hoc rollback couldn't restore state: run::run's authoritative clear destroys the prior (tool,version) verdicts + the tool's monikers at the run's START, so deleting the failed run afterward left a prior healthy run with no verdicts/monikers (NoData), and the version-keyed delete could erase a prior healthy same-version run.

Fix: the report path no longer commits-then-maybe-deletes. run::run is split into run() (commit-on-success wrapper) + run_in_tx() (the body). New oracle::run_oracle_report runs run_in_tx + report assembly + the health gate inside ONE transaction and commits ONLY if healthy; an unhealthy run drops the transaction, rolling back the whole pass INCLUDING the authoritative clear, so the previous healthy run's verdicts/monikers/run-row are fully preserved. Removes rollback_run / delete_oracle_run / rollback_oracle_run / the finalize_corpus_report helper. Verified e2e: a healthy run then a gate-failing run on the same checkout leaves the healthy run's verdicts intact (oracle status still reports them).
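
The commit-only-if-healthy design described above can be illustrated with a small sqlite3 sketch. This is a Python stand-in, not the Rust implementation: the two-column `edge_oracle` schema, the `run_report` function, and the `healthy` flag are simplifications assumed for the example.

```python
import sqlite3


def run_report(db: sqlite3.Connection, tool: str, verdicts: list[str], healthy: bool) -> bool:
    """Replace a tool's verdicts, but keep the change only if the gate passes.

    Expects a connection opened with isolation_level=None so that the
    explicit BEGIN/COMMIT/ROLLBACK statements below control the transaction.
    """
    db.execute("BEGIN")
    try:
        # The authoritative clear happens inside the transaction ...
        db.execute("DELETE FROM edge_oracle WHERE tool = ?", (tool,))
        db.executemany(
            "INSERT INTO edge_oracle (tool, verdict) VALUES (?, ?)",
            [(tool, v) for v in verdicts],
        )
        if not healthy:
            # ... so a failed gate undoes the clear along with the new rows.
            db.execute("ROLLBACK")
            return False
        db.execute("COMMIT")
        return True
    except Exception:
        if db.in_transaction:
            db.execute("ROLLBACK")
        raise


if __name__ == "__main__":
    db = sqlite3.connect(":memory:", isolation_level=None)
    db.execute("CREATE TABLE edge_oracle (tool TEXT, verdict TEXT)")
    run_report(db, "rust-analyzer", ["a", "b"], healthy=True)
    run_report(db, "rust-analyzer", ["c"], healthy=False)
    print(db.execute("SELECT verdict FROM edge_oracle ORDER BY verdict").fetchall())
```

After a healthy run followed by a gate-failing run, the table still holds the healthy run's rows, which is the property the post-hoc delete could not provide.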

Replaces the per-language oracle-rust.yml / oracle-kernel.yml demos with one declarative, tier-driven runner over tools/oracle-corpora.toml (#164, C3).

- tools/oracle-run.sh: run the oracle for ONE corpus end to end: read its profile, shallow-clone the repo at its pinned rev, run its prepare steps, index it with rag-rat, then `rag-rat oracle report` (which runs the oracle + applies the per-corpus health gate). Exits non-zero on a health violation while still writing the report JSON, so a Δ glue script can consume it.
- tools/oracle-corpus.py: stdlib (tomllib) reader the bash runner shells out to for the pre-index fields (repo/rev/prepare/bindings) + tier corpus listing.
- tools/oracle-report-bmf.py: report JSON -> Bencher BMF glue for the heavy tier (rag-rat emits JSON only; presentation/Bencher shaping is a glue concern).
- .github/workflows/oracle.yml: small tier on PRs + main (GitHub-hosted matrix, per-corpus tool install, report artifact, health gate as the PR gate); heavy tier on release/dispatch (self-hosted bigmem, serial, pushed to Bencher).
- Deletes the superseded oracle-rust.yml/oracle-kernel.yml workflows and their rust-scip-oracle.sh/kernel-c-oracle.sh scripts; updates docs/benchmarks.md.

Verified end to end locally on the small tier: rust-semver (rust-analyzer, 1056 edges resolved 412->936) and c-cjson (cmake compdb + scip-clang, 3941 edges resolved 2742->3408) both run clean through the runner and pass the gate.

chatgpt-codex-connector Bot reviewed Jun 15, 2026

skakri mentioned this pull request Jun 15, 2026

feat(oracle): oracle report --corpus <id> — run a corpus + emit its C2 resolution report (C2-CLI) #175

Merged

chatgpt-codex-connector Bot reviewed Jun 15, 2026

skakri added 2 commits June 16, 2026 00:36

skakri force-pushed the feat/oracle-unified-runner branch from 202cbf0 to e9226e3 Compare June 15, 2026 21:38

skakri merged commit a69f06f into main Jun 15, 2026
15 checks passed

skakri deleted the feat/oracle-unified-runner branch June 15, 2026 21:41

This was referenced Jun 15, 2026

chore: release v0.7.0 #161

Open

Multi-language SCIP oracle + Python (small-corpus runner) #164

Closed
