feat(oracle): scip-python backend — Python compiler-grade resolution (B6) by skakri · Pull Request #176 · cq27-dev/rag-rat

skakri · 2026-06-15T20:19:01Z

Part of the multi-language SCIP-oracle runner epic (#164). Adds the B6 scip-python backend so the SCIP oracle resolves Python edges to the Compiler tier — unblocking oracle report --corpus py-requests (the Python corpus was already declared in tools/oracle-corpora.toml from #171, and #175 added the report command; this makes its tool runnable).

What

OracleTool::ScipPython added to the oracle tool registry:

Manifest entry: program scip-python, language python, install hint.
Capability probe: scip-python index --help exits 0 (the analog of rust-analyzer's scip --help check); --version reports the tool version.
Invocation: scip-python index --project-name <root-basename> --cwd <root> --output <abs>. The project name gives in-corpus symbols a non-empty moniker package (what lets count_symbols_with_moniker see them); --cwd is where scip-python resolves the project + its installed deps.
No compile_commands.json-style prerequisite: installing deps is the corpus prepare step's job, and a failed environment shows up as a near-zero moniker count the report health gate already catches — so prerequisite_blocked is None.
CLI --tool scip-python + exhaustive enum coverage; argv unit test.
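
For readers who want the invocation spelled out, here is a minimal sketch of how the argv described above could be assembled. `scip_python_argv` and the `"project"` fallback are hypothetical names chosen for illustration; they are not the PR's actual `scip_command` code.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Builds the argument list for `scip-python index` as described above.
/// `scip_python_argv` is a hypothetical helper, not the PR's implementation.
fn scip_python_argv(root: &Path, output: &Path) -> Vec<OsString> {
    // The project name is the basename of the root. The "project" fallback
    // for a path with no final component (e.g. `/`) is an assumption.
    let project_name = root
        .file_name()
        .map(OsString::from)
        .unwrap_or_else(|| OsString::from("project"));

    vec![
        OsString::from("index"),
        OsString::from("--project-name"),
        project_name,
        OsString::from("--cwd"),
        root.as_os_str().to_os_string(),
        OsString::from("--output"),
        output.as_os_str().to_os_string(),
    ]
}
```

The resulting vector can be handed to `std::process::Command::new("scip-python").args(...)`; keeping argv construction separate from process spawning is what makes an argv unit test possible without the tool installed.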

The SCIP occurrence→edge join is language-agnostic (keyed on document paths + positions), so no join changes were needed.
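
A rough sketch of why a join keyed on document path and position needs no per-language logic. The `SiteKey` type, `join_edges` function, and the placeholder symbol strings are all hypothetical and only illustrate the idea; the real join lives in the oracle core and may differ.

```rust
use std::collections::HashMap;

/// Position key shared by extractor edges and SCIP occurrences (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct SiteKey {
    path: String,
    line: u32,
    column: u32,
}

/// Joins each edge to the SCIP symbol found at the same (path, line, column).
/// Nothing here inspects the language of the file or the symbol.
fn join_edges<'a>(
    edges: &'a [(SiteKey, &'a str)],       // (call site, heuristic callee name)
    occurrences: &'a [(SiteKey, &'a str)], // (occurrence site, SCIP symbol)
) -> Vec<(&'a str, Option<&'a str>)> {
    let by_site: HashMap<&SiteKey, &str> =
        occurrences.iter().map(|(k, sym)| (k, *sym)).collect();
    edges
        .iter()
        .map(|(k, name)| (*name, by_site.get(k).copied()))
        .collect()
}
```

An edge with no occurrence at its site simply joins to `None`, regardless of which indexer produced the occurrences.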

Operational notes (documented in `docs/oracle.md`)

scip-python derives --project-version from the git revision, so it expects a git checkout — real corpora are cloned at a pinned rev, so this is fine; a non-git dir makes it crash on an undefined version.
Imports resolve against installed dependencies (a venv), so cross-package edges need the deps importable.

Verification

New manifest argv test scip_python_indexes_a_cwd_with_a_project_name; the existing exhaustive every_tool_has_a_manifest_entry now covers the third variant.
Live probe: oracle status --tool scip-python → available 0.6.6.
End-to-end on a small git-initialized Python project (3 files, 8 edges): 5 edges examined → 4 confirmed, 1 resolved-external, 4 monikers written, verdicts persisted.
cargo +nightly fmt --check, cargo clippy --all-targets, and the oracle core + cli suites are green.

Stacking

Independent of #175 (C2-CLI) at the code level — both branch off main and touch disjoint code. Either can merge first; once both land, oracle report --corpus py-requests is fully runnable.

…(B6)

Adds OracleTool::ScipPython to the oracle tool registry so the SCIP oracle resolves Python edges (#164 B6, unblocks `oracle report --corpus py-requests`).

- Manifest entry: program `scip-python`, language `python`, install hint.
- Capability probe: `scip-python index --help` exits 0 (analog of rust-analyzer's `scip --help`); `--version` reports the tool version.
- Invocation: `scip-python index --project-name <root-basename> --cwd <root> --output <abs>`. The project name gives in-corpus symbols a non-empty moniker package; `--cwd` is where it resolves the project + installed deps.
- No compile_commands.json-style prerequisite: the deps install is the corpus `prepare` step's job, and a failed environment shows up as a near-zero moniker count the report health gate already catches.
- CLI `--tool scip-python` + exhaustive enum coverage; argv unit test.

The SCIP occurrence->edge join is language-agnostic, so no join changes were needed.

Verified end-to-end on a small Python project: 5 edges examined, 4 confirmed, 1 resolved-external, 4 monikers written.

codecov · 2026-06-15T20:25:16Z

Codecov Report

❌ Patch coverage is 88.52459% with 14 lines in your changes missing coverage. Please review.
✅ All tests successful. No failed tests found.

Files with missing lines	Patch %	Lines
crates/rag-rat-cli/src/main.rs	0.00%	6 Missing ⚠️
crates/rag-rat-core/src/index/oracle/manifest.rs	88.63%	5 Missing ⚠️
crates/rag-rat-core/src/index/query_api/graph.rs	96.36%	2 Missing ⚠️
crates/rag-rat-cli/src/cli.rs	0.00%	1 Missing ⚠️


chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: ffc151677a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

Open a pull request for review
Mark a draft as ready
Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

chatgpt-codex-connector · 2026-06-15T20:30:23Z

+    /// `scip-python index` — Python (#164 B6). Resolves imports against the project's INSTALLED
+    /// dependencies, so the corpus must install them (a venv) first; an unresolved environment
+    /// shows up as a near-zero moniker count the report's health gate catches.
+    ScipPython,


Route scip-python through graph surfacing

When users run rag-rat oracle run --tool scip-python on a Python repo, the verdicts are persisted under the new scip-python tool, but the graph read paths still only query OracleTool::RustAnalyzer in enrich_hops_with_oracle and compare_graph_to_scip (crates/rag-rat-core/src/index/query_api/graph.rs). That means the new backend can complete and show up in status while trace_callees/find_callers/SCIP comparisons never see its rows, so Python edges do not get the advertised compiler tier. Please route surfacing/comparison through the tool registry or select the oracle tool for the edge language before exposing this variant.


chatgpt-codex-connector · 2026-06-15T20:30:24Z

+                cmd.arg("index")
+                    .arg("--project-name")
+                    .arg(project_name)


Pin scip-python project version

This invocation never passes --project-version, so scip-python falls back to the checkout's git revision (the new docs note this behavior). Because rag-rat stores the full SCIP symbol string in logical_symbol_monikers and later resolves memories by exact moniker for the same tool, every commit changes all Python monikers even when symbols did not move, breaking the advertised stable memory relocation for scip-python runs. Pass a stable project version (for example the conventional _) when invoking the indexer.
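
To make the failure mode concrete, here is a small sketch of exact-match moniker lookup when the version field changes between commits. It assumes the general SCIP symbol layout of `<scheme> <manager> <package> <version> <descriptors>`; the package name, revisions, descriptor string, and the `symbol`/`relocate` helpers are illustrative, not captured from a real scip-python run or from rag-rat's code.

```rust
use std::collections::HashMap;

/// Formats a symbol string in the general SCIP layout (illustrative values).
fn symbol(package: &str, version: &str, descriptors: &str) -> String {
    format!("scip-python python {package} {version} {descriptors}")
}

/// Hypothetical exact-moniker lookup, standing in for memory relocation.
fn relocate<'a>(store: &HashMap<String, &'a str>, moniker: &str) -> Option<&'a str> {
    store.get(moniker).copied()
}
```

With a revision-derived version, a memory stored under the symbol from one commit is not found under the symbol from the next commit, even though the descriptor is unchanged. With a constant version such as `_`, both commits produce the same string and the lookup succeeds.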


chatgpt-codex-connector · 2026-06-15T20:30:24Z

    /// Every known oracle tool, for "report on all tools" surfaces (`oracle status` with no
    /// `--tool`). Later language backends (#72 Kotlin) extend this alongside the enum.
-    pub const ALL: &[OracleTool] = &[Self::RustAnalyzer, Self::ScipClang];
+    pub const ALL: &[OracleTool] = &[Self::RustAnalyzer, Self::ScipClang, Self::ScipPython];


Filter auto-run tools by configured language

Adding ScipPython to OracleTool::ALL also feeds the MCP background auto-run loop, which iterates OracleTool::ALL without checking config.targets. In an auto-run repo where scip-python is installed but Python is not indexed, the server will still invoke the Python indexer against the checkout; if it fails, the error is swallowed and no oracle_runs row is recorded, so the loop retries on every poll. Keep the status registry broad, but gate background runs to tools whose languages are present in the config.
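
The suggested gating could look roughly like the sketch below. The `Language` enum, the `language()` mapping (including treating scip-clang as a C backend), and `auto_run_candidates` are assumptions made for illustration; only `OracleTool::ALL` and its three variants come from the diff above.

```rust
/// Hypothetical language tag for configured index targets.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Language {
    Rust,
    C,
    Python,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum OracleTool {
    RustAnalyzer,
    ScipClang,
    ScipPython,
}

impl OracleTool {
    /// Broad registry, still used unfiltered by status-style surfaces.
    const ALL: &'static [OracleTool] =
        &[Self::RustAnalyzer, Self::ScipClang, Self::ScipPython];

    /// Assumed one-language-per-tool mapping, for illustration only.
    fn language(self) -> Language {
        match self {
            Self::RustAnalyzer => Language::Rust,
            Self::ScipClang => Language::C,
            Self::ScipPython => Language::Python,
        }
    }
}

/// Tools the background loop may run: only those whose language is configured.
fn auto_run_candidates(targets: &[Language]) -> Vec<OracleTool> {
    OracleTool::ALL
        .iter()
        .copied()
        .filter(|tool| targets.contains(&tool.language()))
        .collect()
}
```

Filtering at the loop rather than shrinking `ALL` keeps the status listing complete while preventing an indexer from being spawned for a language the checkout does not index.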


github-actions · 2026-06-15T20:31:07Z

Bencher Report

Branch	feat/scip-python-oracle
Testbed	ubuntu-latest

⚠️ WARNING: No Threshold found!
Without a Threshold, no Alerts will ever be generated for these measures: Estimated Cycles (cycles), LL Hits (hits), L1 Hits (hits), RAM Hits (hits), Total read+write (reads/writes).
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the --ci-only-thresholds flag.

🚨 2 Alerts

Benchmark	Measure (units)	Benchmark Result (Result Δ%)	Upper Boundary (Limit %)
rag_pipeline::pipeline::index cargo_resolver:resolver_config()	Instructions (instructions x 1e6)	800.96 x 1e6 (+40.05%), baseline 571.89 x 1e6	583.33 x 1e6 (137.31%)
rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config()	Instructions (instructions x 1e6)	154.52 x 1e6 (+15.92%), baseline 133.30 x 1e6	135.96 x 1e6 (113.65%)

All benchmark results

Benchmark	Estimated Cycles (cycles)	Instructions (Result Δ%)	Instructions Upper Boundary (Limit %)	L1 Hits (hits)	LL Hits (hits)	RAM Hits (hits)	Total read+write (reads/writes)
rag_pipeline::pipeline::index cargo_resolver:resolver_config()	1,214.51 x 1e6	800.96 x 1e6 (+40.05%), baseline 571.89 x 1e6 (alert)	583.33 x 1e6 (137.31%)	1,116.31 x 1e6	17.47 x 1e6	309.53 x 1e3	1,134.09 x 1e6
rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config()	240.97 x 1e6	154.52 x 1e6 (+15.92%), baseline 133.30 x 1e6 (alert)	135.96 x 1e6 (113.65%)	230.81 x 1e6	1.95 x 1e6	11.74 x 1e3	232.77 x 1e6
rag_pipeline::pipeline::query_warm cargo_resolver:resolver_index()	232.96 x 1e6	149.14 x 1e6 (-9.07%), baseline 164.02 x 1e6	167.30 x 1e6 (89.14%)	223.15 x 1e6	1.87 x 1e6	13.20 x 1e3	225.03 x 1e6


github-actions · 2026-06-15T20:31:09Z

Bencher Report

Branch	feat/scip-python-oracle
Testbed	ubuntu-latest

All benchmark results

Benchmark	Latency: Benchmark Result, seconds (Result Δ%)	Upper Boundary, seconds (Limit %)
index_time/full_rebuild_cargo	4.84 s (+30.31%), baseline 3.72 s	5.77 s (84.00%)


- Route surfacing through ALL backends, not just rust-analyzer (P2): graph read paths hardcoded OracleTool::RustAnalyzer, so scip-python (and scip-clang) verdicts never surfaced as the Compiler tier in find_callers/trace_callees or compare_graph_to_scip. New oracle::latest_runs_in_scope enumerates every tool with a run in the checkout; enrich_hops_with_oracle merges their verdicts and compare_graph_to_scip aggregates across them (verdict sets are disjoint by edge language). Regression test: a scip-clang verdict now surfaces + the report names the contributing tool.
- Pin scip-python --project-version _ (P2): it otherwise defaults to the git revision, which is embedded in every SCIP symbol, so every commit churned all Python monikers and broke moniker-anchored memory relocation. The constant keeps monikers stable across commits (and fixes the crash on a non-git checkout, verified e2e).
- Gate background auto-run by configured language (P2): the MCP auto-run loop iterated OracleTool::ALL without checking config.targets, so scip-python installed + Python not indexed would invoke the Python indexer every poll, fail, swallow the error, record no run row, and retry forever. Now skips a backend whose language the checkout doesn't index (status registry stays broad).

skakri · 2026-06-15T20:57:24Z

Addressed in 84cc401.

P2 — Route scip-python through graph surfacing ✅ Fixed (and it covered the pre-existing scip-clang gap too). enrich_hops_with_oracle and compare_graph_to_scip no longer hardcode OracleTool::RustAnalyzer — new oracle::latest_runs_in_scope enumerates every backend with a run in the checkout, enrich_hops_with_oracle merges their verdicts, and compare_graph_to_scip aggregates across them (verdict sets are disjoint by edge language, so merging can't collide; the report's tool/tool_version now name all contributing backends). Added a regression test (compare_graph_to_scip_surfaces_non_rust_analyzer_tools) asserting a scip-clang verdict surfaces and the report names the tool.
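
As a rough illustration of the merge described above (not the actual `latest_runs_in_scope` / `enrich_hops_with_oracle` code), the sketch below folds the verdicts of every run into one map and records which tools had a run. The `Run` struct, `merge_runs`, and the numeric edge ids are hypothetical.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for one backend's latest run in the checkout.
struct Run {
    tool: &'static str,
    /// (edge id, confirmed?) pairs; the ids are illustrative.
    verdicts: Vec<(u64, bool)>,
}

/// Merges verdicts from every run and lists the tools that had a run.
fn merge_runs(runs: &[Run]) -> (BTreeMap<u64, (&'static str, bool)>, Vec<&'static str>) {
    let mut merged = BTreeMap::new();
    let mut tools = Vec::new();
    for run in runs {
        tools.push(run.tool);
        for &(edge, confirmed) in &run.verdicts {
            // Verdict sets are expected to be disjoint by edge language; if an
            // edge id ever repeated, this sketch keeps the first verdict seen.
            merged.entry(edge).or_insert((run.tool, confirmed));
        }
    }
    (merged, tools)
}
```

Because each edge belongs to one language and each language to one backend, the merge order should not matter in practice; the first-writer-wins rule is only a defensive choice in this sketch.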

P2 — Pin scip-python project version ✅ Fixed. scip_command now passes --project-version _, so the SCIP symbol strings (and the monikers persisted from them) stay stable across commits, preserving moniker-anchored memory relocation. Bonus: this also fixes scip-python's crash on a non-git checkout (the git-rev default was undefined there) — verified end-to-end on a non-git Python project (4 edges examined, 3 confirmed, 3 monikers, exit 0). Manifest argv test + docs updated.

P2 — Filter auto-run tools by configured language ✅ Fixed. The MCP background auto-run loop now skips any backend whose languages aren't in config.targets before probing/deciding — so scip-python installed without a Python target no longer spawns a doomed indexer (and silently retries) every poll. The status registry (oracle status, which lists all tools) stays broad; only background runs are gated.

cargo +nightly fmt --check, clippy --all-targets, and the cli + core-oracle suites (incl. the new surfacing test) are green.

chatgpt-codex-connector

💡 Codex Review

rag-rat/crates/rag-rat-core/src/index/query_api/graph.rs

Line 511 in 84cc401

if summary.verdicts_examined == 0 {

Preserve empty-run warnings per backend

In a mixed-language checkout, this aggregate check only warns when every backend has zero current verdicts. If, for example, rust-analyzer has valid verdicts but the new scip-python run produced 0 in-scope rows because the Python indexer saw the wrong paths or all Python files drifted, summary.verdicts_examined is still nonzero and compare_graph_to_scip silently reports no issue for the empty Python backend. Track the per-tool comparison count and warn for each run that contributes 0 rows so one healthy backend does not mask a broken one.
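
A minimal sketch of the per-backend check being asked for, contrasted with the aggregate check. The function names and the warning text are hypothetical; the real summary type in `graph.rs` is not shown in this thread.

```rust
/// Aggregate check: fires only when every backend contributed nothing.
fn aggregate_is_empty(per_tool: &[(&str, usize)]) -> bool {
    per_tool.iter().map(|(_, n)| *n).sum::<usize>() == 0
}

/// Per-backend check: one warning for each tool that contributed 0 rows.
fn empty_run_warnings(per_tool: &[(&str, usize)]) -> Vec<String> {
    per_tool
        .iter()
        .filter(|(_, n)| *n == 0)
        .map(|(tool, _)| format!("{tool}: run contributed 0 in-scope verdicts"))
        .collect()
}
```

With counts of 12 for rust-analyzer and 0 for scip-python, the aggregate check stays silent while the per-backend check reports the empty Python run.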


chatgpt-codex-connector · 2026-06-15T21:14:17Z

+    /// `scip-python index` — Python (#164 B6). Resolves imports against the project's INSTALLED
+    /// dependencies, so the corpus must install them (a venv) first; an unresolved environment
+    /// shows up as a near-zero moniker count the report's health gate catches.
+    ScipPython,


Count Python constructor references in recall

Enabling ScipPython sends Python runs through the existing recall-gap code, but count_uncovered_calls only considers SCIP symbols that scip::symbol_is_callable accepts, i.e. symbols ending in `).`. scip-python represents class constructor call references such as ClientSession() as references to the class/type symbol (`...ClientSession#`), while our Python extractor explicitly emits these as CallsName, so constructor calls that the heuristic misses never increment oracle_only_calls; this inflates Python recall and can make the new py-requests oracle report look healthier than it is. Please make the callable filter language/tool-aware before treating scip-python reports as compiler-grade metrics.


…176) Codex follow-up: the recall numerator (covered_calls) counted a calls_name edge matched to ANY SCIP occurrence, while the denominator's uncovered side requires scip::symbol_is_callable (a `).` method/function symbol). scip-python records a constructor call (`Foo()`, which our extractor emits as CallsName) as a reference to the CLASS symbol (`Foo#`), so a covered constructor counted while a MISSED one — invisible to the callable-filtered uncovered side — never offset it, inflating Python recall. Fix: the covered side now also requires the matched symbol be callable, so both sides measure the same population (`).` method/function calls). Constructors are out of recall entirely because SCIP can't distinguish a constructor call from a type annotation of the same `Foo#` symbol. Rust/C are unaffected (their calls_name edges already match `).` symbols). Regression test added.


skakri · 2026-06-15T21:31:49Z

P2 — Count Python constructor references in recall ✅ Addressed in 40d12d2 — by making recall's two sides use the same callable population, which fixes the inflation you identified.

Root cause confirmed: covered_calls counted a calls_name edge matched to any SCIP occurrence, while count_uncovered_calls (the denominator's uncovered side) requires scip::symbol_is_callable (a symbol ending in `).`). scip-python records a constructor call (`Foo()`, which our extractor emits as CallsName) as a reference to the class symbol (`Foo#`), so a covered constructor counted while a missed one, invisible to the callable-filtered uncovered side, never offset it, which inflated recall.

The covered side now also requires symbol_is_callable, so both sides measure the same population (`).` method/function calls). I went with exclude-on-both-sides rather than include constructors because accurate inclusion isn't possible from SCIP alone: a `Foo#` occurrence is identical whether it's a constructor call (`Foo()`) or a type annotation (`x: Foo`), so counting `#` as callable on the uncovered side would over-count type refs as missed calls and deflate recall. Confining recall to `).` callables is the well-defined, symmetric metric; constructor-call recall would need a call-role SCIP doesn't carry. Rust/C are unaffected (their calls_name edges already match `).` symbols). Regression test covered_side_requires_a_callable_scip_symbol added.
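
To show the effect of the symmetric filter with numbers, here is a small sketch. It assumes recall is computed as covered / (covered + uncovered), which this thread implies but does not state outright; `recall` is a hypothetical helper, the local `symbol_is_callable` only mirrors the described `).` suffix rule, and the symbol strings are illustrative.

```rust
/// Mirrors the described rule: a callable SCIP symbol ends in `).`.
fn symbol_is_callable(symbol: &str) -> bool {
    symbol.ends_with(").")
}

/// Recall over callable symbols only, applied to both sides.
/// Returns None when no callable symbol appears on either side.
fn recall(covered: &[&str], uncovered: &[&str]) -> Option<f64> {
    let c = covered.iter().filter(|s| symbol_is_callable(s)).count();
    let u = uncovered.iter().filter(|s| symbol_is_callable(s)).count();
    let total = c + u;
    if total == 0 {
        None
    } else {
        Some(c as f64 / total as f64)
    }
}
```

Given one covered function call, one covered constructor reference (`Foo#`), and one uncovered function call, filtering both sides yields 1 / 2. Counting the constructor on the covered side only would have yielded 2 / 3, which is the inflation described above.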

chatgpt-codex-connector Bot reviewed Jun 15, 2026


skakri mentioned this pull request Jun 15, 2026

feat(oracle): unified tier-driven corpus runner + oracle.yml (C3) #177

Merged


skakri mentioned this pull request Jun 15, 2026

feat(oracle): oracle report --corpus <id> — run a corpus + emit its C2 resolution report (C2-CLI) #175

Merged

skakri merged commit 321c7cb into main Jun 15, 2026
10 checks passed

skakri deleted the feat/scip-python-oracle branch June 15, 2026 21:36

This was referenced Jun 15, 2026

chore: release v0.7.0 #161

Open

Multi-language SCIP oracle + Python (small-corpus runner) #164

Closed
