feat(oracle): scip-python backend — Python compiler-grade resolution (B6)#176
…(B6) Adds OracleTool::ScipPython to the oracle tool registry so the SCIP oracle resolves Python edges (#164 B6, unblocks `oracle report --corpus py-requests`).
- Manifest entry: program `scip-python`, language `python`, install hint.
- Capability probe: `scip-python index --help` exits 0 (analog of rust-analyzer's `scip --help`); `--version` reports the tool version.
- Invocation: `scip-python index --project-name <root-basename> --cwd <root> --output <abs>`. The project name gives in-corpus symbols a non-empty moniker package; `--cwd` is where it resolves the project + installed deps.
- No compile_commands.json-style prerequisite: the deps install is the corpus `prepare` step's job, and a failed environment shows up as a near-zero moniker count the report health gate already catches.
- CLI `--tool scip-python` + exhaustive enum coverage; argv unit test.
The SCIP occurrence->edge join is language-agnostic, so no join changes were needed. Verified end-to-end on a small Python project: 5 edges examined, 4 confirmed, 1 resolved-external, 4 monikers written.
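The invocation described above can be sketched as a small argv builder. This is a minimal illustration, not the PR's code: the helper name `scip_python_argv` and the `"project"` fallback are hypothetical, and only the flags named in this description (`index`, `--project-name`, `--cwd`, `--output`) are used.

```rust
use std::ffi::OsString;
use std::path::Path;

/// Build the argv for `scip-python index` (hypothetical helper for illustration).
/// `root` is the corpus checkout; `output` is the absolute path of the SCIP index.
pub fn scip_python_argv(root: &Path, output: &Path) -> Vec<OsString> {
    // The project name is the basename of the root, so in-corpus symbols get a
    // non-empty moniker package. The "project" fallback is an assumption for
    // roots without a final component (for example `/`).
    let project_name = root
        .file_name()
        .map(OsString::from)
        .unwrap_or_else(|| OsString::from("project"));

    vec![
        OsString::from("index"),
        OsString::from("--project-name"),
        project_name,
        OsString::from("--cwd"),
        root.as_os_str().to_os_string(),
        OsString::from("--output"),
        output.as_os_str().to_os_string(),
    ]
}
```

Returning the argv as a plain vector keeps it easy to assert on in a unit test without spawning the indexer.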
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ffc151677a
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| /// `scip-python index` — Python (#164 B6). Resolves imports against the project's INSTALLED | ||
| /// dependencies, so the corpus must install them (a venv) first; an unresolved environment | ||
| /// shows up as a near-zero moniker count the report's health gate catches. | ||
| ScipPython, |
Route scip-python through graph surfacing
When users run rag-rat oracle run --tool scip-python on a Python repo, the verdicts are persisted under the new scip-python tool, but the graph read paths still only query OracleTool::RustAnalyzer in enrich_hops_with_oracle and compare_graph_to_scip (crates/rag-rat-core/src/index/query_api/graph.rs). That means the new backend can complete and show up in status while trace_callees/find_callers/SCIP comparisons never see its rows, so Python edges do not get the advertised compiler tier. Please route surfacing/comparison through the tool registry or select the oracle tool for the edge language before exposing this variant.
| cmd.arg("index") | ||
| .arg("--project-name") | ||
| .arg(project_name) |
Pin scip-python project version
This invocation never passes --project-version, so scip-python falls back to the checkout's git revision (the new docs note this behavior). Because rag-rat stores the full SCIP symbol string in logical_symbol_monikers and later resolves memories by exact moniker for the same tool, every commit changes all Python monikers even when symbols did not move, breaking the advertised stable memory relocation for scip-python runs. Pass a stable project version (for example the conventional _) when invoking the indexer.
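To see why an unpinned version churns every moniker, here is a minimal sketch. It assumes a simplified SCIP symbol layout of `<scheme> <manager> <package> <version> <descriptors>`; the helper names and the example descriptor are hypothetical and only illustrate the exact-match problem.

```rust
/// Illustrative, simplified SCIP symbol layout (an assumption for this sketch):
/// `<scheme> <manager> <package> <version> <descriptors>`.
pub fn scip_symbol(package: &str, version: &str, descriptors: &str) -> String {
    format!("scip-python python {package} {version} {descriptors}")
}

/// Version used for a run: the pinned constant if one is given,
/// otherwise the checkout's git revision (the fallback the review describes).
pub fn project_version<'a>(pinned: Option<&'a str>, git_rev: &'a str) -> &'a str {
    pinned.unwrap_or(git_rev)
}
```

With `pinned = None`, the same function indexed at two different commits yields two different symbol strings, so an exact-moniker lookup misses. With `pinned = Some("_")`, both commits yield the same string.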
| /// Every known oracle tool, for "report on all tools" surfaces (`oracle status` with no | ||
| /// `--tool`). Later language backends (#72 Kotlin) extend this alongside the enum. | ||
| pub const ALL: &[OracleTool] = &[Self::RustAnalyzer, Self::ScipClang]; | ||
| pub const ALL: &[OracleTool] = &[Self::RustAnalyzer, Self::ScipClang, Self::ScipPython]; |
Filter auto-run tools by configured language
Adding ScipPython to OracleTool::ALL also feeds the MCP background auto-run loop, which iterates OracleTool::ALL without checking config.targets. In an auto-run repo where scip-python is installed but Python is not indexed, the server will still invoke the Python indexer against the checkout; if it fails, the error is swallowed and no oracle_runs row is recorded, so the loop retries on every poll. Keep the status registry broad, but gate background runs to tools whose languages are present in the config.
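One possible shape for that gate, as a hedged sketch: the enum mirrors the variants in this diff, but `language()` and `auto_run_tools` are hypothetical names, and the `"rust"` and `"c"` language strings are assumptions (only `python` is stated in the manifest entry).

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum OracleTool {
    RustAnalyzer,
    ScipClang,
    ScipPython,
}

impl OracleTool {
    /// Broad registry, still used for status reporting.
    pub const ALL: &'static [OracleTool] =
        &[Self::RustAnalyzer, Self::ScipClang, Self::ScipPython];

    /// Language each backend resolves. "python" comes from the manifest entry;
    /// "rust" and "c" are assumed for this sketch.
    pub fn language(self) -> &'static str {
        match self {
            Self::RustAnalyzer => "rust",
            Self::ScipClang => "c",
            Self::ScipPython => "python",
        }
    }
}

/// Tools the background loop may run: only those whose language the
/// checkout's config actually indexes.
pub fn auto_run_tools(indexed_languages: &[&str]) -> Vec<OracleTool> {
    OracleTool::ALL
        .iter()
        .copied()
        .filter(|tool| indexed_languages.contains(&tool.language()))
        .collect()
}
```

The status path can keep iterating `OracleTool::ALL` unchanged; only the auto-run loop would go through the filtered list.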
| Branch | feat/scip-python-oracle |
| Testbed | ubuntu-latest |
🚨 2 Alerts
| Benchmark | Measure (units) | Result (Δ%) | Baseline | Upper boundary (limit %) |
|---|---|---|---|---|
| rag_pipeline::pipeline::index cargo_resolver:resolver_config() | Instructions (x 1e6) | 800.96 x 1e6 (+40.05%) | 571.89 x 1e6 | 583.33 x 1e6 (137.31%) |
| rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config() | Instructions (x 1e6) | 154.52 x 1e6 (+15.92%) | 133.30 x 1e6 | 135.96 x 1e6 (113.65%) |
| Benchmark | Estimated cycles | Instructions (Δ%) | Instructions baseline | Upper boundary, instructions (limit %) | L1 hits | LL hits | RAM hits | Total read+write |
|---|---|---|---|---|---|---|---|---|
| rag_pipeline::pipeline::index cargo_resolver:resolver_config() | 1,214.51 x 1e6 | 🚨 800.96 x 1e6 (+40.05%) | 571.89 x 1e6 | 583.33 x 1e6 (137.31%) | 1,116.31 x 1e6 | 17.47 x 1e6 | 309.53 x 1e3 | 1,134.09 x 1e6 |
| rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config() | 240.97 x 1e6 | 🚨 154.52 x 1e6 (+15.92%) | 133.30 x 1e6 | 135.96 x 1e6 (113.65%) | 230.81 x 1e6 | 1.95 x 1e6 | 11.74 x 1e3 | 232.77 x 1e6 |
| rag_pipeline::pipeline::query_warm cargo_resolver:resolver_index() | 232.96 x 1e6 | 149.14 x 1e6 (-9.07%) | 164.02 x 1e6 | 167.30 x 1e6 (89.14%) | 223.15 x 1e6 | 1.87 x 1e6 | 13.20 x 1e3 | 225.03 x 1e6 |
| Branch | feat/scip-python-oracle |
| Testbed | ubuntu-latest |
| Benchmark | Latency (Δ%) | Baseline | Upper boundary (limit %) |
|---|---|---|---|
| index_time/full_rebuild_cargo | 4.84 s (+30.31%) | 3.72 s | 5.77 s (84.00%) |
- Route surfacing through ALL backends, not just rust-analyzer (P2): graph read paths hardcoded OracleTool::RustAnalyzer, so scip-python (and scip-clang) verdicts never surfaced as the Compiler tier in find_callers/trace_callees or compare_graph_to_scip. New oracle::latest_runs_in_scope enumerates every tool with a run in the checkout; enrich_hops_with_oracle merges their verdicts and compare_graph_to_scip aggregates across them (verdict sets are disjoint by edge language). Regression test: a scip-clang verdict now surfaces + the report names the contributing tool.
- Pin scip-python --project-version _ (P2): it otherwise defaults to the git revision, which is embedded in every SCIP symbol, so every commit churned all Python monikers and broke moniker-anchored memory relocation. The constant keeps monikers stable across commits (and fixes the crash on a non-git checkout, verified e2e).
- Gate background auto-run by configured language (P2): the MCP auto-run loop iterated OracleTool::ALL without checking config.targets, so scip-python installed + Python not indexed would invoke the Python indexer every poll, fail, swallow the error, record no run row, and retry forever. Now skips a backend whose language the checkout doesn't index (status registry stays broad).
Addressed in 84cc401.
- P2: Route scip-python through graph surfacing ✅ Fixed (and it covered the pre-existing scip-clang gap too).
- P2: Pin scip-python project version ✅ Fixed.
- P2: Filter auto-run tools by configured language ✅ Fixed. The MCP background auto-run loop now skips any backend whose languages aren't in
💡 Codex Review
In a mixed-language checkout, this aggregate check only warns when every backend has zero current verdicts. If, for example, rust-analyzer has valid verdicts but the new scip-python run produced 0 in-scope rows because the Python indexer saw the wrong paths or all Python files drifted, summary.verdicts_examined is still nonzero and compare_graph_to_scip silently reports no issue for the empty Python backend. Track the per-tool comparison count and warn for each run that contributes 0 rows so one healthy backend does not mask a broken one.
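A per-run check along these lines would avoid the masking. This is a minimal sketch: `ToolRun`, its fields, and the warning text are hypothetical stand-ins for whatever `compare_graph_to_scip` tracks per backend.

```rust
/// Per-backend comparison summary (hypothetical shape for this sketch).
pub struct ToolRun {
    pub tool: String,
    pub verdicts_examined: usize,
}

/// Emit one warning per run that contributed no in-scope verdicts, instead of
/// warning only when the aggregate across all backends is zero.
pub fn empty_backend_warnings(runs: &[ToolRun]) -> Vec<String> {
    runs.iter()
        .filter(|run| run.verdicts_examined == 0)
        .map(|run| format!("{}: run contributed 0 in-scope verdicts", run.tool))
        .collect()
}
```

With a healthy rust-analyzer run and an empty scip-python run, the aggregate count is nonzero but the Python backend still gets its own warning.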
| /// `scip-python index` — Python (#164 B6). Resolves imports against the project's INSTALLED | ||
| /// dependencies, so the corpus must install them (a venv) first; an unresolved environment | ||
| /// shows up as a near-zero moniker count the report's health gate catches. | ||
| ScipPython, |
Count Python constructor references in recall
Enabling ScipPython sends Python runs through the existing recall-gap code, but count_uncovered_calls only considers SCIP symbols ending in `).` (the ones scip::symbol_is_callable treats as callable). scip-python represents class constructor call references such as ClientSession() as references to the class/type symbol (...ClientSession#), while our Python extractor explicitly emits these as CallsName, so constructor calls that the heuristic misses never increment oracle_only_calls; this inflates Python recall and can make the new py-requests oracle report look healthier than it is. Please make the callable filter language/tool-aware before treating scip-python reports as compiler-grade metrics.
…176) Codex follow-up: the recall numerator (covered_calls) counted a calls_name edge matched to ANY SCIP occurrence, while the denominator's uncovered side requires scip::symbol_is_callable (a `).` method/function symbol). scip-python records a constructor call (`Foo()`, which our extractor emits as CallsName) as a reference to the CLASS symbol (`Foo#`), so a covered constructor counted while a MISSED one — invisible to the callable-filtered uncovered side — never offset it, inflating Python recall. Fix: the covered side now also requires the matched symbol be callable, so both sides measure the same population (`).` method/function calls). Constructors are out of recall entirely because SCIP can't distinguish a constructor call from a type annotation of the same `Foo#` symbol. Rust/C are unaffected (their calls_name edges already match `).` symbols). Regression test added.
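The symmetric population can be sketched as follows. This is an illustration under stated assumptions: `symbol_is_callable` here only mirrors the `).` suffix rule described above, and `CallSite` and `recall` are hypothetical names, not the repository's types.

```rust
/// Mirrors the rule described above: a callable SCIP symbol ends in `).`.
pub fn symbol_is_callable(symbol: &str) -> bool {
    symbol.ends_with(").")
}

/// One oracle-side call occurrence and whether the heuristic graph has a
/// matching calls_name edge for it (hypothetical shape for this sketch).
pub struct CallSite<'a> {
    pub symbol: &'a str,
    pub covered: bool,
}

/// Recall over callable symbols only. Both the covered and the uncovered side
/// pass through the same filter, so class-symbol references (`Foo#`) are
/// excluded from numerator and denominator alike.
pub fn recall(sites: &[CallSite<'_>]) -> Option<f64> {
    let (mut covered, mut total) = (0usize, 0usize);
    for site in sites.iter().filter(|s| symbol_is_callable(s.symbol)) {
        total += 1;
        if site.covered {
            covered += 1;
        }
    }
    // No callable occurrences means recall is undefined rather than 0 or 1.
    if total == 0 {
        None
    } else {
        Some(covered as f64 / total as f64)
    }
}
```

A covered constructor reference such as `ClientSession#` no longer raises the numerator, which is the inflation the review comment pointed out.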
P2: Count Python constructor references in recall ✅ Addressed in 40d12d2 by making recall's two sides use the same callable population, which fixes the inflation you identified. Root cause confirmed: The covered side now also requires
Part of the multi-language SCIP-oracle runner epic (#164). Adds the B6 scip-python backend so the SCIP oracle resolves Python edges to the `Compiler` tier, unblocking `oracle report --corpus py-requests` (the Python corpus was already declared in `tools/oracle-corpora.toml` from #171, and #175 added the report command; this makes its tool runnable).
What
- `OracleTool::ScipPython` added to the oracle tool registry.
- Manifest entry: program `scip-python`, language `python`, install hint.
- Capability probe: `scip-python index --help` exits 0 (the analog of rust-analyzer's `scip --help` check); `--version` reports the tool version.
- Invocation: `scip-python index --project-name <root-basename> --cwd <root> --output <abs>`. The project name gives in-corpus symbols a non-empty moniker package (what lets `count_symbols_with_moniker` see them); `--cwd` is where scip-python resolves the project + its installed deps.
- No `compile_commands.json`-style prerequisite: installing deps is the corpus `prepare` step's job, and a failed environment shows up as a near-zero moniker count the report health gate already catches, so `prerequisite_blocked` is `None`.
- CLI `--tool scip-python` + exhaustive enum coverage; argv unit test.
The SCIP occurrence→edge join is language-agnostic (keyed on document paths + positions), so no join changes were needed.
Operational notes (documented in `docs/oracle.md`)
- scip-python derives its default `--project-version` from the git revision, so it expects a git checkout. Real corpora are cloned at a pinned rev, so this is fine; a non-git dir makes it crash on an undefined version.
Verification
- `scip_python_indexes_a_cwd_with_a_project_name`; the existing exhaustive `every_tool_has_a_manifest_entry` now covers the third variant.
- `oracle status --tool scip-python` → `available 0.6.6`.
- `cargo +nightly fmt --check`, `cargo clippy --all-targets`, and the oracle core + cli suites are green.
Stacking
Independent of #175 (C2-CLI) at the code level: both branch off main and touch disjoint code. Either can merge first; once both land, `oracle report --corpus py-requests` is fully runnable.