feat(resolve): Python from-import alias resolution scope (#174)#179
feat(resolve): Python from-import alias resolution scope (#174)#179skakri wants to merge 2 commits into
Conversation
`from m import T as A` then `A()` now resolves the alias use to the imported `T`, instead of staying unresolved or binding to an unrelated local `A`. Mirrors the Rust per-module use-scope machinery (#61) but REBINDS the alias to its in-corpus target rather than only flagging it external. - Extraction (python_import_target): a `from`-import `aliased_import` emits its Imports edge to the target (to_name=T, keeps the dependency) via file_edge_scoped, carrying the alias in `evidence` + a whole-file import scope. Module aliases (`import x as m`) are left alone (qualified-resolution problem). - ImportScope gains a python_aliases map (alias -> target, byte-scoped) + add_python_alias / python_alias_target; the import-edge load branches on language so Python imports register aliases instead of feeding add_use. - Resolution rebinds: before resolve_symbol, a non-import reference whose name is an in-scope alias resolves under the target name. Applied in BOTH the incremental and full-rebuild drivers (#61 both-driver parity). Verified e2e (alias resolves to the import target, not a decoy local) + unit test in resolve/tests.rs.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
SCIP oracle — resolution reportHeuristic→compiler edge resolution per corpus. Δ compares resolved-after to the
resolved = |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a9ff258aed
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if candidate.edge_kind == EdgeKind::Imports { | ||
| let evidence = arena.get_opt(candidate.evidence).unwrap_or(""); | ||
| import_scope.add_use(*file_id, evidence, candidate.import_scope_range()); | ||
| if index.file_language.get(file_id).copied() == Some(Language::Python.as_str()) { |
There was a problem hiding this comment.
Use file language when building full-rebuild aliases
In the full-rebuild path this checks index.file_language, but that map is populated only from files that contributed at least one symbol. A common Python entrypoint with only from models import User as Account and Account() has no symbol row for its own file, so this branch is skipped under index --full; the alias carrier feeds neither add_python_alias nor add_use, and the call remains unresolved or binds an unrelated Account. The incremental driver joins files.language, so the two paths diverge here.
Useful? React with 👍 / 👎.
| let scope = | ||
| ImportScopeRange { scope_start: 0, scope_end: text.len(), mod_id: MOD_FILE_ROOT }; |
There was a problem hiding this comment.
Don't give function-local Python imports file-wide scope
When a from ... import ... as ... appears inside a function or class body, Python binds the alias only in that local/class scope, but this records the alias as covering the entire file. In that scenario, any unrelated top-level or sibling-scope reference with the same name (including references before the import statement) is now rebound to the imported target and can override the correct local symbol. The scope needs to be derived from the containing Python scope/order, or non-module imports should be skipped.
Useful? React with 👍 / 👎.
| if let (Some(alias), Some(target)) = (evidence, to_name) { | ||
| import_scope.add_python_alias(file_id, alias, target, scope); |
There was a problem hiding this comment.
Preserve the imported module when rebinding aliases
This registers alias -> target using only the imported symbol's short name, discarding the from module that selected the target. If app.py contains from models import User as Account and also defines its own User, a later Account() is resolved as bare User; resolve_symbol can then take the same-file fallback and bind to app.py::User instead of models.py::User. The alias binding needs enough module/qualified context to disambiguate the imported target.
Useful? React with 👍 / 👎.
| import_scope | ||
| .python_alias_target(source_file_id, short_name(&to_name), ref_byte) | ||
| .unwrap_or(to_name.as_str()) |
There was a problem hiding this comment.
Stop rebinding aliases after same-module shadowing
This substitution is unconditional for every later edge whose short name matches the alias, so it also overrides Python's normal rebinding in the same module. For example, after from models import User as Account; class Account: ...; Account(), the class definition assigns Account in module scope, but the call is still resolved under User instead of the same-file Account. The alias range needs to end at, or be shadowed by, subsequent same-name bindings rather than applying to all references in the file.
Useful? React with 👍 / 👎.
|
| Branch | feat/python-import-alias-scope |
| Testbed | ubuntu-latest |
⚠️ WARNING: No Threshold found!Without a Threshold, no Alerts will ever be generated.
- Estimated Cycles (cycles)
- Instructions (instructions)
- L1 Hits (hits)
- LL Hits (hits)
- RAM Hits (hits)
- Total read+write (reads/writes)
Click here to create a new Threshold
For more information, see the Threshold documentation.
To only post results if a Threshold exists, set the--ci-only-thresholdsflag.
Click to view all benchmark results
| Benchmark | Estimated Cycles | cycles x 1e6 | Instructions | instructions x 1e6 | L1 Hits | hits x 1e6 | LL Hits | hits x 1e6 | RAM Hits | hits x 1e3 | Total read+write | reads/writes x 1e6 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| rag_pipeline::pipeline::index cargo_resolver:resolver_config() | 📈 view plot | 1,213.57 x 1e6 | 📈 view plot | 801.18 x 1e6 | 📈 view plot | 1,116.54 x 1e6 | 📈 view plot | 17.26 x 1e6 | 📈 view plot | 307.07 x 1e3 | 📈 view plot | 1,134.10 x 1e6 |
| rag_pipeline::pipeline::query_cold cargo_resolver:resolver_built_config() | 📈 view plot | 241.18 x 1e6 | 📈 view plot | 154.70 x 1e6 | 📈 view plot | 231.05 x 1e6 | 📈 view plot | 1.95 x 1e6 | 📈 view plot | 11.21 x 1e3 | 📈 view plot | 233.01 x 1e6 |
| rag_pipeline::pipeline::query_warm cargo_resolver:resolver_index() | 📈 view plot | 233.00 x 1e6 | 📈 view plot | 149.19 x 1e6 | 📈 view plot | 223.28 x 1e6 | 📈 view plot | 1.85 x 1e6 | 📈 view plot | 13.32 x 1e3 | 📈 view plot | 225.14 x 1e6 |
|
| Branch | feat/python-import-alias-scope |
| Testbed | ubuntu-latest |
Click to view all benchmark results
| Benchmark | Latency | Benchmark Result seconds (s) (Result Δ%) | Upper Boundary seconds (s) (Limit %) |
|---|---|---|---|
| index_time/full_rebuild_cargo | 📈 view plot 🚷 view threshold | 4.86 s(+29.74%)Baseline: 3.75 s | 5.79 s (84.03%) |
Four Codex P2s on the alias rebind, all from the blunt whole-file/short-name approach: - Use the DB file language, not index.file_language, when building aliases in the full-rebuild driver: index.file_language is populated only from files that contributed a symbol, so a symbol-less entrypoint (only `from m import X as A; A()`) was skipped under `index --full` and the alias never registered. Both drivers now key on the authoritative `files.language`. - Defer to a same-file definition: rebind `alias -> target` only when the file defines NEITHER the alias name NOR the target's bare name locally. A local `class Account` now shadows the alias; a local `User` isn't grabbed by the bare rewrite. New SymbolIndex::file_defines + python_alias_rebind. - Register only TOP-LEVEL from-imports: a `from … import … as …` nested in a function/class binds the alias locally, so giving it whole-file scope would rebind unrelated same-name refs. Extraction now records the alias only when the import's parent is the module. Tests: same-file shadow defers to the local class; e2e confirms a symbol-less entrypoint resolves the alias under --full and a local class shadows it. Full core suite (483) green.
|
All four addressed in 447e5e2. P2 — Use file language in the full-rebuild path ✅ The full-rebuild driver now builds a P2 — Stop rebinding after same-module shadowing & P2 — Preserve the imported module ✅ Both handled by deferring to a same-file definition: the rebind now fires only when the file defines neither the alias name nor the target's bare name locally (new P2 — Don't give function-local imports file-wide scope ✅ Extraction now records the alias only for top-level Full |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 447e5e2115
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| let scope = | ||
| ImportScopeRange { scope_start: 0, scope_end: text.len(), mod_id: MOD_FILE_ROOT }; |
There was a problem hiding this comment.
Start Python alias scopes at the import statement
Even for a top-level from ... import ... as ..., the alias is not bound before that statement executes, but this scope starts at byte 0. In a file with Account() before from models import User as Account, the earlier call is treated as covered by the alias and can be rebound to User, corrupting edges for pre-import references; the alias scope should begin after the import statement rather than covering the whole file from the start.
Useful? React with 👍 / 👎.
| if index.file_defines(file_id, name) || index.file_defines(file_id, short_name(target)) { | ||
| return None; |
There was a problem hiding this comment.
Respect import order when checking alias shadowing
This suppresses the alias rebind whenever the file defines the alias name anywhere, regardless of whether that definition occurs before the import. For class Account: ...; from models import User as Account; Account(), Python's import reassigns Account, but this check skips the rebind and normal resolution prefers the same-file class Account, producing the opposite edge; the shadowing check needs to consider binding order/range, not just file-wide existence.
Useful? React with 👍 / 👎.
| python_alias_rebind( | ||
| &import_scope, | ||
| &index, | ||
| source_file_id, | ||
| short_name(&to_name), | ||
| ref_byte, |
There was a problem hiding this comment.
Do not rebind qualified Python attribute references
This rebinds solely from the leaf to_name, so a qualified annotation such as other.Account is also treated as the local alias Account after from models import User as Account. Because Python type-reference extraction stores only the dotted tail in to_name, that scenario now binds other.Account to User instead of the Account symbol; the alias rewrite should be limited to unqualified references, e.g. by checking the original evidence/qualified context.
Useful? React with 👍 / 👎.
| // (#174 review), so recording it with whole-file scope would rebind | ||
| // unrelated same-name references; for a nested import emit the plain target | ||
| // edge (dependency captured) without the alias. | ||
| let record_alias = node.parent().map(|p| p.kind()) == Some("module"); |
There was a problem hiding this comment.
Recognize module-scope imports inside top-level blocks
This only records aliases whose direct parent is module, but top-level if/try blocks still bind imports in the module namespace. Common patterns like try: from ujson import loads as json_loads; except ImportError: ...; json_loads(...) have the import node under a block, so the alias carrier is downgraded to a plain import and the later call is left unresolved even though the binding is module-scoped.
Useful? React with 👍 / 👎.
| .iter() | ||
| .find(|alias| alias.covers(ref_byte)) | ||
| .map(|alias| alias.target.as_str()) |
There was a problem hiding this comment.
Pick the latest covering Python alias binding
When the same alias is imported more than once, every top-level alias is stored with a file-wide range and this returns the first covering binding. In from m1 import User as Account; from m2 import Customer as Account; Account(), the later import reassigns Account, but resolution still uses the first User binding; overlapping alias bindings need order-aware ranges or a latest-start/last-binding winner.
Useful? React with 👍 / 👎.
Fixes #174.
from models import User as AccountthenAccount()now resolves the alias use to the importedUser, instead of staying unresolved or binding to an unrelated localAccount.Approach
Mirrors the Rust per-module
use-scope machinery (#61), but where Rust'sis_external_importonly suppresses a local same-named symbol, Python needs to rebind the alias to its in-corpus target. Three layers:python_import_target, gated tofrom-imports): analiased_importemits its Imports edge to the target (to_name = User— keeps the in-corpus dependency) viafile_edge_scoped, carrying the alias inevidenceand a whole-fileImportScopeRange(Python imports are module/file-global). The edge is the alias carrier. Module aliases (import x as m) are left alone — they're a qualified-resolution problem, out of scope.ImportScope): a newpython_aliasesmap (alias → target, byte-scoped) withadd_python_alias/python_alias_target. The import-edge load branches onfiles.language, so Python imports register aliases instead of feedingadd_use(which parses Rustusetext).resolve_symbolcallers): before resolution, a non-import reference whose name is an in-scope alias resolves under the target name. The storedto_namestays the surface alias; onlyto_symbol_idbecomes the target's.Applied in both resolution drivers (incremental
resolve_all_edgesand the full-rebuildCompactEdgeloop) per the #61 both-driver-parity rule —index --fulluses the full-rebuild path.Verification
python_from_import_alias_rebinds_to_the_imported_target(resolve/tests.rs): with a targetmodels.Userand a decoy localother.Account, theAccount()edge binds toUser.models.py::User, not the decoy).rag-rat-coresuite green;cargo +nightly fmt --check+ clippy clean. The import target itself was already captured, so existing import/resolution tests are unaffected.Scope
Symbol aliases via
from x import T as Aonly; whole-file scope (a safe over-approximation). Module aliases (import x as m; m.f()) remain a follow-up — sibling of the already-closed #172.