resolver: implement URL template substitution + conditional data block#8
Merged
Merged
Conversation
Phase 2.5(a) of the testing-maturity initiative. Fixes two related resolver bugs surfaced by the conformance suite added in SecID-Client-SDK PR #4: 1. URL templates pass through unrendered ({id}, {num}, {parent}/{sub} placeholders reach clients literally instead of being substituted). 2. Every URL-bearing result carries an extra `data` block dumping internal registry metadata into the response. The new _substitute_url_template() supports two layered modes, matching SecID-Service Worker's behavior: Mode 1: explicit variables Child match_node's data.variables = {name: {extract: <regex>}}. The extract regex matches the captured input; group 1 (or group 0 if no groups) supplies the variable's value. Example: CWE-79 + extract '^CWE-(\d+)$' -> num=79. Mode 2: implicit {id} Defaults to the captured input unless explicitly defined. Layered on top of Mode 1, not exclusive — both can fire if a template uses both {id} and other named placeholders. Example: CVE-2021-44228 -> {id}=CVE-2021-44228. Unrecognized {placeholders} are left as-is so registry data bugs are visible rather than silently swallowed. The data-block fix matches the canonical Worker contract: - URL-bearing results: no `data` field (URL is the result) - Description-only results: `data` field present (no URL means `data` is the only meaningful content) Tests: python/test_smoke.py — 6 new unit tests covering each substitution mode plus the unrecognized-placeholder edge case. 19/19 pass. Conformance against the canonical Worker: Before this PR: 3/10 pass against Python Server-API After this PR: 7/10 pass Newly passing: - resolve-cve-2021-44228 (implicit {id} + no data field) - resolve-cwe-79 (explicit {num} via extract) - resolve-attck-t1059-003 (multiple explicit variables) - resolve-aiid-1 (implicit {id}) Still failing (Phase 2.5c+d — separate PRs): - resolve-cve-cross-source (bare-namespace cross-source search) - list-advisory-namespaces (bare-type discovery) - list-types (/api/v1/types endpoint missing) Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Phase 2.5(a) — fixes the two highest-impact bugs in the Python Server-API resolver, both surfaced by SecID-Client-SDK PR #4's conformance suite.
Bug 1: URL templates pass through unrendered
`{id}`, `{num}`, `{parent}`, `{sub}` placeholders reach clients literally instead of being substituted. Affected 4 conformance fixtures.
Bug 2: `data` block leaks on URL-bearing results
Every URL result dumped internal registry metadata (description, examples, schema info) into the response. The canonical Worker only includes `data` when there's no URL.
Implementation
New `_substitute_url_template(template, child_data, captured_input)` helper. Two layered modes mirroring SecID-Service:
Mode 1 — explicit variables:
```json
"variables": {
"num": { "extract": "^CWE-(\d+)$" }
}
```
Extract regex matched against input; group 1 supplies the value.
Mode 2 — implicit `{id}`:
Defaults to the captured input. Layered on top of Mode 1, not exclusive.
Unrecognized `{placeholders}` are left visible so registry data bugs are diagnosable.
`data` conditional: included only when there's no URL (canonical Worker contract).
Conformance progression
Newly passing:
Still failing (independent Phase 2.5 follow-up PRs):
Verification
```
$ pytest test_smoke.py -v
19/19 passed (13 prior + 6 new substitution unit tests)
$ python3 tests/conformance-harness/python/run.py --target http://localhost:8765
Results: 7 passed, 3 failed (out of 10)
```
New unit tests
🤖 Generated with Claude Code