Add tests for ModelCapabilities InferFromName and fuzzy prefix matching#775
Add tests for ModelCapabilities InferFromName and fuzzy prefix matching#775
Conversation
- Add [Theory][InlineData] tests for InferFromName covering all 7 pattern branches: opus, sonnet, haiku, gemini family; codex, mini, max variants - Add fuzzy prefix matching tests for GetCapabilities: versioned slugs like 'claude-opus-4.6-1m' and 'claude-sonnet-4.5-preview' resolve to registry entries; 'gpt-4.1-preview' unambiguously matches 'gpt-4.1' - Add GetStrengths tests for all four code paths: registry hit, fuzzy match, inferred (returns 'Inferred: ...' prefix with content), and fully unknown (returns 'Unknown model') - Document the 'gemini'/'mini' substring quirk in InferFromName via test comment Co-authored-by: Copilot <[email protected]>
🟢 Design-level finding — Near-duplicate coverage with
|
There was a problem hiding this comment.
Good test coverage for InferFromName, fuzzy prefix matching, and GetStrengths — all 39 tests correctly reflect actual production behavior. Three findings below (0 critical, 1 moderate, 2 minor).
Generated by Expert Code Review (auto) for issue #775 · ● 5.9M
| [Theory] | ||
| [InlineData("claude-sonnet-4.5-20250514")] | ||
| [InlineData("claude-sonnet-4.5-preview")] | ||
| public void GetCapabilities_FuzzyPrefix_SonnetVariant_ReturnsSonnetFlags(string slug) | ||
| { | ||
| var caps = ModelCapabilities.GetCapabilities(slug); | ||
| // claude-sonnet-4.5 registry entry: CodeExpert|ToolUse|Fast | ||
| Assert.True(caps.HasFlag(ModelCapability.CodeExpert)); | ||
| Assert.True(caps.HasFlag(ModelCapability.ToolUse)); | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| } |
There was a problem hiding this comment.
🟢 MINOR — Fuzzy-match test relies on non-deterministic Dictionary iteration order
"claude-sonnet-4.5-20250514".StartsWith("claude-sonnet-4") is true, so this slug matches both the "claude-sonnet-4.5" and "claude-sonnet-4" registry keys in the foreach loop. The code returns whichever the Dictionary iterator yields first — which isn't guaranteed across .NET versions.
Both entries have identical caps today (CodeExpert | ToolUse | Fast), so this is safe for now. If they ever diverge, this test becomes flaky.
Suggestion: Add a comment noting the assumption:
// Both claude-sonnet-4.5 and claude-sonnet-4 share identical caps today;
// if they diverge, fuzzy matching will need longest-prefix-wins logic.| } | ||
|
|
||
| [Fact] | ||
| public void GetCapabilities_FuzzyPrefix_GptMiniVariant_ReturnsGpt4Flags() |
There was a problem hiding this comment.
🟢 MINOR — Misleading test method name
This tests "gpt-4.1-preview" fuzzy-matching to "gpt-4.1", but the name says "GptMiniVariant" — there's no "mini" in the slug. Suggest renaming to GetCapabilities_FuzzyPrefix_GptPreviewVariant_ReturnsGpt41Flags.
| // Note: "gemini" contains "mini" as a substring, so the mini-variant logic also fires, | ||
| // adding Fast|CostEfficient. This is expected production code behaviour. | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| Assert.True(caps.HasFlag(ModelCapability.CostEfficient)); |
There was a problem hiding this comment.
🟡 MODERATE — Test cements a production code bug as "expected behaviour"
InferFromName (ModelCapabilities.cs:80) uses lower.Contains("mini"), which falsely matches "ge**mini**". Calling this "expected production code behaviour" discourages fixing it, and any future fix to use word-boundary matching would break these assertions.
Suggestion: Reword the comment to acknowledge this as a known limitation:
// Known issue: "gemini" contains "mini" as a substring, causing the mini-variant
// logic to also fire. If InferFromName is updated to use word-boundary matching,
// these two asserts should be removed.| // Note: "gemini" contains "mini" as a substring, so the mini-variant logic also fires, | ||
| // adding Fast|CostEfficient. This is expected production code behaviour. | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| Assert.True(caps.HasFlag(ModelCapability.CostEfficient)); |
There was a problem hiding this comment.
🟡 MODERATE — Test locks in a substring-collision defect as "expected behaviour"
"gemini" contains "mini" at positions 2–5, so the production code's lower.Contains("mini") check in InferFromName (ModelCapabilities.cs:80) fires on every Gemini-family slug. This incorrectly tags premium Gemini models with Fast | CostEfficient — flags that semantically mean "cheap/lightweight."
The test comment calls this "expected production code behaviour," which effectively approves the defect and will prevent future maintainers from recognising it as a bug. Any novel Gemini model inferred from its name will be misclassified, potentially skewing role-assignment logic in GetRoleWarnings.
Suggestion: Reword the test comment to acknowledge this as a known defect (e.g., // KNOWN BUG: "gemini" contains "mini" substring — see #<issue>), and consider a production fix using word-boundary matching (e.g., Regex.IsMatch(lower, @"\bmini\b") or checking for -mini/_mini tokens).
Flagged by: 3/3 reviewers
| } | ||
|
|
||
| [Fact] | ||
| public void GetCapabilities_FuzzyPrefix_GptMiniVariant_ReturnsGpt4Flags() |
There was a problem hiding this comment.
🟢 MINOR — Misleading test name
GetCapabilities_FuzzyPrefix_GptMiniVariant_ReturnsGpt4Flags tests "gpt-4.1-preview", which is a preview variant, not a mini variant. The name will mislead future maintainers into thinking this covers mini-model behaviour.
Suggestion: Rename to GetCapabilities_FuzzyPrefix_GptPreviewVariant_ReturnsGpt41Flags (or similar) to match the actual input slug.
Flagged by: 2/3 reviewers
| [Theory] | ||
| [InlineData("claude-sonnet-4.5-20250514")] | ||
| [InlineData("claude-sonnet-4.5-preview")] | ||
| public void GetCapabilities_FuzzyPrefix_SonnetVariant_ReturnsSonnetFlags(string slug) |
There was a problem hiding this comment.
🟢 MINOR — Fuzzy-prefix test relies on Dictionary iteration order
"claude-sonnet-4.5-preview" prefix-matches two registry keys: "claude-sonnet-4" and "claude-sonnet-4.5". The foreach over the internal Dictionary<string, ...> has no guaranteed iteration order in .NET. This test passes today because both keys happen to share identical capability flags (CodeExpert | ToolUse | Fast), but if either entry is updated independently, this test could become non-deterministically fragile.
Suggestion: Either use test inputs that unambiguously match a single registry key, or add a comment documenting that correctness depends on both keys sharing the same flags.
Flagged by: 2/3 reviewers
|
/review |
2 similar comments
|
/review |
|
/review |
|
✅ Expert Code Review completed successfully! |
There was a problem hiding this comment.
Expert Code Review — Adversarial Consensus
Good test coverage for InferFromName, fuzzy prefix matching, and GetStrengths. The 39 new tests correctly reflect production behavior and follow the project's conventions. Three new findings below (supplementing the existing review's findings on gemini/mini collision, GptMiniVariant naming, and Dictionary iteration order).
Findings
| # | Severity | Consensus | File | Finding |
|---|---|---|---|---|
| 1 | 🟡 MODERATE | 3/3 reviewers | SessionOrganizationTests.cs:2554 |
GetStrengths_FuzzyMatchedModels assertions don't verify correct registry mapping — fuzzy result should be compared to canonical exact-slug result |
| 2 | 🟢 MINOR | 3/3 reviewers | SessionOrganizationTests.cs:2451 |
Variant-only tests (Codex, Mini, Max) use HasFlag instead of Assert.Equal — can't detect spurious flags |
| 3 | 🟢 MINOR | 2/3 reviewers | SessionOrganizationTests.cs:2544 |
GetStrengths_KnownModels uses only negative assertions — can't catch wrong-entry regressions (1 reviewer argued positive assertions couple to copy edits) |
Discarded (single reviewer, no follow-up consensus): Gemini test method name omits 2 of 5 validated flags.
Assessment
- Test coverage: Comprehensive — all
InferFromNamefamily/variant branches, fuzzy prefix matching, and all 4GetStrengthspaths are covered - Test correctness: All tests accurately reflect current production behavior
- CI status: PR description reports all 39 tests pass, build clean
- Risk: Low — test-only change, no production code modified
Methodology: 3 independent reviewers with adversarial consensus (initial review + targeted follow-ups for disputed findings)
Warning
⚠️ Firewall blocked 1 domain
The following domain was blocked by the firewall during workflow execution:
api.nuget.org
To allow these domains, add them to the network.allowed list in your workflow frontmatter:
network:
allowed:
- defaults
- "api.nuget.org"See Network Configuration for more information.
Generated by Expert Code Review for issue #775 · ● 10.2M
| public void InferFromName_CodexVariant_AddsCodeExpert() | ||
| { | ||
| var caps = ModelCapabilities.InferFromName("new-model-codex"); | ||
| Assert.True(caps.HasFlag(ModelCapability.CodeExpert)); |
There was a problem hiding this comment.
🟢 MINOR — Variant-only tests use HasFlag instead of Assert.Equal (3/3 reviewers)
These inputs ("new-model-codex", "new-model-mini", "new-model-max") don't match any family branch, so the expected flags are fully deterministic (e.g., codex → exactly CodeExpert). HasFlag can't detect regressions that add spurious flags — for example, if a code change accidentally added ToolUse to codex variants, this test would still pass.
Suggestion: Use Assert.Equal to match the pattern of the family tests:
Assert.Equal(ModelCapability.CodeExpert, caps);Same for the mini and max variant tests.
| Assert.NotEmpty(strengths); | ||
| Assert.DoesNotContain("Unknown", strengths, StringComparison.OrdinalIgnoreCase); | ||
| Assert.DoesNotContain("Inferred", strengths, StringComparison.OrdinalIgnoreCase); |
There was a problem hiding this comment.
🟢 MINOR — Known-model strengths test uses only negative assertions (2/3 reviewers)
NotEmpty + DoesNotContain("Unknown") + DoesNotContain("Inferred") confirms the registry-match path was taken, but can't catch a regression where GetStrengths returns the wrong registry entry's text. One reviewer disagreed, noting that adding positive assertions on prose content couples tests to copy edits — a reasonable tradeoff concern.
Suggestion (optional): If desired, add one keyword per model: e.g., Assert.Contains("reasoning", ...) for opus, Assert.Contains("cost", ...) for haiku. Alternatively, accept the current level as sufficient for a "path coverage" test.
| var strengths = ModelCapabilities.GetStrengths(slug); | ||
| Assert.NotEmpty(strengths); | ||
| Assert.DoesNotContain("Unknown", strengths, StringComparison.OrdinalIgnoreCase); | ||
| Assert.DoesNotContain("Inferred", strengths, StringComparison.OrdinalIgnoreCase); |
There was a problem hiding this comment.
🟡 MODERATE — Fuzzy-match strengths test doesn't verify correct registry mapping (3/3 reviewers)
These negative-only assertions (NotEmpty, DoesNotContain("Unknown"), DoesNotContain("Inferred")) prove the result didn't fall through to the inferred/unknown fallback paths, but can't verify which registry entry was matched. If the fuzzy loop's iteration order changes and "claude-opus-4.6-1m" resolves to a different model's strengths, this test still passes.
Suggestion: Pin the fuzzy result to its expected canonical entry:
Assert.Equal(ModelCapabilities.GetStrengths("claude-opus-4.6"), strengths);
// (and similarly for "claude-sonnet-4.5-preview" → "claude-sonnet-4.5")This mirrors the pattern already used in GetCapabilities_FuzzyPrefix_MatchesSameAsExact.
|
/review |
1 similar comment
|
/review |
|
✅ Expert Code Review completed successfully! |
This comment has been minimized.
This comment has been minimized.
|
/review |
|
✅ Expert Code Review completed successfully! |
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Found 2 new findings (both 🟡) beyond the 6 already-known items:
-
🟡 Sonnet fuzzy-prefix test is a false positive —
GetCapabilities_FuzzyPrefix_SonnetVariant_ReturnsSonnetFlagspasses even if the fuzzy-matching loop is deleted, becauseInferFromNamereturns identical flags (CodeExpert|ToolUse|Fast) to the sonnet registry entries. The opus fuzzy tests don't have this problem (LargeContextdistinguishes the paths). A haiku variant test would provide real fuzzy-path coverage. -
🟡 Combined family+variant test is tautological —
InferFromName_SonnetCodexCombinedcan't detect whether the codex branch fires, because sonnet already providesCodeExpert. Using a haiku+codex combination would make the codex contribution observable.
No 🔴 bugs or regressions found. The remaining tests (opus fuzzy, InferFromName families, GetStrengths paths, unknown model) are well-constructed.
Generated by Expert Code Review for issue #775 · ● 11.8M
| Assert.True(caps.HasFlag(ModelCapability.ToolUse)); | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| } | ||
|
|
||
| [Fact] | ||
| public void GetCapabilities_FuzzyPrefix_GptMiniVariant_ReturnsGpt4Flags() | ||
| { | ||
| // "gpt-4.1" has Fast|CostEfficient|ToolUse; "gpt-4.1-preview" unambiguously fuzzy-matches it |
There was a problem hiding this comment.
🟡 False-positive test — cannot detect fuzzy-matching regressions
Both sonnet registry entries (claude-sonnet-4.5, claude-sonnet-4) have flags CodeExpert|ToolUse|Fast. InferFromName for any slug containing "sonnet" also returns CodeExpert|ToolUse|Fast (line 74 of ModelCapabilities.cs). The flag sets are identical.
This means if the entire fuzzy-prefix loop in GetCapabilities (lines 54-57) were deleted, these tests would still pass — InferFromName produces the exact same result. Compare with the opus fuzzy tests above, which do provide real coverage because LargeContext is in the registry entry but NOT in InferFromName's opus output.
Concrete failing scenario: Delete the foreach fuzzy loop from GetCapabilities → opus fuzzy tests fail (good), sonnet fuzzy tests still pass (bad — zero coverage of the code path they claim to test).
Suggested fix: Replace with a model whose registry entry diverges from InferFromName, or add an Assert.Equal against the exact registry value paired with a separate test proving InferFromName returns a different value. Alternatively, test a haiku variant (registry has ToolUse, InferFromName doesn't):
[Fact]
public void GetCapabilities_FuzzyPrefix_HaikuVariant_ReturnsRegistryFlags()
{
var caps = ModelCapabilities.GetCapabilities("claude-haiku-4.5-fast");
// Registry: Fast|CostEfficient|ToolUse; InferFromName: Fast|CostEfficient (no ToolUse)
Assert.True(caps.HasFlag(ModelCapability.ToolUse)); // only passes via fuzzy
}| { | ||
| // sonnet family gives CodeExpert|ToolUse|Fast; codex adds CodeExpert (idempotent via flags) | ||
| var caps = ModelCapabilities.InferFromName("sonnet-codex-experimental"); | ||
| Assert.True(caps.HasFlag(ModelCapability.CodeExpert)); | ||
| Assert.True(caps.HasFlag(ModelCapability.ToolUse)); | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| } | ||
|
|
||
| [Fact] |
There was a problem hiding this comment.
🟡 Tautological test — codex branch is unobservable
The sonnet family already provides CodeExpert (line 74 of ModelCapabilities.cs), and CodeExpert is the only flag the codex variant adds (line 79). So the |= from the codex branch is a no-op here — the test result is identical whether or not the codex variant code exists.
Concrete failing scenario: Delete line 79 (if (lower.Contains("codex")) caps |= ModelCapability.CodeExpert;) → this test still passes, because sonnet already supplied CodeExpert|ToolUse|Fast.
Suggested fix: Use a family that does NOT include CodeExpert, so the codex variant's contribution is observable:
[Fact]
public void InferFromName_HaikuCodexCombined_AddsBothFamilyAndVariantFlags()
{
// haiku → Fast|CostEfficient; codex → |= CodeExpert (observable addition)
var caps = ModelCapabilities.InferFromName("haiku-codex-v2");
Assert.True(caps.HasFlag(ModelCapability.Fast));
Assert.True(caps.HasFlag(ModelCapability.CostEfficient));
Assert.True(caps.HasFlag(ModelCapability.CodeExpert)); // fails if codex branch removed
}|
/review |
|
✅ Expert Code Review completed successfully! |
Expert Code Review — PR #775Methodology: 3 independent reviewers with adversarial consensus (2 follow-up rounds for disputed findings) Findings
Discarded Findings
CI Status⚪ Pending — no check runs reported yet. Test Coverage AssessmentThis is a test-only PR adding 39 new Overall AssessmentClean test-only PR with good coverage. The two moderate findings are about tests encoding existing production quirks (gemini/mini substring match, haiku missing ToolUse) as expected behavior without tracking comments — worth addressing to prevent future confusion, but not blocking.
Warning
|
There was a problem hiding this comment.
Reviewed the full production code in ModelCapabilities.cs and all 19 new tests. One finding:
🟡 Gemini "mini" substring contamination locked in by test — InferFromName line 80 uses Contains("mini") which matches the substring inside "gemini", incorrectly adding Fast|CostEfficient to all Gemini models. The test asserts these flags as "expected". Fix: use Contains("-mini") in production code and update the test accordingly.
No other regressions, bugs, security, data-loss, race conditions, or test-correctness issues found. All other assertions are accurate and robust against Dictionary iteration order.
Warning
⚠️ Firewall blocked 2 domains
The following domains were blocked by the firewall during workflow execution:
api.nuget.orgdc.services.visualstudio.com
To allow these domains, add them to the network.allowed list in your workflow frontmatter:
network:
allowed:
- defaults
- "api.nuget.org"
- "dc.services.visualstudio.com"See Network Configuration for more information.
Generated by Expert Code Review for issue #775 · ● 12.9M
| // adding Fast|CostEfficient. This is expected production code behaviour. | ||
| Assert.True(caps.HasFlag(ModelCapability.Fast)); | ||
| Assert.True(caps.HasFlag(ModelCapability.CostEfficient)); | ||
| } |
There was a problem hiding this comment.
🟡 Test locks in a production bug — "gemini" contains "mini" as a substring
The production code at ModelCapabilities.cs:80 uses lower.Contains("mini"), which matches the "mini" inside "ge**mini**". This causes every Gemini model inferred via InferFromName to incorrectly receive Fast | CostEfficient.
This test asserts those incorrect flags and documents it as "expected production code behaviour", which locks in the bug — any future fix to the variant matching would break these assertions.
Concrete impact: GetStrengths("gemini-4-pro") returns "Inferred: reasoning, fast, cost-efficient, multimodal, large context", mislabeling a hypothetical premium model as "fast, cost-efficient".
Suggested fix — in ModelCapabilities.cs line 80, change:
if (lower.Contains("mini")) caps |= ...to:
if (lower.Contains("-mini")) caps |= ...All real "mini" variant slugs use a -mini suffix (gpt-5-mini, gpt-5.1-codex-mini). Then update this test to not assert Fast/CostEfficient for Gemini models.
Expert Code Review — PR #775Methodology: 3 independent reviewers with adversarial consensus (+ 2 targeted follow-ups for 1 disputed finding) 3 findings posted as inline comments (1 🟡 moderate, 2 🟢 minor)
Assessment
|
There was a problem hiding this comment.
One genuine finding — the new GetStrengths tests for inferred models provide false confidence because ToolUse is inferred but never rendered in the strengths string.
GetStrengths silently drops ToolUse for all inferred models (Medium)
InferFromName sets ToolUse for opus (line 73) and sonnet (line 74), but the strengths rendering block (lines 100–106 of ModelCapabilities.cs) has no branch for ToolUse — it's the only capability flag without one. So GetStrengths("claude-opus-99") returns "Inferred: reasoning, code" with no mention of tool use, even though GetCapabilities("claude-opus-99") returns ReasoningExpert | CodeExpert | ToolUse.
The new test GetStrengths_InferredOpus_ContainsReasoningAndCode passes by checking "reasoning" and "code" but never catches the missing "tool" — adding Assert.Contains("tool", strengths, OrdinalIgnoreCase) would expose the bug.
Fix: add if (inferred.HasFlag(ModelCapability.ToolUse)) parts.Add("tool use"); to the rendering block, and add a "tool" assertion to the test.
Generated by Expert Code Review · ● 12.1M
|
|
||
| [Theory] | ||
| [InlineData("claude-sonnet-4.5-20250514")] | ||
| [InlineData("claude-sonnet-4.5-preview")] |
There was a problem hiding this comment.
🟡 MODERATE — Non-deterministic fuzzy match: claude-sonnet-4.5-preview matches two registry keys (2/3 reviewers)
"claude-sonnet-4.5-preview" satisfies StartsWith(key) for both "claude-sonnet-4.5" and "claude-sonnet-4" in the registry dictionary. Dictionary<> enumeration order is unspecified in .NET, so which key "wins" the fuzzy match is nondeterministic.
The test passes today only because both keys share identical flags (CodeExpert|ToolUse|Fast) and the same Strengths string. If either entry diverges in a future registry update, this test becomes unpredictably flaky.
The same ambiguity affects GetStrengths_FuzzyMatchedModels_ReturnRegistryStrengths (line 2551) which also uses "claude-sonnet-4.5-preview".
Fix: Use a slug that unambiguously matches exactly one registry key, or add an assertion that verifies the exact matched key.
| public void GetCapabilities_FuzzyPrefix_OpusVariant_ReturnsOpusFlags(string slug) | ||
| { | ||
| var caps = ModelCapabilities.GetCapabilities(slug); | ||
| // claude-opus-4.6 registry entry: ReasoningExpert|CodeExpert|ToolUse|LargeContext | ||
| Assert.True(caps.HasFlag(ModelCapability.ReasoningExpert)); |
There was a problem hiding this comment.
🟢 MINOR — Fuzzy-prefix tests use HasFlag instead of Assert.Equal (3/3 reviewers after follow-up)
All three fuzzy-prefix tests (OpusVariant, SonnetVariant, GptMiniVariant) use Assert.True(caps.HasFlag(...)), which only verifies minimum flags are present. If GetCapabilities starts returning extra unintended flags for fuzzy-prefix slugs, these tests would silently pass.
Since fuzzy-prefix resolution is expected to mirror a specific registry entry exactly, Assert.Equal(expected, caps) is the stronger and more appropriate contract check — consistent with the InferFromName family tests above which already use Assert.Equal.
Note: GetCapabilities_FuzzyPrefix_MatchesSameAsExact (line 2526) already uses Assert.Equal for opus, which is good — extending this pattern to all fuzzy-prefix tests would close the gap.
| } | ||
|
|
||
| [Fact] | ||
| public void GetCapabilities_FuzzyPrefix_GptMiniVariant_ReturnsGpt4Flags() |
There was a problem hiding this comment.
🟢 MINOR — Misleading test name GptMiniVariant (2/3 reviewers)
"gpt-4.1-preview" contains no "mini" substring and no mini-variant logic fires — this test exercises the basic prefix fuzzy path (gpt-4.1-preview → gpt-4.1). The name will confuse future readers searching for mini-variant coverage.
Suggestion: Rename to something like GetCapabilities_FuzzyPrefix_Gpt41Preview_ReturnsGpt41Flags.
There was a problem hiding this comment.
Expert Code Review: 1 finding posted inline (minor). See summary comment for details.
Generated by Expert Code Review · ● 2.7M
| Assert.True(caps.HasFlag(ModelCapability.CostEfficient)); | ||
| } | ||
|
|
||
| [Fact] | ||
| public void InferFromName_CodexVariant_AddsCodeExpert() | ||
| { |
There was a problem hiding this comment.
🟢 MINOR · 2/3 reviewers
Enshrining a production quirk as required test behavior. These assertions lock in the "gemini" → "mini" substring false-positive as expected, which means a future fix to use word-boundary matching in InferFromName would break these tests.
Suggestion: Consider splitting the gemini assertions — keep ReasoningExpert | LargeContext | Vision as the required core, and assert Fast | CostEfficient separately with a comment like // TODO: Known quirk — "gemini" contains "mini". Remove when InferFromName uses word-boundary matching. This way a future fix only needs to update the quirk assertion, not the core gemini test.
|
/review |
Summary
Adds comprehensive
[Theory]tests toModelCapabilitiesTestsinSessionOrganizationTests.cscovering three previously untested methods.Coverage added
InferFromName(7 branches):ReasoningExpert | CodeExpert | ToolUseCodeExpert | ToolUse | FastFast | CostEfficientReasoningExpert | LargeContext | Vision(+Fast | CostEfficientbecause "gemini" contains "mini" — documented in test comment)CodeExpertFast | CostEfficientReasoningExpertNoneGetCapabilitiesfuzzy prefix matching:claude-opus-4.6-1m,claude-opus-4.6-preview,claude-opus-4.6-extended→ resolve toclaude-opus-4.6flagsclaude-sonnet-4.5-preview,claude-sonnet-4.5-20250514→ resolve toclaude-sonnet-4.5flagsgpt-4.1-preview→ resolves togpt-4.1flags (unambiguous; no shorter registry key matches)GetStrengths(all 4 paths):"Inferred: ..."with content"Unknown model"Test results
All 39 new tests pass. Build clean.