skill-validator: restore 15K aggregate cap as the real Copilot CLI skill-menu budget#803
Evangelink wants to merge 4 commits into
…opilot CLI skill-menu budget

The per-plugin aggregate description cap had been raised 15,000 -> 20,000 -> 22,000 under the belief that 15K was 'a local repo policy, NOT a documented Copilot constraint'. That belief was wrong: the GitHub Copilot CLI renders the model-facing <available_skills> menu under a hard 15,000-char budget (the agent SDK's SKILL_CHAR_BUDGET, default 15e3, confirmed in CLI 1.0.36 and 1.0.61). Skills are listed alphabetically and emitted with their full <description> only until the budget is exhausted; every skill past the cut-off collapses to a bare name with no description and can no longer be reliably model-activated.

Raising the validator cap merely masked this silent menu truncation. For example, dotnet-test's run-tests and test-* skills stopped activating in plugin eval runs because they fell into the name-only overflow.

Changes:
- SkillProfiler.MaxAggregateDescriptionLength: 22,000 -> 15,000, with the comment rewritten to document the real Copilot CLI budget (and correct the prior 'not a documented constraint' claim).
- CheckCommand aggregate now excludes skills marked 'disable-model-invocation: true'. The CLI drops those from the menu, so they do not consume the budget. This makes the cap satisfiable by hiding reference / agent-orchestrated primitives rather than only by trimming.
- InvestigatingResults.md: document plugin-arm-only non-activation caused by skill-menu budget overflow, and how to fix it.

Note: dotnet-test currently exceeds 15K and must be slimmed below it (via disable-model-invocation on reference/primitive skills plus description trims) before this cap can go green repo-wide.

Co-authored-by: Copilot <[email protected]>
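The truncation behavior described in this commit message can be modelled roughly as below. This is a minimal Python sketch and not the CLI's actual code: the `Skill` type, the `render_menu` helper, and every skill name except `run-tests` are hypothetical, and it assumes each entry is charged only for its description length (the real budget may also count names and markup).

```python
from dataclasses import dataclass

SKILL_CHAR_BUDGET = 15_000  # default budget cited in this PR


@dataclass
class Skill:
    name: str
    description: str


def render_menu(skills, budget=SKILL_CHAR_BUDGET):
    """Split skills into (full, name_only) the way the PR describes the menu.

    Assumption: only description characters count against the budget.
    """
    full, name_only = [], []
    used = 0
    exhausted = False
    for skill in sorted(skills, key=lambda s: s.name):  # alphabetical by name
        cost = len(skill.description)
        if not exhausted and used + cost <= budget:
            used += cost
            full.append(skill.name)
        else:
            # Once the budget is hit, every later skill is name-only,
            # even if its own description would have been short enough.
            exhausted = True
            name_only.append(skill.name)
    return full, name_only


# Hypothetical plugin: description lengths are made up for illustration.
skills = [
    Skill("analyze-coverage", "a" * 6_000),
    Skill("migrate-framework", "b" * 6_000),
    Skill("run-tests", "c" * 6_000),
    Skill("test-filtering", "d" * 1_000),
]
full, name_only = render_menu(skills)
print("with description:", full)
print("name only:", name_only)
```

Under these assumptions the alphabetically late skills lose their descriptions regardless of how well those descriptions are written, which is why raising the validator cap could not help.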
Note This PR is from a fork and modifies infrastructure files ( Changes to infrastructure typically need to be submitted from a branch in Please consider recreating this PR from an upstream branch. If you don't have push access to |
Pull request overview
Aligns skill-validator’s per-plugin aggregate description cap with the Copilot CLI’s effective 15,000-character skill-menu budget, and updates validation/docs to prevent silent <available_skills> truncation from masking plugin-arm non-activation.
Changes:
- Restores `SkillProfiler.MaxAggregateDescriptionLength` to 15,000 and rewrites the rationale/commentary to reflect the Copilot CLI menu budget behavior.
- Updates `check` to exclude skills with `disable-model-invocation: true` from the aggregate description total (matching CLI menu behavior).
- Documents “plugin-arm-only non-activation due to menu overflow” troubleshooting steps in `InvestigatingResults.md`.
Show a summary per file
| File | Description |
|---|---|
| eng/skill-validator/src/docs/InvestigatingResults.md | Adds guidance for diagnosing plugin-only non-activation caused by Copilot CLI skill-menu budget overflow and suggests mitigations. |
| eng/skill-validator/src/Check/SkillProfiler.cs | Lowers the aggregate description cap to 15,000 and documents it as a Copilot CLI budget constraint. |
| eng/skill-validator/src/Check/CheckCommand.cs | Excludes disable-model-invocation: true skills from the aggregate description calculation during plugin checks. |
Copilot's findings
- Files reviewed: 3/3 changed files
- Comments generated: 2
@Evangelink : Can we re-create this from a branch in the repo please? |
…ion check

Address review: replace Regex.IsMatch(pattern-string) with a [GeneratedRegex] partial method (AOT-friendly, no per-call cache lookup), matching FrontmatterParser's style. Runs once per skill during checks.

Co-authored-by: Copilot <[email protected]>
👋 @Evangelink — this PR has 2 unresolved review thread(s). When you're ready, please address the feedback and push an update; the triage bot will pick up the next state automatically. (Add the |
Skill Validation Results
[1] (Isolated) Quality improved but weighted score is -2.1% due to: tokens (65782 → 85343), tool calls (5 → 6)
Model: claude-opus-4.6 | Judge: claude-opus-4.6
🔍 Full Results - additional metrics and failure investigation steps
▶ Sessions Visualisation - interactive replay of all evaluation sessions
✅ Approved by @AbhitejJohn. cc @dotnet/skills-merge-approvers — ready to merge. |
…ck-scalar false positives

The regex-based check matched any line in the frontmatter, so a block-scalar description that merely mentioned 'disable-model-invocation: true' on its own line was wrongly treated as disabling model invocation. Parse the frontmatter with the existing YAML deserializer (which correctly handles block scalars) by adding a DisableModelInvocation field to SkillFrontmatter, and drop the regex entirely.

Co-authored-by: Copilot <[email protected]>
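The false positive this commit fixes can be reproduced with a small stand-in. The sketch below is Python and uses only the standard library; the regex pattern is a hypothetical reconstruction (the original pattern is not shown in this thread), and `top_level_check` only imitates one property of a real YAML parser, namely that indented block-scalar lines are not mapping keys. The actual fix uses the existing YAML deserializer.

```python
import re

# Hypothetical frontmatter whose description merely mentions the flag.
FRONTMATTER = """\
name: example-skill
description: |
  Explains skill frontmatter flags, for example
  disable-model-invocation: true
  which hides a skill from the menu.
"""


def regex_check(frontmatter: str) -> bool:
    # Line-based match: fires on any line, including block-scalar content.
    pattern = r"^\s*disable-model-invocation:\s*true\s*$"
    return re.search(pattern, frontmatter, re.MULTILINE) is not None


def top_level_check(frontmatter: str) -> bool:
    # Only unindented "key: value" lines are treated as top-level keys;
    # indented lines (block-scalar content here) are skipped.
    for line in frontmatter.splitlines():
        if line[:1].isspace() or ":" not in line:
            continue
        key, _, value = line.partition(":")
        if key.strip() == "disable-model-invocation":
            return value.strip().lower() == "true"
    return False


print(regex_check(FRONTMATTER))      # the line-based check is fooled
print(top_level_check(FRONTMATTER))  # the structure-aware check is not
```

A structure-aware check still reports the flag when it appears as a genuine top-level key, which is the behavior the aggregate calculation depends on.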
/evaluate |
Why
The skill-validator's per-plugin aggregate description cap (`SkillProfiler.MaxAggregateDescriptionLength`) had been raised 15,000 → 20,000 → 22,000, justified by a code comment asserting that 15K was "a local repo policy, NOT a documented Copilot/agentskills constraint."

That assertion is wrong. The GitHub Copilot CLI renders the model-facing `<available_skills>` menu under a hard 15,000-character budget (the agent SDK's `SKILL_CHAR_BUDGET`, default `15e3`, confirmed in CLI 1.0.36 and 1.0.61). Skills are listed alphabetically by name and emitted with their full `<description>` only until the budget is exhausted; every skill past the cut-off collapses to a bare name with no description and can no longer be reliably model-activated.

Raising the validator cap didn't add headroom; it masked silent menu truncation. This is the root cause behind the `dotnet-test` plugin-arm activation failures (e.g. `run-tests`, `test-*`): they sit alphabetically late, fell into the name-only overflow, and never activated in plugin eval runs even though they activate fine in isolation. Description tuning can't fix that, because the description is never shown.

What
- `SkillProfiler.MaxAggregateDescriptionLength`: 22,000 → 15,000, with the comment rewritten to document the real Copilot CLI budget (and correct the prior claim).
- `CheckCommand` aggregate now excludes `disable-model-invocation: true` skills. The CLI drops those from the menu entirely, so they don't consume the budget. This makes the cap satisfiable by hiding reference / agent-orchestrated primitives rather than only by trimming descriptions.
- `InvestigatingResults.md`: documents plugin-arm-only non-activation caused by skill-menu budget overflow, and how to fix it.

`dotnet-test` currently aggregates ~20.7K chars (the only plugin over 15K), so `skill-check` will fail for it until it is slimmed below the cap, via `disable-model-invocation` on reference/primitive skills (see #800) plus description trims. This PR should merge once `dotnet-test` is ≤ 15K visible. All other plugins are already under the cap (next largest: `dotnet-msbuild` at ~14.5K).

Verification
- `skill-validator` builds clean (0 warnings).
- `disable-model-invocation` skills are excluded from the aggregate (flagging two reference skills dropped the reported total by exactly their description lengths).
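The aggregate check and the verification step above can be sketched as follows. This is an illustrative Python model, not the validator's C# implementation: the helper names, the dictionary shape, and all description lengths are hypothetical; only the 15,000 cap and the exclusion rule come from this PR.

```python
MAX_AGGREGATE_DESCRIPTION_LENGTH = 15_000  # cap restored by this PR


def aggregate_description_length(skills):
    """Sum description lengths of skills that would appear in the menu."""
    return sum(
        len(skill["description"])
        for skill in skills
        if not skill.get("disable-model-invocation", False)
    )


def check_plugin(skills, cap=MAX_AGGREGATE_DESCRIPTION_LENGTH):
    """Return (passes, total) for a plugin's visible skills."""
    total = aggregate_description_length(skills)
    return total <= cap, total


# Hypothetical plugin with made-up description lengths.
plugin = [
    {"name": "run-tests", "description": "r" * 7_000},
    {"name": "test-filtering", "description": "t" * 6_000},
    {"name": "reference-a", "description": "a" * 3_000},
    {"name": "reference-b", "description": "b" * 2_500},
]
before = check_plugin(plugin)

# Flag the two reference skills; they no longer count toward the total.
for skill in plugin:
    if skill["name"].startswith("reference-"):
        skill["disable-model-invocation"] = True
after = check_plugin(plugin)

print(before)  # (False, 18500)
print(after)   # (True, 13000)
```

In this toy data the total drops by 5,500 characters, exactly the combined description length of the two flagged skills, mirroring the verification result reported above.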