skill-validator: restore 15K aggregate cap as the real Copilot CLI skill-menu budget by Evangelink · Pull Request #803 · dotnet/skills

Evangelink · 2026-06-22T15:37:06Z

Why

The skill-validator's per-plugin aggregate description cap (SkillProfiler.MaxAggregateDescriptionLength) had been raised 15,000 → 20,000 → 22,000, justified by a code comment asserting that 15K was "a local repo policy, NOT a documented Copilot/agentskills constraint."

That assertion is wrong. The GitHub Copilot CLI renders the model-facing <available_skills> menu under a hard 15,000-character budget (the agent SDK's SKILL_CHAR_BUDGET, default 15e3, confirmed in CLI 1.0.36 and 1.0.61). Skills are listed alphabetically by name and emitted with their full <description> only until the budget is exhausted; every skill past the cut-off collapses to a bare name with no description and can no longer be reliably model-activated.

Raising the validator cap didn't add headroom; it masked silent menu truncation. This is the root cause behind the dotnet-test plugin-arm activation failures (e.g. run-tests, test-*): they sit alphabetically late, fell into the name-only overflow, and never activated in plugin eval runs even though they activate fine in isolation. Description tuning can't fix that: the description is never shown.
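The truncation described above can be illustrated with a minimal sketch. It is not the CLI's code: the `Skill` type and `render_menu` function are hypothetical, and it assumes that only description characters are charged against the budget and that once one description does not fit, every later skill is emitted name-only. The real accounting in the agent SDK may differ.

```python
from dataclasses import dataclass

SKILL_CHAR_BUDGET = 15_000  # default value cited in the PR description


@dataclass
class Skill:
    name: str
    description: str


def render_menu(skills, budget=SKILL_CHAR_BUDGET):
    """Return (name, description_or_None) pairs in alphabetical order.

    Assumption: a skill keeps its description only while the running
    total of description characters stays within the budget; after the
    first overflow, all remaining skills are listed by name only.
    """
    entries = []
    used = 0
    overflowed = False
    for skill in sorted(skills, key=lambda s: s.name):
        cost = len(skill.description)
        if not overflowed and used + cost <= budget:
            used += cost
            entries.append((skill.name, skill.description))
        else:
            overflowed = True
            entries.append((skill.name, None))
    return entries
```

With three skills whose descriptions are 8,000, 6,000 and 4,000 characters long, the alphabetically last one loses its description even though it is the shortest, which mirrors why late-sorting skills such as run-tests are the ones that stop activating.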

What

SkillProfiler.MaxAggregateDescriptionLength: 22,000 → 15,000, with the comment rewritten to document the real Copilot CLI budget (and correct the prior claim).
Aggregate now excludes disable-model-invocation: true skills. The CLI drops those from the menu entirely, so they don't consume the budget. This makes the cap satisfiable by hiding reference / agent-orchestrated primitives rather than only by trimming descriptions.
InvestigatingResults.md: documents plugin-arm-only non-activation caused by skill-menu budget overflow, and how to fix it.
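The aggregate rule in the list above can be sketched as follows. This is an illustration in Python, not the validator's C# code: `visible_aggregate` and `check_plugin` are hypothetical names, and skills are modelled as plain dictionaries keyed by their frontmatter fields.

```python
MAX_AGGREGATE_DESCRIPTION_LENGTH = 15_000  # mirrors the restored cap


def visible_aggregate(skills):
    """Sum description lengths over skills that stay in the model-facing menu.

    Skills marked `disable-model-invocation: true` are skipped, because the
    PR states the CLI drops them from the menu entirely.
    """
    return sum(
        len(skill["description"])
        for skill in skills
        if not skill.get("disable-model-invocation", False)
    )


def check_plugin(skills, cap=MAX_AGGREGATE_DESCRIPTION_LENGTH):
    """Return (passes, total) for one plugin's skills."""
    total = visible_aggregate(skills)
    return total <= cap, total
```

Under this rule, flagging a reference skill lowers the reported total by exactly its description length, so a plugin can get under the cap without rewriting any description.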

⚠️ Sequencing

dotnet-test currently aggregates ~20.7K chars (the only plugin over 15K), so skill-check will fail for it until it is slimmed below the cap — via disable-model-invocation on reference/primitive skills (see #800) plus description trims. This PR should merge once dotnet-test is ≤ 15K visible. All other plugins are already under the cap (next largest: dotnet-msbuild at ~14.5K).

Verification

skill-validator builds clean (0 warnings).
Confirmed the cap is enforced and that disable-model-invocation skills are excluded from the aggregate (flagging two reference skills dropped the reported total by exactly their description lengths).

…opilot CLI skill-menu budget

The per-plugin aggregate description cap had been raised 15,000 -> 20,000 -> 22,000 under the belief that 15K was 'a local repo policy, NOT a documented Copilot constraint'.

That belief was wrong: the GitHub Copilot CLI renders the model-facing <available_skills> menu under a hard 15,000-char budget (the agent SDK's SKILL_CHAR_BUDGET, default 15e3, confirmed in CLI 1.0.36 and 1.0.61). Skills are listed alphabetically and emitted with their full <description> only until the budget is exhausted; every skill past the cut-off collapses to a bare name with no description and can no longer be reliably model-activated.

Raising the validator cap merely masked this silent menu truncation, e.g. dotnet-test's run-tests and test-* skills stopped activating in plugin eval runs because they fell into the name-only overflow.

Changes:
- SkillProfiler.MaxAggregateDescriptionLength: 22,000 -> 15,000, with the comment rewritten to document the real Copilot CLI budget (and correct the prior 'not a documented constraint' claim).
- CheckCommand aggregate now excludes skills marked 'disable-model-invocation: true'; the CLI drops those from the menu, so they do not consume the budget. This makes the cap satisfiable by hiding reference / agent-orchestrated primitives rather than only by trimming.
- InvestigatingResults.md: document plugin-arm-only non-activation caused by skill-menu budget overflow, and how to fix it.

Note: dotnet-test currently exceeds 15K and must be slimmed below it (via disable-model-invocation on reference/primitive skills plus description trims) before this cap can go green repo-wide.

Co-authored-by: Copilot <[email protected]>

github-actions · 2026-06-22T15:37:17Z

Note

This PR is from a fork and modifies infrastructure files (eng/ or .github/).

Changes to infrastructure typically need to be submitted from a branch in dotnet/skills (not a fork) so that CI workflows run with the correct permissions and secrets.

Please consider recreating this PR from an upstream branch. If you don't have push access to dotnet/skills, ask a maintainer to push your branch for you.

Copilot

Pull request overview

Aligns skill-validator’s per-plugin aggregate description cap with the Copilot CLI’s effective 15,000-character skill-menu budget, and updates validation/docs to prevent silent <available_skills> truncation from masking plugin-arm non-activation.

Changes:

Restores SkillProfiler.MaxAggregateDescriptionLength to 15,000 and rewrites the rationale/commentary to reflect the Copilot CLI menu budget behavior.
Updates check to exclude skills with disable-model-invocation: true from the aggregate description total (matching CLI menu behavior).
Documents “plugin-arm-only non-activation due to menu overflow” troubleshooting steps in InvestigatingResults.md.

Show a summary per file

File	Description
eng/skill-validator/src/docs/InvestigatingResults.md	Adds guidance for diagnosing plugin-only non-activation caused by Copilot CLI skill-menu budget overflow and suggests mitigations.
eng/skill-validator/src/Check/SkillProfiler.cs	Lowers the aggregate description cap to 15,000 and documents it as a Copilot CLI budget constraint.
eng/skill-validator/src/Check/CheckCommand.cs	Excludes `disable-model-invocation: true` skills from the aggregate description calculation during plugin checks.

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 2

AbhitejJohn · 2026-06-22T16:16:25Z

@Evangelink : Can we re-create this from a branch in the repo please?

…ion check

Address review: replace Regex.IsMatch(pattern-string) with a [GeneratedRegex] partial method (AOT-friendly, no per-call cache lookup), matching FrontmatterParser's style. Runs once per skill during checks.

Co-authored-by: Copilot <[email protected]>

github-actions · 2026-06-22T17:00:01Z

👋 @Evangelink — this PR has 2 unresolved review thread(s). When you're ready, please address the feedback and push an update; the triage bot will pick up the next state automatically. (Add the no-stale label to silence further pings.)

github-actions · 2026-06-22T18:44:19Z

Skill Validation Results

Skill	Scenario	Quality	Skills Loaded	Overfit	Verdict
system-text-json-net11	Serialize JSON in .NET 11 with PascalCase property names	4.0/5 → 5.0/5 🟢	✅ system-text-json-net11; tools: skill	✅ 0.06	❌ [1]
system-text-json-net11	Type-safe JsonTypeInfo access without exceptions in .NET 11	3.0/5 → 5.0/5 🟢	✅ system-text-json-net11; tools: skill, edit, view	✅ 0.06	✅
system-text-json-net11	Non-activation: camelCase JSON serialization on .NET 8	5.0/5 → 5.0/5	ℹ️ not activated (expected)	✅ 0.06	❌ [2]
optimizing-ef-core-queries	Optimize bulk operations with EF Core 7+ ExecuteUpdate and ExecuteDelete	5.0/5 → 5.0/5	⚠️ NOT ACTIVATED	🟡 0.22	❌

[1] (Isolated) Quality improved but weighted score is -2.1% due to: tokens (65782 → 85343), tool calls (5 → 6)
[2] (Isolated) Quality unchanged but weighted score is -16.2% due to: judgment, tokens (51994 → 80828), tool calls (4 → 6)

Model: claude-opus-4.6 | Judge: claude-opus-4.6

🔍 Full Results - additional metrics and failure investigation steps

To investigate failures, paste this to your AI coding agent:

For PR 803 in dotnet/skills, download eval artifacts with gh run download 27975446091 --repo dotnet/skills --pattern "skill-validator-results-*" --dir ./eval-results, then fetch https://raw.githubusercontent.com/dotnet/skills/78e9dda103845334f0ab7d390467ad30e744f360/eng/skill-validator/src/docs/InvestigatingResults.md and follow it to analyze the results.json files. Diagnose each failure, suggest fixes to the eval.yaml and skill content, and tell me what to fix first.

▶ Sessions Visualisation -- interactive replay of all evaluation sessions
📊 Session Analytics (preview) -- aggregated metrics across evaluation sessions

github-actions · 2026-06-22T22:00:33Z

✅ Approved by @AbhitejJohn. cc @dotnet/skills-merge-approvers — ready to merge.

Copilot

Copilot's findings

Files reviewed: 3/3 changed files
Comments generated: 1

…ck-scalar false positives

The regex-based check matched any line in the frontmatter, so a block-scalar description that merely mentioned 'disable-model-invocation: true' on its own line was wrongly treated as disabling model invocation. Parse the frontmatter with the existing YAML deserializer (which correctly handles block scalars) by adding a DisableModelInvocation field to SkillFrontmatter, and drop the regex entirely.

Co-authored-by: Copilot <[email protected]>
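The false positive described in this commit can be reproduced with a small sketch. The pattern below is an assumed stand-in for the regex that was removed (the actual pattern is not shown here), and `top_level_says_disabled` is a hypothetical, deliberately simplified substitute for a real YAML parse: it only treats unindented `key: value` lines as keys, which is enough to show why block-scalar content must not be matched.

```python
import re

# Frontmatter whose description is a block scalar that merely mentions the flag.
FRONTMATTER = """\
name: reference-skill
description: |
  Reference material only. To hide a skill from the menu, set
  disable-model-invocation: true
  in its frontmatter.
"""

# Assumed line-based pattern: matches the flag on any line, indented or not.
LINE_PATTERN = re.compile(
    r"^\s*disable-model-invocation:\s*true\s*$", re.MULTILINE
)


def regex_says_disabled(frontmatter):
    """Line-based check: true if any line looks like the flag."""
    return bool(LINE_PATTERN.search(frontmatter))


def top_level_says_disabled(frontmatter):
    """Simplified structural check: only unindented lines count as keys."""
    for line in frontmatter.splitlines():
        if line[:1].isspace() or ":" not in line:
            continue  # indented lines belong to a nested or block value
        key, _, value = line.partition(":")
        if key.strip() == "disable-model-invocation":
            return value.strip().lower() == "true"
    return False
```

On `FRONTMATTER`, the line-based check reports the skill as disabled while the structural check does not. A real YAML deserializer, as used in the fix, handles many more cases than this stand-in (quoting, flow mappings, comments).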

Evangelink · 2026-06-23T09:44:17Z

/evaluate

Copilot AI review requested due to automatic review settings June 22, 2026 15:37

Evangelink requested review from JanKrivanek and ViktorHofer as code owners June 22, 2026 15:37

Copilot started reviewing on behalf of Evangelink June 22, 2026 15:37

Copilot AI reviewed Jun 22, 2026

Comment thread eng/skill-validator/src/Check/CheckCommand.cs

Comment thread eng/skill-validator/src/Check/SkillProfiler.cs

github-actions Bot added the waiting-on-author label (PR state label) Jun 22, 2026

github-actions Bot added the pr-state/ready-for-eval label (PR is mergeable and awaiting evaluation) and removed the waiting-on-author label (PR state label) Jun 22, 2026

github-actions Bot added a commit that referenced this pull request Jun 22, 2026

Update PR token usage data (PR #803)

4a6cb0f

AbhitejJohn approved these changes Jun 22, 2026

github-actions Bot added the ready-to-merge label (PR state label) and removed the pr-state/ready-for-eval label (PR is mergeable and awaiting evaluation) Jun 22, 2026

Merge branch 'main' into restore-15k-skill-budget

bbc43d3

Copilot AI review requested due to automatic review settings June 23, 2026 06:59

Copilot started reviewing on behalf of Evangelink June 23, 2026 07:00

Copilot AI reviewed Jun 23, 2026

Comment thread eng/skill-validator/src/Check/CheckCommand.cs Outdated

github-actions Bot added the waiting-on-author label (PR state label) and removed the ready-to-merge label (PR state label) Jun 23, 2026

Evangelink enabled auto-merge (squash) June 23, 2026 10:11

YuliiaKovalova approved these changes Jun 23, 2026

Evangelink wants to merge 4 commits into dotnet:main from Evangelink:restore-15k-skill-budget

4 participants
