-
Notifications
You must be signed in to change notification settings - Fork 525
Add Copilot-driven test coverage agent #4853
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
Copilot
wants to merge
7
commits into
main
Choose a base branch
from
copilot/add-copilot-test-coverage-agent
base: main
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
7 commits
Select commit
Hold shift + click to select a range
a4f7acf
Add Copilot-driven test coverage agent (4-file system)
Copilot 2322c5b
Update coverage tooling to use c8 instead of nyc and adjust related c…
924bc03
updated lock file
bbbf730
enhance test coverage protocols and scripts for Copilot integration
0d5dfa7
switch to use spawnSync for gh CLI commands and improve error handling
cd1dc0a
Move copilot-test-coverage to .github/instructions/ and update refere…
hanniavalera 9f84254
Add explicit PR template instructions in issue body and agent protocol
Copilot File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,33 @@ | ||
| --- | ||
| name: Coverage improvement | ||
| about: PR opened by the Copilot coverage agent | ||
| --- | ||
|
|
||
| ## Coverage improvement | ||
|
|
||
| ### What this PR covers | ||
| <!-- List which files' coverage improved and by how much --> | ||
| | File | Before | After | Δ | | ||
| |------|--------|-------|---| | ||
| | | | | | | ||
|
|
||
| ### Closes | ||
| <!-- Link the coverage issue: Closes #NNN --> | ||
|
|
||
| ### Self-audit results | ||
|
|
||
| - [ ] `yarn backendTests` passes | ||
| - [ ] Every file listed in the issue improved by ≥ 10 percentage points OR reached ≥ 60% line coverage | ||
| - [ ] No test uses `assert.ok(true)` or is an empty stub | ||
| - [ ] Test names describe behavior, not implementation | ||
| - [ ] No test depends on another test's side effects | ||
| - [ ] Presets-mode and kits/variants mode both exercised where relevant | ||
| - [ ] Single-config and multi-config generator paths both tested where relevant | ||
| - [ ] `yarn lint` passes | ||
| - [ ] `CHANGELOG.md` has an entry under `Improvements:` | ||
| - [ ] Tests are in `test/unit-tests/backend/` (runnable without VS Code host) | ||
|
|
||
| ### Coverage delta (paste `c8` text report here) | ||
| ``` | ||
| (paste output of: npx c8 report --reporter=text) | ||
| ``` | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
77 changes: 77 additions & 0 deletions
77
.github/instructions/copilot-test-coverage.instructions.md
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,77 @@ | ||
| --- | ||
| description: "Instructions for the Copilot coding agent when working on test-coverage issues." | ||
| applyTo: "test/unit-tests/**/*.ts,src/**/*.ts" | ||
|
hanniavalera marked this conversation as resolved.
|
||
| --- | ||
|
|
||
| # Test Coverage Agent — Self-Audit Protocol | ||
|
|
||
| You are improving test coverage for `microsoft/vscode-cmake-tools`. | ||
| This file contains mandatory protocol. Read all of it before writing code. | ||
|
|
||
| ## Step 1 — Orient before writing | ||
|
|
||
| - Read every source file you will test in full | ||
| - Read existing tests for that module (if any) in `test/unit-tests/backend/` | ||
| - Identify every exported function, class, and branch condition | ||
| - Do **not** write tests for private implementation details — test the public API | ||
| - If a source file deeply depends on `vscode` APIs, skip it — note in the PR that it needs integration-test coverage | ||
|
|
||
| ## Step 2 — Write real tests, not stubs | ||
|
|
||
| Every test must: | ||
| - Have a descriptive name: `'expandString handles undefined variable in preset context'` | ||
| - Assert exactly one logical behavior per `test()` block | ||
| - Not depend on side effects from another test | ||
| - Use `assert.strictEqual` / `assert.deepStrictEqual` over loose equality | ||
| - For async code: `await` and assert the resolved value — never `.then()` | ||
|
|
||
| ## Step 3 — Run the full self-audit before opening the PR | ||
|
|
||
| ```bash | ||
| # 1. Type-check test files | ||
| npx tsc -p test.tsconfig.json --noEmit | ||
|
|
||
| # 2. Lint | ||
| yarn lint | ||
|
|
||
| # 3. Run backend tests (this is the primary validation step) | ||
| yarn backendTests | ||
|
|
||
| # 4. Confirm coverage improved for the specific file | ||
| npx c8 --all --reporter=text --src=src \ | ||
| node ./node_modules/mocha/bin/_mocha \ | ||
| -u tdd --timeout 999999 --colors \ | ||
| -r ts-node/register \ | ||
| -r tsconfig-paths/register \ | ||
| "./test/unit-tests/backend/**/*.test.ts" | ||
| ``` | ||
|
|
||
| All steps must pass. If any fail, fix the failures before opening the PR. | ||
|
|
||
| > **Note:** `yarn unitTests` requires a VS Code extension host and a display server. | ||
| > It cannot be run in headless agent environments. The CI workflow validates those | ||
| > separately — you only need to run `yarn backendTests` and `yarn lint` locally. | ||
|
|
||
| ## Step 4 — Coverage bar | ||
|
|
||
| Do **not** open the PR unless every file listed in the issue has either: | ||
| - improved by ≥ 10 percentage points from the baseline in the issue, **OR** | ||
| - reached ≥ 60% line coverage | ||
|
|
||
| ## Test quality rules specific to this repo | ||
|
|
||
| | Rule | Why | | ||
| |------|-----| | ||
| | Test both `useCMakePresets` branches where the source branches on it | Most bugs affect only one mode | | ||
| | Test both single-config and multi-config generator paths where relevant | `CMAKE_BUILD_TYPE` vs `--config` are frequent bug sources | | ||
| | For `src/expand.ts` changes: test every macro type | `copilot-instructions.md` explicitly mandates this | | ||
| | For `src/diagnostics/`: test each compiler family's parser | `diagnostics.test.ts` is the largest test file — keep it comprehensive | | ||
| | Use `@cmt/` path alias for imports from `src/` | Never use relative paths from outside `src/` | | ||
| | Never use `console.log` in test files | Use the module logger or plain `assert` | | ||
|
|
||
| ## PR requirements | ||
|
|
||
| - Branch name: `coverage/<module-name>-tests` | ||
| - Open as **ready for review** only after the self-audit checklist in the issue is fully checked | ||
| - **PR description must use the coverage template**: copy the contents of `.github/PULL_REQUEST_TEMPLATE/coverage.md` into the PR body (or append `?template=coverage.md` to the PR creation URL). Fill in the coverage-delta table and check every self-audit box. | ||
| - `CHANGELOG.md` must have one entry under `Improvements:` | ||
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,217 @@ | ||
| #!/usr/bin/env node | ||
| // @ts-check | ||
| import { readFileSync, writeFileSync, unlinkSync } from 'fs'; | ||
| import { spawnSync } from 'child_process'; | ||
| import { relative, resolve, join } from 'path'; | ||
| import { tmpdir } from 'os'; | ||
|
|
||
| const THRESHOLD = Number(process.env.THRESHOLD ?? 60); | ||
| const SUMMARY_PATH = 'coverage/coverage-summary.json'; | ||
| const REPO = process.env.GITHUB_REPOSITORY; // e.g. "microsoft/vscode-cmake-tools" | ||
|
|
||
| if (!REPO) { | ||
| console.error('GITHUB_REPOSITORY is not set — are you running outside GitHub Actions?'); | ||
| process.exit(1); | ||
| } | ||
|
|
||
| const RUN_URL = `https://github.com/${REPO}/actions/runs/${process.env.GITHUB_RUN_ID}`; | ||
|
|
||
| // ── 1. Read Istanbul JSON summary ──────────────────────────────────────────── | ||
| let summary; | ||
| try { | ||
| summary = JSON.parse(readFileSync(SUMMARY_PATH, 'utf8')); | ||
| } catch { | ||
| console.log('No coverage summary found — skipping issue creation.'); | ||
| process.exit(0); | ||
| } | ||
|
|
||
| // ── 2. Find files below threshold ──────────────────────────────────────────── | ||
| const belowThreshold = []; | ||
| const repoRoot = resolve('.'); | ||
|
|
||
| for (const [file, metrics] of Object.entries(summary)) { | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Have we tested this locally to validate that it works? |
||
| if (file === 'total') continue; | ||
|
|
||
| // Normalize: c8/Istanbul may emit absolute paths or repo-relative paths | ||
| const relPath = file.startsWith('/') || /^[A-Za-z]:[/\\]/.test(file) | ||
| ? relative(repoRoot, file).replace(/\\/g, '/') | ||
| : file; | ||
|
|
||
| if (!relPath.startsWith('src/')) continue; | ||
|
|
||
| const linePct = metrics.lines.pct ?? 0; | ||
| const branchPct = metrics.branches.pct ?? 0; | ||
| const fnPct = metrics.functions.pct ?? 0; | ||
|
|
||
| if (linePct < THRESHOLD || fnPct < THRESHOLD) { | ||
| belowThreshold.push({ file: relPath, linePct, branchPct, fnPct }); | ||
| } | ||
| } | ||
|
|
||
| const total = summary.total ?? {}; | ||
| const totalLines = total.lines?.pct ?? 0; | ||
|
|
||
| console.log(`\nTotal line coverage: ${totalLines}% (threshold: ${THRESHOLD}%)`); | ||
| console.log(`Files below threshold: ${belowThreshold.length}`); | ||
|
|
||
| if (belowThreshold.length === 0 && totalLines >= THRESHOLD) { | ||
| console.log('✅ Coverage is above threshold — no issue needed.'); | ||
| process.exit(0); | ||
| } | ||
|
|
||
| // ── 3. Check for existing open coverage issue (avoid duplicates) ───────────── | ||
| let existingIssueNumber = null; | ||
| try { | ||
| const result = spawnSync('gh', [ | ||
| 'issue', 'list', | ||
| '--repo', REPO, | ||
| '--label', 'test-coverage', | ||
| '--state', 'open', | ||
| '--json', 'number', | ||
| '--jq', '.[0].number' | ||
| ], { encoding: 'utf8' }); | ||
| if (result.status === 0 && result.stdout.trim()) { | ||
| existingIssueNumber = result.stdout.trim(); | ||
| } | ||
| } catch { | ||
| // gh CLI may fail if label doesn't exist yet — continue to create | ||
| } | ||
|
|
||
| // ── 4. Build the issue body (this is the Copilot agent's instruction set) ──── | ||
| belowThreshold.sort((a, b) => a.linePct - b.linePct); // worst files first | ||
|
|
||
| const tableRows = belowThreshold | ||
| .slice(0, 20) // cap at 20 files per issue to keep it focused | ||
| .map(f => `| \`${f.file}\` | ${f.linePct}% | ${f.branchPct}% | ${f.fnPct}% |`) | ||
| .join('\n'); | ||
|
|
||
| const fileList = belowThreshold | ||
| .slice(0, 20) | ||
| .map(f => `- \`${f.file}\` — ${f.linePct}% line coverage`) | ||
| .join('\n'); | ||
|
|
||
| const remainingCount = belowThreshold.length - Math.min(belowThreshold.length, 20); | ||
| const remainingNote = remainingCount > 0 | ||
| ? `\n\n> **${remainingCount} additional files** are also below threshold. They will appear here once the files above improve.` | ||
| : ''; | ||
|
|
||
| const issueBody = `\ | ||
| ## Coverage below ${THRESHOLD}% threshold | ||
|
|
||
| > **This issue is the instruction set for the GitHub Copilot coding agent.** | ||
| > Copilot: read this entire body before writing a single line of code. | ||
|
|
||
| **Coverage run:** ${RUN_URL} | ||
| **Total line coverage:** ${totalLines}% — threshold is ${THRESHOLD}% | ||
|
|
||
| ### Files requiring new tests | ||
|
|
||
| | File | Lines | Branches | Functions | | ||
| |------|-------|----------|-----------| | ||
| ${tableRows} | ||
|
|
||
| --- | ||
|
|
||
| ### Agent instructions | ||
|
|
||
| You are improving test coverage in \`microsoft/vscode-cmake-tools\`. | ||
| Read \`.github/instructions/copilot-test-coverage.instructions.md\` before starting — it contains the | ||
| mandatory self-audit protocol and test quality rules for this repo. | ||
|
|
||
| **Files to cover (worst first):** | ||
| ${fileList} | ||
|
|
||
| For each file: | ||
| 1. Read the source file fully before writing any test | ||
| 2. Identify the module's exported API surface | ||
| 3. Write tests in \`test/unit-tests/backend/\` that cover the uncovered branches | ||
| 4. Run the self-audit steps from \`.github/instructions/copilot-test-coverage.instructions.md\` | ||
| 5. Only open the PR after every self-audit step passes | ||
|
|
||
| ### Self-audit checklist (must be checked before opening PR) | ||
|
|
||
| - [ ] \`yarn backendTests\` passes with no new failures | ||
| - [ ] Each file listed above improved by ≥ 10 percentage points OR reached ≥ ${THRESHOLD}% line coverage | ||
| - [ ] No test uses \`assert.ok(true)\` or is an empty stub | ||
| - [ ] Test names describe behavior: \`'expandString handles undefined variable'\` not \`'test 1'\` | ||
| - [ ] No test depends on another test's side effects | ||
| - [ ] Presets-mode and kits/variants mode both exercised where the source branches on \`useCMakePresets\` | ||
| - [ ] Single-config and multi-config generator paths both tested where relevant | ||
| - [ ] \`yarn lint\` passes | ||
| - [ ] \`CHANGELOG.md\` has an entry under \`Improvements:\` | ||
|
|
||
| ### Scope and testability | ||
|
|
||
| Coverage is collected only from \`yarn backendTests\` (\`test/unit-tests/backend/\`). | ||
| New tests **must** go in \`test/unit-tests/backend/\` and run in plain Node.js (no VS Code host). | ||
|
|
||
| - If a \`src/\` file can be imported directly (no \`vscode\` dependency at import time), write backend tests for it. | ||
| - If a file deeply depends on \`vscode\` APIs (e.g., \`extension.ts\`, \`projectController.ts\`, UI modules), **skip it** — add a comment in the PR noting it needs integration-test coverage instead. | ||
| - Pure logic modules (\`expand.ts\`, \`shlex.ts\`, \`cache.ts\`, \`diagnostics/\`, \`preset.ts\`) are ideal targets. | ||
| ${remainingNote} | ||
|
|
||
| ### PR template | ||
|
|
||
| When opening the pull request, **you must use the coverage PR template**. | ||
| Include \`?template=coverage.md\` in the PR creation URL, or copy the contents of | ||
| \`.github/PULL_REQUEST_TEMPLATE/coverage.md\` into the PR description and fill in | ||
| the coverage table and checklist. | ||
|
|
||
| ### Constraints | ||
|
|
||
| - Tests go in \`test/unit-tests/backend/\` — use Mocha \`suite\`/\`test\` with \`assert\` | ||
| - Validate with \`yarn backendTests\`, not \`yarn unitTests\` (which requires a VS Code host) | ||
| - Import source under test via the \`@cmt/\` path alias | ||
| - Do **not** open the PR as a draft if the self-audit fails — fix it first | ||
| - Do **not** touch source files outside \`test/\` | ||
| `; | ||
|
|
||
| // ── 5. Ensure the test-coverage label exists ───────────────────────────────── | ||
| { | ||
| const result = spawnSync('gh', [ | ||
| 'label', 'create', 'test-coverage', | ||
| '--repo', REPO, | ||
| '--color', '0075ca', | ||
| '--description', 'Opened by the coverage agent', | ||
| '--force' | ||
| ], { stdio: 'inherit' }); | ||
| // Non-zero is acceptable here — label may already exist without --force support | ||
| if (result.error) { | ||
| console.warn('Warning: could not ensure test-coverage label exists:', result.error.message); | ||
| } | ||
| } | ||
|
|
||
| // ── 6. Create or update the coverage issue ─────────────────────────────────── | ||
| const title = `Test coverage below ${THRESHOLD}% threshold — ${belowThreshold.length} files need tests (${new Date().toISOString().slice(0, 10)})`; | ||
|
|
||
| // Write body to a temp file to avoid shell injection via file paths in the body | ||
| const bodyFile = join(tmpdir(), `coverage-issue-body-${Date.now()}.md`); | ||
| writeFileSync(bodyFile, issueBody, 'utf8'); | ||
|
|
||
| try { | ||
| let result; | ||
| if (existingIssueNumber) { | ||
| console.log(`\nUpdating existing issue #${existingIssueNumber}`); | ||
| result = spawnSync('gh', [ | ||
| 'issue', 'edit', existingIssueNumber, | ||
| '--repo', REPO, | ||
| '--title', title, | ||
| '--body-file', bodyFile | ||
| ], { stdio: 'inherit' }); | ||
| } else { | ||
| console.log(`\nOpening issue: ${title}`); | ||
| result = spawnSync('gh', [ | ||
| 'issue', 'create', | ||
| '--repo', REPO, | ||
| '--title', title, | ||
| '--body-file', bodyFile, | ||
| '--label', 'test-coverage' | ||
| ], { stdio: 'inherit' }); | ||
| } | ||
| if (result.status !== 0) { | ||
| console.error(`gh command failed with exit code ${result.status}`); | ||
| process.exit(result.status ?? 1); | ||
| } | ||
| } finally { | ||
| try { unlinkSync(bodyFile); } catch { /* cleanup best-effort */ } | ||
| } | ||
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any specific thing that is supposed to read this? Copilot will grab it when it searches for files, but from what I can tell, it doesn't match any filename format that copilot cli inherently reads, so I'm not sure it will get reliably run.
Is there something I'm missing here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see! Thanks for taking a look at this :) The template isn't auto-applied by GitHub or Copilot, it has to be explicitly referenced. The coding agent is instructed to use it in
.github/instructions/copilot-test-coverage.instructions.md, but relying on the agent consistently reading and following that is fragile. I'll either rename it toPULL_REQUEST_TEMPLATE.mdso it becomes the default single template, or add an explicit instruction in the issue body itself directing the agent to use the?template=coverage.mdURL when opening the PR. Leaning toward the latter since the first would affect all PRs.