Add extract-to-cache build mode for the Copilot CLI#1450
Conversation
Splits CLI provisioning into two modes selected by the existing
`bundled-cli` cargo feature:
- `bundled-cli` on (default, release): unchanged. build.rs downloads,
extracts on first use via embeddedcli::install_at.
- `bundled-cli` off (dev): build.rs downloads, SHA-verifies, and
extracts the binary directly into <cache>/github-copilot-sdk/cli/
<version>/ (staging dir + atomic rename). Emits a build-time
COPILOT_CLI_DEV_PATH env var that resolve.rs returns directly.
Replaces the three-source version resolution chain in build.rs
(COPILOT_CLI_VERSION env, bundled_cli_version.txt snapshot, lockfile
fallback) with a two-source chain:
- cli-version.txt at the crate root (published crates, vendored slots).
- ../nodejs/package-lock.json (mono-repo contributor build) — matches
the .NET _GetCopilotCliVersion MSBuild target and Go cmd/bundler
conventions.
Cache layout flips from <cache>/github-copilot-sdk-{version}/ to the
shared <cache>/github-copilot-sdk/cli/<version>/ so both build modes
populate the same directory. Old per-user caches become orphaned
(harmless; the next launch re-populates the new path).
COPILOT_CLI_PATH env override still wins in both modes. Stale env
override falls through to the next source with a warn! log.
Drops:
- rust/scripts/snapshot-bundled-cli-version.sh (replaced by an inline
node one-liner in the publish workflow that writes cli-version.txt
from nodejs/package-lock.json before cargo publish).
- COPILOT_CLI_VERSION env override path.
- bundled_cli_version.txt snapshot path + parse_snapshot.
- cargo:rerun-if-env-changed=COPILOT_CLI_VERSION.
Adds tests/cli_resolution_test.rs covering pin-file format, env
override (both modes), stale env override fallthrough, and dev-mode
extracted binary path. The existing rust-sdk-tests CI job runs with
--no-default-features so dev-mode resolution is exercised on every PR;
the bundle job exercises embed mode. Both jobs now share the
BUNDLED_CLI_CACHE_DIR archive cache to avoid double-downloading.
Co-authored-by: Copilot <[email protected]>
Term isn't established in this repo's prose. The fallback path that reads ../nodejs/package-lock.json fires for contributor builds inside github/copilot-sdk itself; spelling it out is clearer than introducing a new term. Co-authored-by: Copilot <[email protected]>
Without a committed SHA snapshot, build.rs would fetch both `SHA256SUMS.txt` and the archive from the same release URL on every consumer build. An attacker who later compromises a release tag could swap both files together and the hash check would pass against their own sum — the verification would only protect against transit corruption, not against upstream tampering. Restore the snapshot model: - New `rust/scripts/write-cli-shas.sh` writes `cli-shas.txt` with one `<asset>=<hex sha>` line per supported platform, fetched from the release matching `cli-version.txt`. Gitignored locally. - `build.rs::read_expected_sha` prefers the snapshot when present; falls back to live `SHA256SUMS.txt` only when absent (contributor builds inside github/copilot-sdk). - Publish workflow runs `write-cli-shas.sh` after writing cli-version.txt and before `cargo publish`, so the published crate carries hashes captured at publish time. Same flow as the old snapshot-bundled-cli-version.sh, just split across one-line pin file + sibling SHAs file. - `Cargo.toml` includes `cli-shas.txt` in the crate package. Verified locally: - Snapshot path: build with both `cli-version.txt` + `cli-shas.txt` present — skips the live SHA fetch, downloads + verifies against the committed hash. - Tamper detection: corrupting a hash in `cli-shas.txt` causes the build to panic with the existing "could indicate a supply-chain attack" message. - Fallback path: with neither pin file present, build.rs reads version from `../nodejs/package-lock.json` and fetches SHA256SUMS.txt live — unchanged from the pre-snapshot contributor flow. Co-authored-by: Copilot <[email protected]>
Reduces churn by going back to one combined snapshot file: - File: cli-version.txt (matches the design-doc name) holds both the version line and per-platform SHA-256 hashes. Same combined format the proven bundled_cli_version.txt used. - Script: snapshot-bundled-cli-version.sh restored byte-for-byte from history except for the OUTPUT filename. Same logic, same name, same output structure. - build.rs: restored the proven resolve_version_and_hash + parse_snapshot pair (minus the dropped COPILOT_CLI_VERSION env path). Drops the cli-shas.txt / read_expected_sha / parse_sha_snapshot helpers I introduced in the previous commit — those split the snapshot into two files for no real benefit. - Publish workflow: one step calling the script, like before. Cargo.toml include[] and .gitignore drop cli-shas.txt. Test for pin format updated to validate the combined lines. Verified locally: cargo check passes in both feature modes with the snapshot present, cargo test --no-default-features passes (4/4). Co-authored-by: Copilot <[email protected]>
Self-review (GPT 5.5) catch:
- CliProgram::Resolve docstring still said 'PATH + common locations' for
the third source; correct it to 'dev cache'.
- ClientOptions docstring still implied bundled-cli was the only
auto-resolve path; spell out the env var + dev-cache fallback.
- bundled_cli_extract_dir field doc had the old hyphenated cache path
(github-copilot-sdk-{version}); update to the shared
github-copilot-sdk/cli/<version>/ layout.
- bundled_cli_extract_dir + with_bundled_cli_extract_dir now explicitly
document that the option is ignored in default-features=false dev
builds (build.rs has already extracted; use CliProgram::Path or
COPILOT_CLI_PATH for an override).
Co-authored-by: Copilot <[email protected]>
Previous version staged a whole per-version directory and renamed it into place. fs::rename for files is atomic on Windows (MoveFileExW + MOVEFILE_REPLACE_EXISTING), but fs::rename for directories is NOT — if install_dir already exists (partial-extract recovery, stale install from a crashed build, etc.) Windows refuses the rename and we'd panic. Switch to file-level staging: - create_dir_all(install_dir) up front (idempotent, fine on Windows even if it exists with content). - Write extracted binary to a staging *file* alongside its final location (.copilot[.exe].staging-<pid>-<nanos>). - chmod 755 on Unix only. - fs::rename(staging_file, final_path) — atomic file-replace on both Unix and Windows. If a concurrent build won the race their bytes are the same (SHA-verified inputs), so replacement is safe. - On error, remove the staging file (a file, not remove_dir_all of a directory). Verified extraction still works end-to-end on macOS (dev-mode test passes, extracted copilot --version prints expected output). Windows release zip confirmed to be the same single-file layout (copilot.exe at root, no subdir) so the extraction code's name match applies cleanly there too. Windows CI already exercises both default-features and --no-default-features jobs. Co-authored-by: Copilot <[email protected]>
User caught this on review: embed mode's embeddedcli::install (which
already ships to Windows) doesn't stage. It does:
create_dir_all(install_dir)?
if final_path.is_file() { return Ok(final_path); }
write_binary(final_path, bytes)
That's it. My extract_to_cache invented a staging-file + atomic-rename
dance that doesn't exist in the proven path, then I 'discovered' my own
over-engineering didn't work on Windows (directory rename), 'fixed' it
with a file-level rename, and was still strictly more complex than what
already works.
Match the embed pattern exactly. Same idempotency check, same direct
write, same #[cfg(unix)] chmod gate. Race semantics across concurrent
build.rs invocations are identical to the embed install at runtime, and
both write SHA-verified-identical bytes. If two builds race, last write
wins with identical contents.
Verified end-to-end on macOS: dev-mode test passes, extracted
copilot --version runs.
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Pull request overview
This PR updates the Rust SDK’s Copilot CLI provisioning so that dev builds (default-features = false, i.e. bundled-cli off) download + verify the CLI at build time and extract it directly into the per-user cache, while release builds continue embedding the verified archive and lazily extracting at runtime.
Changes:
- Add dev-mode “extract-to-cache” provisioning in
build.rs, emittingCOPILOT_CLI_DEV_PATHfor runtime resolution. - Update Rust runtime CLI resolution order and docs to include the dev-cache path and new cache layout.
- Add Rust integration tests and CI cache plumbing for the new build behavior.
Show a summary per file
| File | Description |
|---|---|
| rust/tests/cli_resolution_test.rs | New integration tests covering env override, dev-cache extraction, and pin-file format. |
| rust/src/resolve.rs | Adds dev-cache resolution path when bundled-cli is disabled; updates docs/warnings. |
| rust/src/lib.rs | Gates embeddedcli on bundled-cli and updates public docs for resolution/extract-dir behavior. |
| rust/src/embeddedcli.rs | Updates docs + cache layout to github-copilot-sdk/cli/<version>/… and gates APIs on bundled-cli. |
| rust/scripts/snapshot-bundled-cli-version.sh | Renames snapshot output to cli-version.txt. |
| rust/README.md | Updates embedded/dev provisioning explanation, cache layout, and resolution priority docs. |
| rust/Cargo.toml | Renames packaged snapshot file to cli-version.txt; adds dirs build dependency. |
| rust/build.rs | Implements shared download+verify; branches into embed vs extract-to-cache modes; introduces dev-cache extraction. |
| rust/.gitignore | Ignores cli-version.txt instead of the old snapshot filename. |
| .github/workflows/rust-sdk-tests.yml | Adds CLI-version keying and shares the build.rs archive cache between test/bundle jobs. |
| .github/workflows/publish.yml | Updates publish verification step for cli-version.txt. |
Copilot's findings
- Files reviewed: 11/11 changed files
- Comments generated: 8
embeddedcli::default_install_dir runs sanitize_version on the version string before using it as a path component, replacing anything outside [a-zA-Z0-9._-] with _. extract_to_cache did not. For today's CLI versions (1.0.55-1) both resolve to identical paths because the version chars are already safe. But the invariant we documented — embed mode and dev mode share the same per-version cache directory — would silently break for any future version containing an unusual character: embed would write to <cache>/.../1.0.55_rc1/copilot while dev wrote to <cache>/.../1.0.55-rc1/copilot (etc). Duplicate sanitize_version into build.rs (small, can't share with the crate it's building) and note it's kept in sync with the embed-mode copy. No other behavior change. Co-authored-by: Copilot <[email protected]>
rust/tests/cli_resolution_test.rs (resolves 5 review threads):
- Replace `display.contains("/cli/")` with a Path::components() walk so
the assertion is portable across Windows backslashes. This was the
actual Windows CI failure (PR #1450 run 26493115251).
- Re-gate dev_mode_extracted_binary_exists on cfg(has_dev_cli) so it
compiles cleanly on unsupported target platforms where build.rs
doesn't emit COPILOT_CLI_DEV_PATH.
- Replace substring "BinaryNotFound" / "not bundled" check in
stale_env_override_falls_through with a variant match on
Error::BinaryNotFound (the Display output is lowercase, so the old
check passed even when fallthrough was broken).
- Rewrite the comment in env_override_resolves_to_pointed_file: the
test never chmods the tempfile and doesn't need to (resolver only
does is_file()).
- Switch env-touching tokio tests to flavor = "current_thread" and
rewrite the safety comment on set_env/unset_env to honestly describe
what serial + current_thread give us (and what they don't).
rust/src/resolve.rs:
- Drop the "matches the .NET and TypeScript SDKs" claim from the
module doc; this resolver now has an extra dev-cache step.
rust/README.md:
- Add a callout warning that default-features = false produces a build
that is *not* self-contained; only intended for local development.
Production / distributed builds should keep bundled-cli on or supply
COPILOT_CLI_PATH at runtime.
Validated: cargo test (both feature configs), clippy, nightly fmt,
RUSTDOCFLAGS="-D warnings" cargo doc.
Co-authored-by: Copilot <[email protected]>
Caught while updating github/copilot-sdk#1450's body: it said things like 'the original spec proposed live-fetching SHA256SUMS.txt' and 'first pass used directory-level staging'. The cardinal rule 'describe the final state, not the dev journey' already covered this in spirit, but the existing examples in Common mistakes all read like they're about branch-evolution moments (force-push, fixup, switched approach). The slightly different mode here was contrasting against an external spec doc / design doc — same readability problem, different mechanism. Add a focused bullet under Common mistakes that calls this out explicitly and gives the affirmative move (update the doc instead of litigating it in the PR body).
…e, staging+rename Drop 'extract-to-cache mode' phrasing; clarify caveats when bundled-cli is off Phrase the disabled-bundled-cli case in terms of the feature flag itself rather than coining a 'mode' name. Strengthen the README caveat: the consumer must supply a compatible CLI via CliProgram::Path / COPILOT_CLI_PATH and is responsible for version compatibility; the build-machine auto-resolution is a convenience that does not carry over to other machines. Co-Authored-By: Copilot <[email protected]>
|
Thanks @tclem! The overall concept here looks great. I'm proposing a few amendments that I think generalize this more beyond the GitHub Copilot App workflows, but hopefully without taking away anything you want to use in those workflows. For speed, I did that by pushing an extra commit to your branch. If you don't like it, we can undo that commit and discuss! OverviewThe following are small adjustments on top of what you already built. 1. New env var: COPILOT_SKIP_CLI_DOWNLOADSetting COPILOT_SKIP_CLI_DOWNLOAD=1 at build time makes build.rs return immediately — no download, no bundle, no cache write — regardless of the bundled-cli feature. Lets consumers who always supply the CLI at runtime (via CliProgram::Path / COPILOT_CLI_PATH) build fully offline. 2. Skip redundant downloads when bundled-cli is offIf the extracted binary is already at the conventional path, build.rs now skips cached_download entirely. The extracted binary is the cache — no separate archive cache to maintain. Previously each clean build re-fetched the archive before discovering it didn't need it. 3. New env var: COPILOT_CLI_EXTRACT_DIR (symmetric)When set, both build.rs (write side) and the runtime resolver (read side) place/look-for the binary directly under that directory (no per-version subdir). Recommended consumer pattern is .cargo/config.toml [env] with relative = true, force = true so cargo applies the same value to build and to cargo run/cargo test. Gives consumers a way to relocate the extraction symmetrically (CI runners with ephemeral homes, vendored slots, etc.). 4. Bake only the version, not an absolute pathbuild.rs now always emits cargo:rustc-env=COPILOT_SDK_CLI_VERSION=. Both modes consume it. With bundled-cli off, the runtime resolver recomputes the path from COPILOT_SDK_CLI_VERSION + cfg!(windows) + optional COPILOT_CLI_EXTRACT_DIR — the same convention build.rs uses on the write side. Previously the absolute extracted path was baked into the rlib via env!("COPILOT_CLI_DEV_PATH"). That leaks $HOME/$LOCALAPPDATA into compiled artifacts and breaks sccache / cross-machine target/ reuse. The version string is ~12 bytes, machine-independent, and is already the natural single source of truth for both modes (embed-mode constants now consume it too, eliminating the prior risk of build-time / runtime version drift). 5. Atomic extraction (restored)extract_to_cache writes to a sibling - staging file and then fs::renames onto the final path. fs::rename of a file is atomic on both Unix and Windows, so a parallel cargo build on a fresh machine never observes a partially-written binary. Earlier commits in the PR had staging+rename and dropped it — putting it back. 6. Docs / terminology
TestsYour cli_resolution_test.rs is preserved and extended with two more cases:
Does this fit with what you need? |
|
yes, thanks! i actually had an unpushed commit doing some of that but yours is better! |
|
I'm going to fully round trip this with our client to make sure the dx lines up, but otherwise this is looking great! |
The download path emits a 'Downloading <url>' cargo warning on cache miss, but the extract step that follows is silent. With the separate download cache gone (extracted binary is the cache), a first build gave no surface telling the contributor where the ~160 MB binary landed. Mirror the existing warning style with a one-line 'Extracted Copilot CLI to <path>' after the atomic rename succeeds. Quiet on the hot path: caller short-circuits on is_file() so this only fires on a true cache miss. Co-authored-by: Copilot <[email protected]>
|
Small refinement: gate Observed while adopting this PR on the github-app consumer side: when this crate is built from a vendored slot (no sibling println!("cargo:rerun-if-changed=../nodejs/package-lock.json");Cargo treats a missing Suggested one-liner: only emit the lockfile rerun when the lockfile is actually present (i.e. when we're building inside the copilot-sdk repo, where it'd be the fallback source for version resolution): let lockfile = Path::new(env!("CARGO_MANIFEST_DIR")).join("../nodejs/package-lock.json");
if lockfile.exists() {
println!("cargo:rerun-if-changed={}", lockfile.display());
}
Net result: vendored consumers get true zero-cost rebuilds when nothing's changed; in-tree contributor builds still re-run when they bump the lockfile. |
Cargo treats a missing rerun-if-changed path as 'always rerun', so the unconditional 'cargo:rerun-if-changed=../nodejs/package-lock.json' declaration was forcing build.rs to re-run on every cargo build for consumers without a sibling nodejs/ directory (vendored slots, published crates). cli-version.txt is the source of truth in those deployments; the lockfile is only a fallback inside this repo. Now: declare the lockfile rerun only when the file actually exists. In-tree contributor builds keep re-running when @github/copilot is bumped via the lockfile; vendored consumers get true zero-cost rebuilds when nothing has changed. Co-authored-by: Copilot <[email protected]>
|
Fixed in
|
Implements the SDK side of the CLI provisioning redesign in github/github-app#5700. Dev consumers stop needing a separate CLI install, an env-var dance, or the binary-size hit of
bundled-cli. Release consumers behave exactly as before.Approach
One existing cargo feature (
bundled-cli, default on) selects between two modes that share the same per-version cache layout:build.rsdownloads + SHA-verifies the archive and embeds it viainclude_bytes!(). Runtime extracts on first call viasrc/embeddedcli.rs::install_at. Unchanged.build.rsdownloads + SHA-verifies the same archive and extracts the single binary directly into the per-user cache. Emitscargo:rustc-env=COPILOT_CLI_DEV_PATH=<path>andcargo:rustc-cfg=has_dev_cli. Runtime returns that path viaenv!().Both modes write to
<cache>/github-copilot-sdk/cli/<sanitized version>/<binary>. Extraction is idempotent (is_file()short-circuit + directfs::write+cfg(unix)chmod) and matches the embed-modeinstall()pattern that already ships to Windows.src/resolve.rsresolves the binary in this order:CliProgram::Path->COPILOT_CLI_PATH(warn on stale) -> embedded extract (whenbundled-cliis on) -> dev cache path (when off) ->Error::BinaryNotFound. No PATH scanning at any layer.Pin file
Version + per-asset SHA-256 hashes live in a single
rust/cli-version.txt, written byrust/scripts/snapshot-bundled-cli-version.sh. It is committed into published crates and into github-app's vendored slot. Format:build.rsreadscli-version.txtwhen present (published crates, vendored slots) and otherwise falls back to../nodejs/package-lock.jsonfor contributor builds inside this repo. Same convention as the .NET and Go SDKs.Committing the SHAs at publish time keeps the publish workflow as the supply-chain trust boundary: every consumer build verifies the downloaded archive against a hash captured before any later release-tag tampering could swap the archive and
SHA256SUMS.txttogether.Reviewer notes
ClientOptions::with_bundled_cli_extract_diris a no-op in dev mode. The binary's path is baked into the crate at build time; there's no embedded archive to re-extract. UseCliProgram::PathorCOPILOT_CLI_PATHto override at runtime.<cache>/github-copilot-sdk-{version}/to<cache>/github-copilot-sdk/cli/<version>/. Existing per-user installs become orphaned; harmless, users can clean whenever.BUNDLED_CLI_CACHE_DIRbetween thetest(--no-default-features, dev mode) andbundle(default features, embed) jobs so the archive doesn't download twice per matrix shard. Windows, macOS, and Linux are in the matrix for both.copilot/copilot.exe); the extraction code expects exactly that layout in both modes.## Embedded CLIhas a callout warning thatdefault-features = falseis dev-only (the resulting binary is not self-contained).Test plan
cargo test(default features): lib + 3 newcli_resolution_testcases.cargo test --no-default-features --features test-support: lib + 4 new cases (extra:dev_mode_extracted_binary_exists).cargo clippy --all-targets --features test-support -- -D warnings -D clippy::unwrap_used: clean.cargo fmt --check: clean.RUSTDOCFLAGS="-D warnings" cargo doc --no-deps --all-features: clean.cargo check --no-default-featurespopulates~/Library/Caches/github-copilot-sdk/cli/1.0.55-1/copilotand./copilot --versionruns.cli-version.txtreproduces the existing "Archive integrity check failed... could indicate a supply-chain attack" panic fromcached_download.The github-app consumer-side cleanup (removing
.sdk-cli-pin, the install script,ensure-copilot-cli.ts,dev_cli_program(), etc.) lands in a separate PR there.