CI: Windows e2e shards fail — Cypress binary missing on actions/cache miss (pnpm skips postinstall) #2957

Description

@dokterbob

Summary

All Windows e2e shards fail with "The cypress npm package is installed, but the Cypress binary is missing" before a single test runs. This blocks the required Run CI gate on every PR that touches the codebase, including PRs with zero Windows-specific changes.

Evidence

Two independent runs on unrelated branches show the identical signature:

| Run | Branch | windows-latest-5 | Other Windows shards | Ubuntu shards |
| --- | --- | --- | --- | --- |
| 27355767547 | fix/oauth-error-ux-1273 (PR #2955) | FAIL | Cancelled (fail-fast) | All pass |
| 27305937906 | feat/compact-cot-display | FAIL | Cancelled (fail-fast) | All pass |

The real failure is always on windows-latest-5 (the first shard to reach the test step). The other four Windows shards show "The operation was canceled", which is the GitHub Actions fail-fast matrix cancellation, not a failure of their own. No test executes on any Windows shard.

Error from windows-latest-5:

The cypress npm package is installed, but the Cypress binary is missing.
We expected the binary to be installed here:
D:\a\chainlit\chainlit\.cypress-cache\14.5.3\Cypress\Cypress.exe
 ELIFECYCLE  Command failed with exit code 1.

Root cause

.github/workflows/e2e-tests.yaml sets a custom binary location:

# line 45
CYPRESS_CACHE_FOLDER: ${{ github.workspace }}/.cypress-cache

and restores it with actions/cache keyed on the lockfile hash (lines 48–52):

- name: Cache Cypress binary
  uses: actions/cache@v5
  with:
    path: .cypress-cache
    key: cypress-${{ runner.os }}-${{ hashFiles('pnpm-lock.yaml') }}

On a cache miss (new runner image, new lockfile, cold cache), the workflow then runs pnpm install via ./.github/actions/pnpm-node-install (line 53). Because pnpm finds the cypress package already in its own store, it reuses it and skips the postinstall script — so the Cypress binary is never downloaded into CYPRESS_CACHE_FOLDER. pnpm test:e2e (line 65) then calls cypress run, which finds the package but not the binary and aborts.
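
To tell which case a given runner is in, the expected binary location can be probed directly before cypress run. The following is a minimal sketch, not part of the workflow: the function names are hypothetical, the Windows and macOS layouts follow the paths quoted in this issue, and the Linux layout (<version>/Cypress/Cypress) is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for inspecting a Cypress cache folder.

# Print the path under which the Cypress binary (or app bundle) is expected.
# Arguments: <cache_dir> <cypress_version> <platform: windows|macos|linux>
expected_cypress_binary() {
  local cache_dir="$1" version="$2" platform="$3"
  case "$platform" in
    windows) printf '%s/%s/Cypress/Cypress.exe\n' "$cache_dir" "$version" ;;
    macos)   printf '%s/%s/Cypress.app\n' "$cache_dir" "$version" ;;
    # Assumed layout; not shown in the issue.
    linux)   printf '%s/%s/Cypress/Cypress\n' "$cache_dir" "$version" ;;
    *)       echo "unknown platform: $platform" >&2; return 2 ;;
  esac
}

# Return 0 if the expected path exists, 1 if it does not,
# and 2 if the platform name is not recognised.
cypress_binary_present() {
  local path
  path="$(expected_cypress_binary "$@")" || return 2
  [ -e "$path" ]
}
```

For example, `cypress_binary_present "$CYPRESS_CACHE_FOLDER" 14.5.3 windows || echo "binary absent"` run after dependency install would show whether the cache restore or the postinstall actually populated the folder.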

The recent windows-latest to windows-2025 runner migration (GitHub notice: redirect by 2026-06-15) changed cache-hit/miss behavior on Windows and exposed this latent bug. Ubuntu runners happen to hit the cache more reliably, masking the same issue there.

Local reproduction (OS-agnostic)

The error is not Windows-specific — it fires whenever CYPRESS_CACHE_FOLDER points at a directory without the binary, regardless of OS:

# Mirror the CI cache-miss condition — point at an empty dir
CYPRESS_CACHE_FOLDER="$PWD/.cypress-cache-repro" pnpm exec cypress run \
  --spec cypress/e2e/oauth_auth/spec.cy.ts
# Output:
# No version of Cypress is installed in: .../.cypress-cache-repro/14.5.3/Cypress.app
# Cypress executable not found at: .../.cypress-cache-repro/14.5.3/Cypress.app/.../Cypress

Confirmed locally on macOS (darwin-arm64, Cypress 14.5.3). The proposed fix also validates locally:

CYPRESS_CACHE_FOLDER="$PWD/.cypress-cache-repro" pnpm exec cypress install
# [SUCCESS] Finished Installation   .../.cypress-cache-repro/14.5.3

CYPRESS_CACHE_FOLDER="$PWD/.cypress-cache-repro" pnpm exec cypress verify
# [SUCCESS] Verified Cypress!       .../.cypress-cache-repro/14.5.3/Cypress.app

rm -rf .cypress-cache-repro

Proposed fixes (ranked)

Fix 1 — Explicit cypress install step (recommended)

Add an idempotent pnpm exec cypress install step in e2e-tests.yaml after dependency install and before pnpm test:e2e. This is the Cypress-recommended CI pattern. It is a no-op when the cache hits, and repairs the binary when it doesn't.

      - uses: ./.github/actions/pnpm-node-install
        name: Install Node, pnpm and dependencies.
      # ADD THIS:
      - name: Ensure Cypress binary is installed
        run: pnpm exec cypress install
        shell: bash
      - name: Verify Cypress binary
        run: pnpm exec cypress verify
        shell: bash
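
The same check-then-repair logic can also be kept in one script so that CI and local runs share it. This is only a sketch under stated assumptions: ensure_cypress_binary and the CYPRESS_CMD override are hypothetical names introduced here (CYPRESS_CMD is not a Cypress or pnpm setting), and only the cypress verify and cypress install subcommands already used above are invoked.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: verify the Cypress binary, install it if verification
# fails, then verify again so a broken download still fails the step.
# CYPRESS_CMD is an illustrative override that defaults to "pnpm exec cypress".
ensure_cypress_binary() {
  local -a cmd
  read -r -a cmd <<< "${CYPRESS_CMD:-pnpm exec cypress}"

  if "${cmd[@]}" verify; then
    echo "Cypress binary already present"
    return 0
  fi

  echo "Cypress binary missing; installing"
  "${cmd[@]}" install && "${cmd[@]}" verify
}
```

Calling `ensure_cypress_binary` with CYPRESS_CACHE_FOLDER set as in the workflow should behave like the two steps above. The plain cypress install step remains the simpler option, since it is already idempotent.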

Fix 2 — Adopt cypress-io/github-action

Replace the manual install + pnpm test:e2e with cypress-io/github-action, which correctly manages binary installation and caching on all platforms.

Fix 3 — Force pnpm to run the postinstall

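After pnpm install, run pnpm rebuild cypress so that pnpm re-runs the package's postinstall and downloads the binary even on a store hit. If the pnpm version in use blocks dependency build scripts by default, cypress may also need to be allowed explicitly (for example via pnpm's onlyBuiltDependencies setting). Note that --unsafe-perm controls UID/GID switching when scripts run and is unlikely to re-enable a skipped script on its own.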

Secondary improvement — disable fail-fast

The e2e matrix currently has no fail-fast: false:

strategy:
  matrix:
    os: [ubuntu-latest, windows-latest]
    containers: ${{ fromJSON(needs.prepare.outputs.indexes) }}

When windows-latest-5 fails, the other four Windows shards are cancelled, masking their individual results. Adding fail-fast: false would let all shards complete independently and surface the real per-shard status:

strategy:
  fail-fast: false
  matrix:
    ...

Acceptance criteria

  • Windows e2e shards reach and run specs (or at least pass the binary check) on a clean cache.
  • The Run CI gate goes green for PRs without Windows-specific code changes.
  • (Optional) All 5 Windows shards report their own status rather than cancelled.
