fix(dind): scope built image tags per job (latent concurrent-build race) #90

Merged: luthermonson merged 1 commit into main from fix/dind-build-namespace-isolation on Jun 10, 2026
luthermonson commented on Jun 10, 2026

Summary

Fixes a latent concurrent-build race in the dind layer: all jobs share one BuildKit worker writing into one flat "buildkit" containerd namespace, keyed by raw user tag. Two concurrent jobs that both run docker build -t same:tag overwrite each other's image record: the last build wins, and the losing job's subsequent docker push ships the other job's bytes.

Provenance note: this was found while investigating an ephpm E2E matrix failure that looked exactly like this race (PHP 8.4 job serving the 8.5 binary). That incident turned out to be unrelated — the php-sdk release artifacts were mispackaged upstream (spc download --with-php=8.4.21 silently falling back to latest stable), and ephpm's images flow via buildx --output dest=<tarball> + kind load image-archive, never transiting the shared store. No active failure is attributed to this bug. The race is nonetheless real and provable from the code, so the fix stands on its own.

The race (from code, not from an incident)

  • cmd/ephemerd/main.go constructs one buildkit.Server with ContainerdNamespace: "buildkit" and hands it to every per-job dind server.
  • pkg/dind/buildkit_build.go exported builds under the raw user tag — docker build -t foo:dev from any job writes image record foo:dev in the shared namespace.
  • pkg/dind/registry.go (docker push) reads tags back from the same flat namespace.

containerd image records map name → digest, so concurrent same-tag builds mean the last writer wins, which amounts to cross-job image substitution for any workflow that builds and pushes through the dind socket. ephemerd's per-job isolation (containerd namespaces ephemerd-dind-<job-id>) covers containers and execs but never covered built tags.
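
To make the last-writer-wins behavior concrete, here is a minimal sketch that models the shared namespace as a plain map from image name to digest. The imageStore type, its methods, and the digest strings are hypothetical placeholders for illustration; this is not containerd's API or ephemerd's code.

```go
package main

import "fmt"

// imageStore is a hypothetical stand-in for one flat containerd namespace:
// a map from image name to content digest.
type imageStore map[string]string

// build records a build result under the raw user tag, overwriting any
// existing record that has the same name.
func (s imageStore) build(tag, digest string) { s[tag] = digest }

// push returns the digest that a push of the given tag would ship.
func (s imageStore) push(tag string) string { return s[tag] }

func main() {
	shared := imageStore{}

	shared.build("foo:dev", "sha256:job-a") // job A builds foo:dev
	shared.build("foo:dev", "sha256:job-b") // job B builds the same tag

	// Job A now pushes foo:dev and ships whatever record is current.
	fmt.Println("job A pushes:", shared.push("foo:dev"))
}
```

Because both jobs write the same key, job A's push resolves to job B's record. Nothing in the store ties a record to the job that produced it.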

Fix

Built tags are stored under job-scoped names inside the shared namespace:

foo:dev → build.ephemerd.local/<job-id>/foo:dev
  • Invisible to workflows — the job's docker CLI keeps using its own tag; only the storage name carries the scope.
  • docker push applies the same transform on lookup: scoped candidates first, then unscoped fallbacks (covers images staged into the namespace by tests/tooling).
  • Cross-job resolution is impossible by construction: job A's lookup candidates are A-scoped + unscoped; job B's records are B-scoped. No overlap.
  • build.ephemerd.local is a synthetic registry hostname that never resolves — it exists to keep scoped names valid under the Docker reference grammar (BuildKit's exporter validates refs).
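
The transform and the push lookup order described above can be sketched roughly as follows. The function names scopedName and pushCandidates are hypothetical, and the sketch assumes the job ID is simply lowercased and prepended as a path component; the real code in pkg/dind may normalize references differently and may produce more than one scoped or unscoped candidate.

```go
package main

import (
	"fmt"
	"strings"
)

// scopeHost is the synthetic registry hostname named in the PR description.
const scopeHost = "build.ephemerd.local"

// scopedName maps a user tag to a job-scoped storage name.
// Assumption: the job ID is lowercased so it fits the reference grammar,
// and empty inputs pass through unchanged.
func scopedName(jobID, tag string) string {
	if jobID == "" || tag == "" {
		return tag
	}
	return scopeHost + "/" + strings.ToLower(jobID) + "/" + tag
}

// pushCandidates lists the storage names a push would try, in order:
// the job-scoped name first, then the unscoped fallback.
func pushCandidates(jobID, tag string) []string {
	scoped := scopedName(jobID, tag)
	if scoped == tag {
		return []string{tag}
	}
	return []string{scoped, tag}
}

func main() {
	fmt.Println(scopedName("job-a", "foo:dev"))
	fmt.Println(pushCandidates("job-b", "foo:dev"))
}
```

In this sketch, job B's candidate list never contains a name under job A's path prefix, which is the no-overlap argument from the bullet list. The only name both jobs can resolve is the unscoped one, and builds no longer write to it.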

dind topology answers (for the record)

The investigation asked four questions worth answering somewhere durable:

  1. Daemon topology: per-job daemons — each job gets its own dind.Server, socket, and containerd namespace (ephemerd-dind-<job-id>). The backing containerd and the BuildKit worker are VM-global.
  2. Namespacing: containers, execs, and in-memory image lists were already per-job. Built image tags were the gap (fixed here). The per-(provider, repo) image cache namespace is intentionally shared across jobs of the same repo — pulls only, read-through.
  3. dind.allow_privileged: only opens the privilege gate on the per-job daemon; no isolation topology change either way.
  4. Serialization: not needed once tags are scoped.

Tests

  • Scoping table: registry-qualified refs, case-sensitive tags preserved, uppercase job IDs lowercased, underscored job IDs, empty-input passthrough.
  • Reference-grammar validation: every scoped form parses under distribution/reference (BuildKit rejects invalid refs).
  • Export-attr scoping for multi-tag builds (-t a -t b).
  • Push candidate ordering (scoped before unscoped).
  • The race as a property test: same tag + different jobs → distinct storage names.
  • Existing registry e2e exercises the unscoped fallback unchanged.

Test plan

  • CI (lint + unit + e2e) green
  • docker build + docker push to a real registry from a single job still round-trips on a deployed build
  • Two concurrent jobs building the same tag produce independent pushes (synthetic check; no known real workload does this today)

All jobs share one BuildKit worker writing into one containerd
namespace ("buildkit"), keyed by tag. Two concurrent jobs that both
`docker build -t ephpm:dev` raced on the same image record — last
build wins, and the losing job pushed/loaded the other job's binary.
Observed in the wild as an E2E matrix job asserting on PHP 8.4 and
getting the 8.5 build (ephpm/ephpm#67, #68, reproduced on main).

Fix: built tags are now stored under job-scoped names inside the
shared namespace:

  ephpm:dev -> build.ephemerd.local/<job-id>/ephpm:dev

The transform is invisible to workflows — each job's docker CLI keeps
its own tag; only the storage name carries the scope. docker push
applies the same transform on lookup (scoped candidates first, then
unscoped fallbacks for images staged by tests/tooling). Cross-job
resolution is impossible by construction: job A's candidates are
A-scoped + unscoped, which can never match job B's B-scoped records.

The synthetic build.ephemerd.local registry hostname never resolves on
any network — it exists to keep scoped names valid under the Docker
reference grammar (BuildKit's exporter validates refs).

Tests: scoping table (registry-qualified refs, case-sensitive tags,
underscored job IDs), reference-grammar validation of scoped names,
export-attr scoping, push candidate ordering, and the race condition
expressed as a property (same tag + different jobs = distinct names).
The existing registry e2e exercises the unscoped fallback unchanged.
luthermonson changed the title from "fix(dind): scope built image tags per job to stop concurrent-build races" to "fix(dind): scope built image tags per job (latent concurrent-build race)" on Jun 10, 2026
luthermonson merged commit 77991e5 into main on Jun 10, 2026
4 checks passed