5 changes: 4 additions & 1 deletion cmd/ephemerd/runtime_windows.go
@@ -51,7 +51,10 @@ func startContainerRuntime(dataDir string, log *slog.Logger, linuxVMEnabled bool
DiskSizeGB: linuxVMDiskSizeGB,
DindEnabled: dindEnabled,
DindAllowPrivileged: dindAllowPrivileged,
Log: log,
// Host data dir: its config.toml rides into the VM via the boot-initrd
// tail. See docs/arch/host-config-initrd.md.
HostDataDir: dataDir,
Log: log,
})
if err != nil {
log.Warn("Linux VM not started — Linux jobs will not be available on this host", "error", err)
154 changes: 154 additions & 0 deletions docs/arch/host-config-initrd.md
@@ -0,0 +1,154 @@
# Host Config Delivery via Boot-Initrd Tail

> **Status: implemented.** The in-VM ephemerd reads the host's
> `config.toml`, delivered on every VM boot through the same
> runtime-generated initrd tail that carries `ephemerd-linux`. Adding a
> new in-VM-relevant config knob costs zero plumbing: edit the host's
> config.toml, restart ephemerd, the VM reboots and reads the same TOML.

## Context

Until now, every host-side setting that needed to take effect *inside
the Linux VM* required its own ad-hoc plumbing across the VM boundary:

1. A field on `vm.LinuxVMConfig` (`DindEnabled`, `DindAllowPrivileged`).
2. A kernel command-line parameter (`ephemerd.dind=1`,
`ephemerd.dind_allow_privileged=1`) appended by
`pkg/vm/linuxvm_windows.go`.
3. A parser for that parameter in the init script
(`mage/download/download.go`).
4. A CLI flag on `ephemerd serve` (`--dind-allow-privileged`) that
overrides the in-VM config.
5. A re-render of the init script + initrd, and a rebuild of the host
binary.

PRs #87 (metrics — needed `container_stats_interval` over the boundary
for the in-VM sampler) and #88 (dind allow-privileged plumbing) both
had to walk this path. The cost-per-knob is small but real, and the
pattern doesn't scale.

## The mechanism

ephemerd already rebuilds the boot initrd **on every VM start**:
`pkg/vm.buildBootInitrd` appends a small gzipped cpio tail containing
`/assets/ephemerd-linux` to the build-time base initrd. The kernel
concatenates initrd cpio archives, so files in the appended tail
override or add to the base. That is how a fresh `go build` of
ephemerd.exe delivers a new Linux binary into the VM without an
initrd rebuild.

This feature adds one more file to that tail: the host's
`config.toml`, when it exists, lands at `/assets/config.toml`. The
init script stages it to `/etc/ephemerd/config.toml` (mode 0600) and
passes `--config /etc/ephemerd/config.toml` to the in-VM `ephemerd
serve`. The in-VM daemon then reads the same TOML the host reads.

Because the tail is regenerated on every VM boot and a VM boot happens
on every ephemerd service start, "edit config.toml + restart the
service" is the complete update procedure. Same semantics as the host
daemon itself.

## Why not a live file share (Plan9 / virtio-fs)

The first draft of this feature exposed the host data dir to the VM as
a Hyper-V Plan9 share. It failed in two independent ways, the first of
which took down Linux CI on the dev rig for ~100 minutes:

1. **The HCS document was rejected at VM start** (`HcsStartComputeSystem:
HRESULT 0xc0370110`) — the `Plan9` device JSON we constructed did not
match the schema HCS expects at creation time. The VM never booted,
ephemerd logged a single WARN, and every `[self-hosted linux x64]`
job sat queued while the host poll loop skipped them with "OS labels
don't match this platform."
2. **More fundamentally, the guest could never have mounted it.**
Hyper-V serves Plan9 shares over **hvsock**, not virtio — there is no
virtio-9p device on HCS. Mainline `mount -t 9p` supports
`trans=virtio|tcp|fd|...` but has no hvsock transport; LCOW's GCS
daemon makes this work by opening an `AF_VSOCK` socket itself and
passing the fd via `trans=fd`. Replicating that means a vsock dialer
+ mount helper in the guest — real machinery, for a file we read
exactly once at boot.

A live share buys *continuous* visibility of host files. We need a
*boot-time snapshot* of one file. The initrd tail already exists, is
exercised on every boot, has no new kernel or transport surface, and
fails in exactly one obvious way (file missing → defaults).

virtio-fs is the natural choice on Apple Vz for the Darwin equivalent
(Vz exposes virtio-fs directly) — that remains the plan for macOS,
tracked separately.

## Security

- `config.toml` can contain webhook secrets. It is written into the
cpio tail with mode 0600 and staged in the VM at
`/etc/ephemerd/config.toml` with mode 0600, root-owned. Job
containers never see the VM's root filesystem; they get only the
bind mounts the runtime hands them.
- The boot initrd lives at `<data-dir>\vm\linux\initrd` on the host —
the same ACL domain as `config.toml` itself, so embedding the config
does not widen host-side exposure.
- The GitHub App private key is **not** carried into the VM:
`private_key_path` in config.toml names a file outside the data dir,
and only the TOML text crosses the boundary, not referenced files.
The in-VM worker (`--containerd-only`) never constructs a GitHub
client, so the path string sits inert.
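
For readers unfamiliar with cpio mode fields: the value stored in the archive combines the POSIX file-type bits with the permission bits, which is why mode 0600 appears in code as `0o100600`. The sketch below shows the arithmetic; `cpioMode` and `groupOrOtherAccess` are hypothetical helpers, and the constants are the standard POSIX values defined locally so the example also builds on Windows.

```go
package main

import "fmt"

const (
	sIFMT  = 0o170000 // mask for the file-type bits
	sIFREG = 0o100000 // regular file
)

// cpioMode combines the regular-file type bits with permission bits.
func cpioMode(perm uint32) uint32 { return sIFREG | (perm & 0o7777) }

// isRegular reports whether mode describes a regular file.
func isRegular(mode uint32) bool { return mode&sIFMT == sIFREG }

// groupOrOtherAccess reports whether any group or other permission bit is set.
func groupOrOtherAccess(mode uint32) bool { return mode&0o077 != 0 }

func main() {
	cfg := cpioMode(0o600) // config.toml: owner read/write only
	bin := cpioMode(0o755) // ephemerd-linux: world-executable
	fmt.Printf("%o %o\n", cfg, bin) // prints "100600 100755"
	fmt.Println(isRegular(cfg), groupOrOtherAccess(cfg), groupOrOtherAccess(bin))
}
```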

## What the in-VM daemon actually reads

The worker-mode code path dereferences a narrow slice of the config:

- `[dind]` — `enabled`, `allow_privileged`, cache settings.
- `[runtime.rlimits]` — per-container nofile, etc.
- `[log]` — log level/format.

Everything else (`[github]`, `[runner.windows]`, `[metrics]`,
`[vm.linux]`, `[webhook]`, tunnels, repo lists) is parsed into the
in-memory config but never read in worker mode. Worker mode returns
before the metrics server, providers, scheduler, and VM-boot blocks in
`serve()`, so a host config with `[metrics] enabled = true` does NOT
start a second metrics listener inside the VM. Future changes to
worker mode should preserve that invariant — the host scrapes in-VM
container stats via the Dispatch stream (#87) precisely so the VM
needs no listener of its own.

## Fallback

When `config.toml` doesn't exist on the host (fresh install before
first write), `buildBootInitrd` skips the entry and the init script
sees no `/assets/config.toml` — the in-VM daemon runs on its compiled
defaults plus the kernel-cmdline `ephemerd.dind*` flags from #88,
which are retained for exactly this case. Once a config exists, the
two sources coexist: the cmdline flags force the same values they
always did, and `--config` only adds settings the flags don't cover.
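
One way to picture that layering is the sketch below. It is an illustration under stated assumptions, not the daemon's actual resolution code: the `Settings` struct, the `resolve` function, and the idea of a boolean per flag are all hypothetical, and `NoFile` merely stands in for any setting that has no cmdline flag.

```go
package main

import "fmt"

// Settings is a hypothetical, flattened view of what worker mode reads.
type Settings struct {
	DindEnabled         bool
	DindAllowPrivileged bool
	NoFile              uint64 // stands in for a [runtime.rlimits] value
}

// resolve starts from compiled defaults, takes the staged config when one
// was delivered (cfg != nil), then lets the cmdline-derived flags force
// their values on top.
func resolve(defaults Settings, cfg *Settings, flagDind, flagAllowPriv bool) Settings {
	out := defaults
	if cfg != nil {
		out = *cfg
	}
	if flagDind {
		out.DindEnabled = true
	}
	if flagAllowPriv {
		out.DindAllowPrivileged = true
	}
	return out
}

func main() {
	defaults := Settings{NoFile: 1024}

	// Fresh install: no config.toml in the tail, only cmdline flags.
	fmt.Printf("%+v\n", resolve(defaults, nil, true, false))

	// Config present: flags still apply, config supplies the rest.
	cfg := &Settings{DindEnabled: true, NoFile: 65536}
	fmt.Printf("%+v\n", resolve(defaults, cfg, true, false))
}
```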

## Failure modes worth knowing

- **Host config unreadable** (ACL mishap): treated as missing, so the
in-VM daemon runs on defaults + cmdline. The init banner logs an
empty `host_config=` value.
- **Malformed TOML on the host**: the host daemon itself fails to start
first (it parses the same file), so a broken config never reaches a
running VM in practice.
- **Operator edits config.toml while the VM is running**: not picked up
until the next VM boot. Restart the ephemerd service.
- **Secrets rotation**: same story — restart the service; the initrd
tail is regenerated with the new file on every boot.

## Lessons recorded

- **Deploying a draft build to the only Linux CI host turns "VM won't
boot" into "CI is silently down."** The only symptom was a DEBUG-level
skip log. Follow-up worth doing: a WARN (or health-endpoint signal)
when Linux-labeled jobs are queued but the Linux dispatcher is
unavailable.
- **HCS document changes need a boot test before deploy.** `0xc0370110`
arrives at start time, not at document-build time; nothing in `mage
ci` exercises it. A future smoke target that creates + starts a
minimal VM would catch this class.

## File pointers

- Tail construction: `pkg/vm/initrd_windows.go` (`buildBootInitrd`)
- Call site + config path resolution: `pkg/vm/linuxvm_windows.go`
- VM-side staging: init script in `mage/download/download.go`
- Field: `vm.LinuxVMConfig.HostDataDir` in `pkg/vm/vm.go`
21 changes: 20 additions & 1 deletion mage/download/download.go
@@ -1618,8 +1618,27 @@ if [ "$DIND" = "1" ]; then
DIND_FLAG="$DIND_FLAG --dind-allow-privileged"
fi
fi
echo "ephemerd-init: launching ephemerd-linux (dind=$DIND allow_privileged=$DIND_ALLOW_PRIV)"

# Host config rides in via the runtime-generated initrd tail (the same
# mechanism that delivers ephemerd-linux — see pkg/vm.buildBootInitrd).
# When present, copy it into the VM rootfs and point ephemerd at it so
# every host-side setting (dind.*, runtime.rlimits, future knobs) takes
# effect on this VM boot with no per-setting plumbing. When absent
# (fresh install before config.toml exists), the kernel-cmdline flags
# above keep the in-VM daemon working on defaults.
# See docs/arch/host-config-initrd.md.
CONFIG_FLAG=""
if [ -f /assets/config.toml ]; then
mkdir -p /newroot/etc/ephemerd
cp /assets/config.toml /newroot/etc/ephemerd/config.toml
chmod 600 /newroot/etc/ephemerd/config.toml
CONFIG_FLAG="--config /etc/ephemerd/config.toml"
echo "ephemerd-init: host config staged at /etc/ephemerd/config.toml"
fi

echo "ephemerd-init: launching ephemerd-linux (dind=$DIND allow_privileged=$DIND_ALLOW_PRIV host_config=${CONFIG_FLAG:+yes})"
exec switch_root /newroot /usr/local/bin/ephemerd-linux serve \
$CONFIG_FLAG \
--data-dir /var/lib/ephemerd \
--containerd-tcp-port "$CONTAINERD_PORT" \
--containerd-tcp-addr 0.0.0.0 \
37 changes: 29 additions & 8 deletions pkg/vm/initrd_windows.go
@@ -11,14 +11,17 @@ import (
)

// buildBootInitrd produces the initrd the VM actually boots with by appending
// a tiny cpio archive containing /assets/ephemerd-linux to the embedded base
// initrd. The Linux kernel concatenates initrd cpios into a single initramfs,
// so files in the appended cpio override (or add to) those in the base. This
// lets a fresh `go build` of ephemerd.exe deliver a new ephemerd-linux to the
// VM without any initrd rebuild — the build-time initrd contains only the
// boot scaffolding (busybox, modules, init script), and the binary itself
// rides in via the runtime-generated tail.
func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
// a tiny cpio archive containing /assets/ephemerd-linux — and, when
// hostConfigPath is non-empty and readable, /assets/config.toml — to the
// embedded base initrd. The Linux kernel concatenates initrd cpios into a
// single initramfs, so files in the appended cpio override (or add to) those
// in the base. This lets a fresh `go build` of ephemerd.exe deliver a new
// ephemerd-linux to the VM without any initrd rebuild, and lets the host's
// config.toml reach the in-VM daemon on every boot with no per-setting
// plumbing — the build-time initrd contains only the boot scaffolding
// (busybox, modules, init script); the binary and config ride in via the
// runtime-generated tail.
func buildBootInitrd(basePath, ephemerdLinuxPath, hostConfigPath, destPath string) error {
baseData, err := os.ReadFile(basePath)
if err != nil {
return fmt.Errorf("reading base initrd: %w", err)
@@ -27,6 +30,16 @@ func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
if err != nil {
return fmt.Errorf("reading ephemerd-linux: %w", err)
}
// Host config is best-effort: a missing config.toml (fresh install
// before first write, or tests) simply means the in-VM daemon runs on
// defaults + kernel-cmdline flags, same as before this feature.
var cfgData []byte
if hostConfigPath != "" {
cfgData, err = os.ReadFile(hostConfigPath)
if err != nil {
cfgData = nil
}
}

var tail bytes.Buffer
gw := gzip.NewWriter(&tail)
@@ -39,6 +52,14 @@ func buildBootInitrd(basePath, ephemerdLinuxPath, destPath string) error {
if err := writeCPIOEntry(gw, "assets/ephemerd-linux", 0o100755, binData, ""); err != nil {
return fmt.Errorf("cpio: ephemerd-linux: %w", err)
}
if cfgData != nil {
// 0600: config.toml can carry webhook secrets. Inside the VM it's
// only readable by root, and job containers never see the VM
// rootfs, but there's no reason to be sloppy with the mode.
if err := writeCPIOEntry(gw, "assets/config.toml", 0o100600, cfgData, ""); err != nil {
return fmt.Errorf("cpio: config.toml: %w", err)
}
}
if err := writeCPIOEntry(gw, "TRAILER!!!", 0, nil, ""); err != nil {
return fmt.Errorf("cpio: trailer: %w", err)
}
82 changes: 79 additions & 3 deletions pkg/vm/initrd_windows_test.go
@@ -87,7 +87,7 @@ func TestBuildBootInitrd_AppendsEphemerdLinux(t *testing.T) {
}

destPath := filepath.Join(dir, "initrd")
if err := buildBootInitrd(basePath, binPath, destPath); err != nil {
if err := buildBootInitrd(basePath, binPath, "", destPath); err != nil {
t.Fatalf("buildBootInitrd: %v", err)
}

@@ -137,7 +137,7 @@ func TestBuildBootInitrd_MissingBase(t *testing.T) {
if err := os.WriteFile(binPath, []byte("data"), 0o755); err != nil {
t.Fatalf("writing binary: %v", err)
}
err := buildBootInitrd(filepath.Join(dir, "missing-base"), binPath, filepath.Join(dir, "out"))
err := buildBootInitrd(filepath.Join(dir, "missing-base"), binPath, "", filepath.Join(dir, "out"))
if err == nil {
t.Error("expected error for missing base initrd")
}
@@ -149,12 +149,88 @@ func TestBuildBootInitrd_MissingBinary(t *testing.T) {
if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
t.Fatalf("writing base: %v", err)
}
err := buildBootInitrd(basePath, filepath.Join(dir, "missing-bin"), filepath.Join(dir, "out"))
err := buildBootInitrd(basePath, filepath.Join(dir, "missing-bin"), "", filepath.Join(dir, "out"))
if err == nil {
t.Error("expected error for missing ephemerd-linux")
}
}

func TestBuildBootInitrd_AppendsHostConfig(t *testing.T) {
dir := t.TempDir()
basePath := filepath.Join(dir, "initrd-base")
if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
t.Fatalf("writing base: %v", err)
}
binPath := filepath.Join(dir, "ephemerd-linux")
if err := os.WriteFile(binPath, []byte("elf"), 0o755); err != nil {
t.Fatalf("writing binary: %v", err)
}
cfgPath := filepath.Join(dir, "config.toml")
cfgBody := []byte("[dind]\nenabled = true\nallow_privileged = true\n")
if err := os.WriteFile(cfgPath, cfgBody, 0o600); err != nil {
t.Fatalf("writing config: %v", err)
}

destPath := filepath.Join(dir, "initrd")
if err := buildBootInitrd(basePath, binPath, cfgPath, destPath); err != nil {
t.Fatalf("buildBootInitrd: %v", err)
}

got, err := os.ReadFile(destPath)
if err != nil {
t.Fatalf("reading boot initrd: %v", err)
}
baseData, err := os.ReadFile(basePath)
if err != nil {
t.Fatalf("reading base: %v", err)
}
gr, err := gzip.NewReader(bytes.NewReader(got[len(baseData):]))
if err != nil {
t.Fatalf("appended tail is not gzip: %v", err)
}
defer func() { _ = gr.Close() }()
cpio, err := io.ReadAll(gr)
if err != nil {
t.Fatalf("reading appended cpio: %v", err)
}
if !bytes.Contains(cpio, []byte("assets/config.toml")) {
t.Error("appended cpio does not contain assets/config.toml path")
}
if !bytes.Contains(cpio, cfgBody) {
t.Error("appended cpio does not contain config body")
}
}

// TestBuildBootInitrd_MissingHostConfigIsNotFatal asserts the no-config
// path: a fresh install where config.toml doesn't exist yet must still
// produce a bootable initrd (in-VM daemon runs on defaults + cmdline).
func TestBuildBootInitrd_MissingHostConfigIsNotFatal(t *testing.T) {
dir := t.TempDir()
basePath := filepath.Join(dir, "initrd-base")
if err := writeGzippedCPIO(basePath, map[string][]byte{"x": []byte("y")}); err != nil {
t.Fatalf("writing base: %v", err)
}
binPath := filepath.Join(dir, "ephemerd-linux")
if err := os.WriteFile(binPath, []byte("elf"), 0o755); err != nil {
t.Fatalf("writing binary: %v", err)
}

destPath := filepath.Join(dir, "initrd")
if err := buildBootInitrd(basePath, binPath, filepath.Join(dir, "nonexistent-config.toml"), destPath); err != nil {
t.Fatalf("buildBootInitrd should tolerate a missing host config: %v", err)
}
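got, err := os.ReadFile(destPath)
if err != nil {
t.Fatalf("reading boot initrd: %v", err)
}
baseData, err := os.ReadFile(basePath)
if err != nil {
t.Fatalf("reading base: %v", err)
}
// The tail is gzipped, so scanning the raw initrd bytes for the path
// would not reliably match. Decompress the appended tail first.
gr, err := gzip.NewReader(bytes.NewReader(got[len(baseData):]))
if err != nil {
t.Fatalf("appended tail is not gzip: %v", err)
}
defer func() { _ = gr.Close() }()
cpio, err := io.ReadAll(gr)
if err != nil {
t.Fatalf("reading appended cpio: %v", err)
}
if bytes.Contains(cpio, []byte("assets/config.toml")) {
t.Error("initrd tail references assets/config.toml despite missing source")
}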
}

// writeGzippedCPIO is a test helper that emits a tiny valid gzipped newc cpio
// archive containing the given files.
func writeGzippedCPIO(path string, files map[string][]byte) error {