feat(vm): plumb dind.allow_privileged from host config into the VM #88
Merged
The host's `[dind] allow_privileged = true` setting was silently dropped
on the way into the embedded Linux VM. The in-VM ephemerd reads its own
(default) config inside /var/lib/ephemerd and falls back to the Linux
default of false, rejecting `docker run --privileged` siblings even when
the host operator explicitly opted in. ephpm-style workloads that need
KIND (privileged containers) couldn't run.
Plumb the host's `cfg.Dind.ResolvedAllowPrivileged()` through:
1. main.go → startContainerRuntime → LinuxVMConfig.DindAllowPrivileged
2. linuxvm_windows.go appends `ephemerd.dind_allow_privileged=1` to the
kernel cmdline when set
3. The in-initrd init script parses the new param and adds
`--dind-allow-privileged` to the in-VM `ephemerd-linux serve` call
4. A new `--dind-allow-privileged` CLI flag on `ephemerd serve` forces
`cfg.Dind.AllowPrivileged = true`, overriding the in-VM config file
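Steps 2 through 4 can be sketched in Go as follows. This is a minimal illustration, not the project's code: the parameter and flag names come from this PR, but `appendDindParam`, `serveArgs`, `applyServeFlags`, `dindConfig`, and the `console=ttyS0` base cmdline are hypothetical. The real init script is a shell script, and the real `ephemerd serve` may not use the standard `flag` package.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// Kernel parameter name taken from this PR.
const dindAllowPrivilegedParam = "ephemerd.dind_allow_privileged"

// dindConfig is a hypothetical stand-in for the [dind] config section.
type dindConfig struct{ AllowPrivileged bool }

// appendDindParam mirrors step 2: the host appends the parameter only when set.
func appendDindParam(cmdline string, allowPrivileged bool) string {
	if !allowPrivileged {
		return cmdline
	}
	return strings.TrimSpace(cmdline + " " + dindAllowPrivilegedParam + "=1")
}

// serveArgs mirrors step 3: the guest scans the cmdline and builds the
// argument list for the in-VM serve call.
func serveArgs(cmdline string) []string {
	args := []string{"serve"}
	for _, field := range strings.Fields(cmdline) {
		if field == dindAllowPrivilegedParam+"=1" {
			args = append(args, "--dind-allow-privileged")
			break
		}
	}
	return args
}

// applyServeFlags mirrors step 4: the flag can only force the value to true,
// so a true value loaded from the config file is never downgraded.
func applyServeFlags(cfg *dindConfig, args []string) error {
	fs := flag.NewFlagSet("serve", flag.ContinueOnError)
	force := fs.Bool("dind-allow-privileged", false, "force dind.allow_privileged on")
	if err := fs.Parse(args); err != nil {
		return err
	}
	if *force {
		cfg.AllowPrivileged = true
	}
	return nil
}

func main() {
	cmdline := appendDindParam("console=ttyS0", true) // base cmdline is made up
	args := serveArgs(cmdline)
	cfg := dindConfig{} // in-VM default: false
	if err := applyServeFlags(&cfg, args[1:]); err != nil {
		fmt.Println("flag error:", err)
		return
	}
	fmt.Println(cmdline, args, cfg.AllowPrivileged)
}
```

The one-way override is the point of the design: the cmdline carries only the host's opt-in, and an absent parameter leaves whatever the in-VM config resolved to.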
Also fixes a latent bug in mage/download/download.go: Initrdx86's
`outOfDate` input list didn't include download.go itself, so edits to
the embedded init script body were silently skipped by `mage
build:windows` (we burned ~30 minutes on this today). Adding the file as
an input makes init-script edits invalidate the cached initrd correctly.
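The failure mode is easy to reproduce with a timestamp-based staleness check. The sketch below is a hypothetical stand-in for the project's `outOfDate` helper (its real signature is not shown in this PR); the file names and the injectable `modTimeFunc` are assumptions made so the example runs without touching the filesystem.

```go
package main

import (
	"fmt"
	"os"
)

// modTimeFunc reports a path's modification time (Unix nanoseconds) and
// whether the path exists. Injecting it keeps the check easy to demonstrate.
type modTimeFunc func(path string) (unixNano int64, ok bool)

// statModTime is the filesystem-backed implementation.
func statModTime(path string) (int64, bool) {
	info, err := os.Stat(path)
	if err != nil {
		return 0, false
	}
	return info.ModTime().UnixNano(), true
}

// outOfDate reports whether target must be rebuilt: it is missing, or any
// listed input is missing or newer than it. Inputs that are NOT listed are
// invisible to the check, which is exactly the bug described above.
func outOfDate(modTime modTimeFunc, target string, inputs ...string) bool {
	targetTime, ok := modTime(target)
	if !ok {
		return true
	}
	for _, in := range inputs {
		inTime, ok := modTime(in)
		if !ok || inTime > targetTime {
			return true
		}
	}
	return false
}

func main() {
	// Fake timestamps: download.go was edited after the initrd was built.
	times := map[string]int64{"rootfs.tar": 100, "initrd.img": 200, "download.go": 300}
	fake := func(p string) (int64, bool) { t, ok := times[p]; return t, ok }

	// Before the fix: only the rootfs tarball is watched.
	fmt.Println("rebuild without download.go listed:", outOfDate(fake, "initrd.img", "rootfs.tar"))
	// After the fix: the file holding the init script is an input too.
	fmt.Println("rebuild with download.go listed:", outOfDate(fake, "initrd.img", "rootfs.tar", "download.go"))
}
```

Passing `statModTime` instead of the fake gives the same check against real files.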
Verified end-to-end on the live rig: kernel cmdline carries
ephemerd.dind_allow_privileged=1, init banner shows dind_allow_privileged=1,
serve invocation logs `(dind=1 allow_privileged=1)`, and the dind
rejection warnings ("rejecting elevated container request") stop firing.
Note: the Darwin Initrd() function has a similar cache pattern using
fileExists rather than outOfDate — same class of bug, deferred to a
follow-up.
luthermonson added a commit that referenced this pull request on Jun 11, 2026
…kes effect on next boot)
The host's data dir (where config.toml lives) is now exposed read-only to the Linux VM as a Hyper-V Plan9 share named "ephemerd-host-config". The init script mounts it at /mnt/host-config and points the in-VM `ephemerd serve` at the host's config.toml via --config. Adding a new in-VM-relevant setting (dind, runtime.rlimits, future knobs) now costs zero plumbing: write to config.toml on the host, restart ephemerd, the VM reboots and reads the same TOML.
Why Plan9: the kernel surface (CONFIG_9P_FS, CONFIG_NET_9P_VIRTIO) was already compiled into our virt kernel and the modules were already listed in initrdKernelModulesX86. Someone wired the guest side but never the host. This connects the dots.
Security boundary: the share is read-only, so a compromised in-VM ephemerd cannot mutate the host. Job containers never see the share (they get only the runtime's explicit bind mounts).
Fallback: when the share fails to mount (stripped kernel without 9p, share not exported, etc.) the init script logs a warning and falls back to today's behavior. The kernel-cmdline ephemerd.dind* params introduced in #88 are deliberately retained as that fallback path; they're redundant when the share is healthy.
Doc: docs/arch/plan9-config-share.md.
Not in scope: macOS Vz (different mechanism, virtio-fs; symmetric work, separate PR), Linux host (no VM to share with).
luthermonson added a commit that referenced this pull request on Jun 11, 2026
…Plan9)
Reworks the host-config delivery away from the Hyper-V Plan9 share, which failed twice over. First, HCS rejected the Plan9 device JSON at VM start (HcsStartComputeSystem: 0xc0370110), which took down Linux CI on the dev rig until rollback. Second, even with a valid document the guest could never mount it: Hyper-V serves Plan9 over hvsock, not virtio, and mainline mount -t 9p has no hvsock transport (LCOW's GCS does an AF_VSOCK + trans=fd dance in userspace to make it work).
A live share buys continuous file visibility; we need a boot-time snapshot of one file. So: ride config.toml in via the runtime-generated initrd tail, exactly like ephemerd-linux already does. buildBootInitrd appends /assets/config.toml (mode 0600) when the host file exists; the init script stages it to /etc/ephemerd/config.toml and passes --config. The tail regenerates on every VM boot, so "edit config.toml + restart the service" is the complete update procedure: same semantics the Plan9 share would have given, zero new kernel or transport surface. Missing config.toml is non-fatal (fresh installs run on defaults + the ephemerd.dind* cmdline flags from #88, retained for that case).
The arch doc (docs/arch/host-config-initrd.md) keeps a post-mortem of the Plan9 attempt, including two follow-ups: a louder signal when Linux-labeled jobs are queued but the VM failed to boot (the outage's only symptom was a DEBUG skip log), and a smoke test that actually starts a minimal HCS VM (0xc0370110 only appears at start time; nothing in mage ci exercises it).
Verified on the live rig: VM boots, init logs "host config staged at /etc/ephemerd/config.toml", launch banner shows host_config=yes, in-VM worker reads the host's [dind] section.
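For readers unfamiliar with the initrd-tail approach: the Linux kernel unpacks concatenated cpio archives in "newc" format, so a file can be added at boot by appending one more small archive. The sketch below builds such a tail containing only the config file. It is an illustration under assumptions, not the project's `buildBootInitrd`: the function names are hypothetical, the tail is left uncompressed, and the `assets` directory is assumed to be created by an earlier archive segment.

```go
package main

import (
	"bytes"
	"fmt"
)

// pad4 appends NUL bytes until the buffer length is a multiple of 4, as the
// newc format requires after each name and after each file body.
func pad4(buf *bytes.Buffer) {
	for buf.Len()%4 != 0 {
		buf.WriteByte(0)
	}
}

// writeEntry appends one newc member: a 110-byte ASCII header, the
// NUL-terminated name, then the file data.
func writeEntry(buf *bytes.Buffer, name string, mode uint32, data []byte) {
	fields := []uint32{
		0,                     // ino
		mode,                  // file type and permission bits
		0, 0,                  // uid, gid (root)
		1,                     // nlink
		0,                     // mtime
		uint32(len(data)),     // filesize
		0, 0, 0, 0,            // devmajor, devminor, rdevmajor, rdevminor
		uint32(len(name) + 1), // namesize, including the trailing NUL
		0,                     // check (unused by newc)
	}
	buf.WriteString("070701") // newc magic
	for _, f := range fields {
		fmt.Fprintf(buf, "%08X", f)
	}
	buf.WriteString(name)
	buf.WriteByte(0)
	pad4(buf)
	buf.Write(data)
	pad4(buf)
}

// buildConfigTail returns an archive holding config as a regular file with
// mode 0600, followed by the mandatory trailer entry. Names inside the
// archive carry no leading slash.
func buildConfigTail(config []byte) []byte {
	var buf bytes.Buffer
	writeEntry(&buf, "assets/config.toml", 0o100600, config)
	writeEntry(&buf, "TRAILER!!!", 0, nil)
	return buf.Bytes()
}

func main() {
	tail := buildConfigTail([]byte("[dind]\nallow_privileged = true\n"))
	fmt.Printf("tail: %d bytes, magic %s\n", len(tail), tail[:6])
}
```

Because the tail is rebuilt from the host file on each boot, the guest always sees a snapshot of the current config.toml without any shared-filesystem transport.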
luthermonson added a commit that referenced this pull request on Jun 11, 2026
…tail (#89)
* feat(vm): share host data dir into Linux VM via Plan9 (host config takes effect on next boot)
* feat(vm): deliver host config.toml via boot-initrd tail (rework from Plan9)
Summary
The host's `[dind] allow_privileged = true` setting never crossed the VM boundary. The in-VM ephemerd reads its own (default) config file inside `/var/lib/ephemerd` and falls back to the Linux default of `false`, so privileged sibling containers were rejected even when the operator explicitly opted in. ephpm-style workloads that need KIND (privileged) couldn't run regardless of host config.
Fix
Plumb the host's resolved value across the boundary via the kernel cmdline:
1. `cmd/ephemerd/main.go` reads `cfg.Dind.ResolvedAllowPrivileged()` and threads it through `startContainerRuntime` into `vm.LinuxVMConfig.DindAllowPrivileged`.
2. `pkg/vm/linuxvm_windows.go` appends `ephemerd.dind_allow_privileged=1` to the kernel cmdline when set.
3. The in-initrd init script (embedded in `mage/download/download.go`) parses the new param and adds `--dind-allow-privileged` to the in-VM `ephemerd-linux serve` invocation.
4. A new `--dind-allow-privileged` CLI flag on `ephemerd serve` forces `cfg.Dind.AllowPrivileged = true`, overriding the in-VM config file.
Cache-invalidation bug fixed in the same PR
`Initrdx86`'s `outOfDate` input list only watched the rootfs tarball, so edits to the embedded init script body in `download.go` itself were silently skipped by `mage build:windows`, embedding a stale init script in a fresh binary. Adding `mage/download/download.go` as an input fixes this.
(The Darwin `Initrd()` function uses `fileExists` instead of `outOfDate`, so it has the same class of bug in a different form. Out of scope here; flagging for a follow-up.)
Verified
End-to-end on the live rig:
- Kernel cmdline carries `ephemerd.dind_allow_privileged=1` when host config has the flag.
- Init banner shows `ephemerd-init: containerd_port=10000 root_disk=/dev/sda dind=1 dind_allow_privileged=1`.
- Serve invocation logs `launching ephemerd-linux (dind=1 allow_privileged=1)`.
Test plan
- `mage ci` (lint + tests) passes
- Set `[dind] allow_privileged = true` on the host, restart ephemerd
- Run `docker run --privileged` or similar from inside a job
- Confirm it succeeds (no `rejecting elevated container request` warning in `vm/linux/console.log`)
- With `allow_privileged` unset on a fresh host, confirm Linux default of `false` still rejects (regression check)
Future work
This is the second ad-hoc kernel-cmdline plumbing for an in-VM setting (after `dind=1`). At a third we should swap to a real config-share mechanism: Hyper-V's Plan9 share infrastructure already exists in `hcs_windows.go` but isn't wired. Branch `feat/plan9-config-share` exists for that follow-up.