guest/stdio: ConnSlot for container stdio survival across live migration #2721
Open
shreyanshjain7174 wants to merge 1 commit into microsoft:main from
…gration
Wrap every container stdio vsock connection in a ConnSlot so the
underlying conn can be replaced when the bridge reconnects after live
migration.
When the bridge dies, cmd/gcs/main.go calls Host.DisconnectAllStdio,
which Disconnects every tracked slot. The relays (PipeRelay / TtyRelay)
park inside ConnSlot.acquire on a sync.Cond. The producing process keeps
writing into its kernel pipe (~64 KiB buffer) and back-pressures
naturally on its next write syscall when the buffer fills, so no bytes
are lost.
A background runRedial re-dials the same vsock port via a transport-
agnostic redialer callback. On success it calls Set, which broadcasts
the cond and wakes the parked relays against the fresh connection.
runRedial is bounded (60 attempts at 100 ms = ~6 s) so a permanently
broken peer cannot pin a goroutine forever.
Wiring:
- Container holds a slotRegistry interface (one method, narrow seam),
populated with the parent Host. Container.Start, ExecProcess, and
Host.runExternalProcess register their stdio set after stdio.Connect.
- Host.stdioSlots is a slice guarded by containersMutex; closed slots
are compacted on every register call so the slice doesn't grow with
container churn.
- stdio.Connect uses tport.Dial (not DialReconn) so dial errors
propagate to runRedial instead of being silently retried forever.
Coverage:
- 15 ConnSlot unit tests (block/resume on Set, idempotent Disconnect /
Close, Set-after-Close, redial bounded + Disconnect re-triggers,
concurrent Disconnect with Writes under -race, pipe-relay back-
pressure integration).
- 6 Host registry unit tests (tracking, nil/non-slot ignore, nil set,
DisconnectAll closes underlying conns, closed-slot compaction).
- End-to-end PowerShell test on the two-node LM bench: post-migration
azcrictl exec stdout streams a monotonic heartbeat counter to the
host, and azcrictl attach is responsive (was hanging before).
Signed-off-by: Shreyansh Sancheti <[email protected]>
What
ConnSlot is a transport.Connection wrapper that lets the underlying vsock conn be replaced at runtime, so container stdio survives a bridge
disconnect during live migration.
Why
When the bridge died during LM, stdio relays were holding raw vsock fds
that the host had just torn down. The relays errored out, in-flight bytes
were lost, and azcrictl attach hung after migration.
How
- ConnSlot.Read/Write delegate under a mutex; if the slot is empty they park on a sync.Cond. The relay parks before consuming the next byte from the upstream pipe, the kernel pipe (~64 KiB) fills, and the producer process blocks on its next write(2): natural back-pressure with no user-space buffering.
- Disconnect clears the conn and kicks runRedial in a goroutine.
- runRedial re-dials the same vsock port via a transport-agnostic redialer; on success it calls Set, which broadcasts the cond and wakes blocked relays against the new fd.
- runRedial is bounded (60 attempts at 100 ms, about 6 s), so a permanently broken peer can't pin a goroutine.
Wiring
- Container holds a one-method slotRegistry interface (narrow seam, not a *Host back-pointer). Container.Start, ExecProcess, and Host.runExternalProcess register their stdio set after stdio.Connect.
- Host.stdioSlots is a slice guarded by containersMutex. Closed slots are compacted on every register call so the slice is bounded by the live-process count.
- cmd/gcs/main.go calls h.DisconnectAllStdio() once per bridge disconnect cycle, after b.ListenAndServe returns.
Tests
Unit (go test -race -count=1, all 21 pass):
- ConnSlot tests in internal/guest/stdio/connslot_test.go
- Host registry tests in internal/guest/runtime/hcsv2/stdio_slots_test.go
End-to-end on the two-node LM bench:
- azcrictl exec produces a heartbeat counter that increments monotonically and arrives at the host (5+ counters captured on the verified runs).
- azcrictl attach connects, streams bytes, and exits cleanly when killed (was hanging before).