OCPBUGS-77307: Generate KubeVirt nmstate network config conditionally#8365
OCPBUGS-77307: Generate KubeVirt nmstate network config conditionally#8365qinqon wants to merge 3 commits into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
@qinqon: This pull request references Jira Issue OCPBUGS-77307, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Skipping CI for Draft Pull Request. |
📝 WalkthroughWalkthroughThe PR adds platform-specific MachineConfig generation for KubeVirt to the NodePool config assembly. Sequence Diagram(s)sequenceDiagram
participant NodePool
participant ConfigGen as generateMCORawConfig
participant PlatformGen as kubevirtPlatformConfig
participant MachineConfig
participant MCO
NodePool->>ConfigGen: request raw MCO config
ConfigGen->>PlatformGen: getPlatformConfigs(nodePool)
alt Platform is KubeVirt and AttachDefaultNetwork true/nil
PlatformGen->>PlatformGen: build Ignition with nmstate files
PlatformGen->>MachineConfig: serialize MachineConfig (network)
else Platform is KubeVirt and multus primary
PlatformGen->>PlatformGen: build no-op nmstate override
PlatformGen->>MachineConfig: serialize MachineConfig (override)
else Other platform
PlatformGen-->>ConfigGen: return empty
end
PlatformGen-->>ConfigGen: return ConfigMap/MachineConfig YAML
ConfigGen->>MachineConfig: append platform config
ConfigGen-->>NodePool: return combined raw config
NodePool->>MCO: apply MachineConfig YAML
🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #8365 +/- ##
==========================================
+ Coverage 45.84% 45.94% +0.10%
==========================================
Files 440 441 +1
Lines 52824 52968 +144
==========================================
+ Hits 24218 24338 +120
- Misses 26816 26832 +16
- Partials 1790 1798 +8
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
10f4aad to
4cdc740
Compare
There was a problem hiding this comment.
🧹 Nitpick comments (1)
hypershift-operator/controllers/nodepool/kubevirt/network.go (1)
225-230: Inconsistent YAML serialization approach.
GenerateNetworkMachineConfigusesapi.CompatibleYAMLEncode(line 111) while this function usesapi.YamlSerializer.Encodedirectly. This inconsistency could lead to subtle differences in output format and potentially affect hash stability.♻️ Suggested fix to use consistent serialization
- buf := &bytes.Buffer{} - if err := api.YamlSerializer.Encode(mc, buf); err != nil { + encoded, err := api.CompatibleYAMLEncode(mc, api.YamlSerializer) + if err != nil { return "", fmt.Errorf("failed to serialize kubevirt network override machine config: %w", err) } - return buf.String(), nil + return string(encoded), nil🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/nodepool/kubevirt/network.go` around lines 225 - 230, The serialization in this function uses api.YamlSerializer.Encode directly which is inconsistent with GenerateNetworkMachineConfig that uses api.CompatibleYAMLEncode; update this function to call api.CompatibleYAMLEncode when encoding the machine config (mc) into the buffer (buf) so the output format and hash stability match the other code path, and propagate any returned error in the same manner as the existing fmt.Errorf wrapping.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@hypershift-operator/controllers/nodepool/kubevirt/network.go`:
- Around line 225-230: The serialization in this function uses
api.YamlSerializer.Encode directly which is inconsistent with
GenerateNetworkMachineConfig that uses api.CompatibleYAMLEncode; update this
function to call api.CompatibleYAMLEncode when encoding the machine config (mc)
into the buffer (buf) so the output format and hash stability match the other
code path, and propagate any returned error in the same manner as the existing
fmt.Errorf wrapping.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 39e42667-f313-4562-94f9-4339d98c2375
📒 Files selected for processing (3)
hypershift-operator/controllers/nodepool/config.gohypershift-operator/controllers/nodepool/kubevirt/network.gohypershift-operator/controllers/nodepool/kubevirt/network_test.go
There was a problem hiding this comment.
🧹 Nitpick comments (1)
hypershift-operator/controllers/nodepool/kubevirt/network.go (1)
225-228: Inconsistent YAML serialization method.
GenerateNetworkMachineConfig(line 111) usesapi.CompatibleYAMLEncode(mc, api.YamlSerializer)while this function usesapi.YamlSerializer.Encode(mc, buf)directly. Both functions generate the same object type (MachineConfig) and should use consistent serialization to ensure identical YAML formatting behavior.♻️ Proposed fix for consistency
- buf := &bytes.Buffer{} - if err := api.YamlSerializer.Encode(mc, buf); err != nil { + encoded, err := api.CompatibleYAMLEncode(mc, api.YamlSerializer) + if err != nil { return "", fmt.Errorf("failed to serialize kubevirt network override machine config: %w", err) } - return buf.String(), nil + return string(encoded), nilAfter applying this change, the
bytesimport on line 4 can be removed if it's no longer used elsewhere in the file.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/nodepool/kubevirt/network.go` around lines 225 - 228, The YAML serialization in this function is inconsistent with GenerateNetworkMachineConfig: replace the manual bytes.Buffer + api.YamlSerializer.Encode(mc, buf) pattern with the same helper call used elsewhere — api.CompatibleYAMLEncode(mc, api.YamlSerializer) — so mc (the MachineConfig) is encoded with the same formatting behavior; remove the now-unused bytes import if it is no longer referenced after the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@hypershift-operator/controllers/nodepool/kubevirt/network.go`:
- Around line 225-228: The YAML serialization in this function is inconsistent
with GenerateNetworkMachineConfig: replace the manual bytes.Buffer +
api.YamlSerializer.Encode(mc, buf) pattern with the same helper call used
elsewhere — api.CompatibleYAMLEncode(mc, api.YamlSerializer) — so mc (the
MachineConfig) is encoded with the same formatting behavior; remove the
now-unused bytes import if it is no longer referenced after the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 645f2beb-0f31-4789-bd8f-fe3ec5e1aa9e
📒 Files selected for processing (3)
hypershift-operator/controllers/nodepool/config.gohypershift-operator/controllers/nodepool/kubevirt/network.gohypershift-operator/controllers/nodepool/kubevirt/network_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- hypershift-operator/controllers/nodepool/kubevirt/network_test.go
4cdc740 to
efa02a6
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (3)
hypershift-operator/controllers/nodepool/kubevirt/network_test.go (1)
13-26: Decode the generated object structurally instead of scanning YAML lines.This helper is tied to the current YAML/data-URL formatting, so harmless quoting or wrapping changes can fail the tests even when the
MachineConfigis still valid. Parsing the YAML intoMachineConfig, then decodingSpec.Config.Raw, would make these assertions much less brittle.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/nodepool/kubevirt/network_test.go` around lines 13 - 26, The test helper decodeBase64Content is brittle because it scans YAML lines; replace its implementation to parse the YAML into a MachineConfig object and return the config payload from Spec.Config.Raw instead of string-scanning. Specifically, in decodeBase64Content: unmarshal the config YAML into the machineconfigv1.MachineConfig type (or a minimal struct with Spec.Config as a runtime.RawExtension), then return string(mc.Spec.Config.Raw) (or the Raw field) so the test reads the structured Spec.Config.Raw payload; add the necessary imports for the MachineConfig type and YAML unmarshalling.hypershift-operator/controllers/nodepool/kubevirt/network.go (1)
83-116: Extract the shared MachineConfig assembly path.Lines 83-116 and Lines 198-231 duplicate the same ignition serialization,
MachineConfigconstruction, label defaulting, and YAML encoding. Pulling that into one helper will keep the default and override branches from drifting.Also applies to: 198-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/nodepool/kubevirt/network.go` around lines 83 - 116, Duplicate logic that serializes an ignition config, constructs a MachineConfig (including setting Name via kubevirtNetworkMachineConfigName), calls ignition.SetMachineConfigLabels, sets Spec.Config.Raw, APIVersion and Kind, and YAML-encodes it should be extracted into a single helper (e.g., buildKubevirtNetworkMachineConfig or encodeMachineConfigFromIgnition) that accepts the ignition.Config or the serialized bytes and returns the encoded YAML string (or error). Replace the duplicated blocks (the block using serializeIgnitionConfig, mcfgv1.MachineConfig, ignition.SetMachineConfigLabels, and api.CompatibleYAMLEncode) with calls to that helper in both places; ensure the helper preserves setting mc.Spec.Config.Raw = serializedConfig, mc.ObjectMeta.Name = kubevirtNetworkMachineConfigName, mc.APIVersion = mcfgv1.SchemeGroupVersion.String(), mc.Kind = "MachineConfig", and forwards errors from serializeIgnitionConfig and api.CompatibleYAMLEncode.hypershift-operator/controllers/nodepool/config.go (1)
162-167: This also changes the rollout hash for default-network KubeVirt pools.Because Line 121 and Line 129 hash
cg.mcoRawConfig, appending a platformMachineConfighere will force a rollout for every KubeVirt NodePool, not just the multus-primary ones. If that churn is expected, it would be good to call it out in the upgrade plan/release notes; otherwise this needs version gating around the paired MCO change.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hypershift-operator/controllers/nodepool/config.go` around lines 162 - 167, Appending platform-specific MachineConfigs unconditionally causes cg.mcoRawConfig-based rollout hashes (see uses at cg.mcoRawConfig) to change for all KubeVirt NodePools; restrict this so only multus-primary pools cause the append or add version gating around the paired MCO change. Modify the code around cg.getPlatformConfigs() and the call site where configs are appended so you either (a) early-return or skip calling cg.getPlatformConfigs()/appending platformConfigs unless the NodePool is the multus-primary type (check the NodePool spec/labels), or (b) guard the append behind a feature/version flag tied to the MCO rollout change, ensuring cg.mcoRawConfig is not mutated or included in the rollout hash for default-network pools.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@hypershift-operator/controllers/nodepool/kubevirt/network.go`:
- Around line 66-73: Add a nil guard at the start of exported helpers so they
return the neutral result instead of panicking; for example, in
GenerateNetworkMachineConfig check if nodePool == nil and immediately return "",
nil, and apply the same pattern to the other exported helper functions in this
file (the ones around lines 156-164 and 181-189) so they return their respective
neutral values (empty string or false) when nodePool is nil before dereferencing
nodePool.Spec.
---
Nitpick comments:
In `@hypershift-operator/controllers/nodepool/config.go`:
- Around line 162-167: Appending platform-specific MachineConfigs
unconditionally causes cg.mcoRawConfig-based rollout hashes (see uses at
cg.mcoRawConfig) to change for all KubeVirt NodePools; restrict this so only
multus-primary pools cause the append or add version gating around the paired
MCO change. Modify the code around cg.getPlatformConfigs() and the call site
where configs are appended so you either (a) early-return or skip calling
cg.getPlatformConfigs()/appending platformConfigs unless the NodePool is the
multus-primary type (check the NodePool spec/labels), or (b) guard the append
behind a feature/version flag tied to the MCO rollout change, ensuring
cg.mcoRawConfig is not mutated or included in the rollout hash for
default-network pools.
In `@hypershift-operator/controllers/nodepool/kubevirt/network_test.go`:
- Around line 13-26: The test helper decodeBase64Content is brittle because it
scans YAML lines; replace its implementation to parse the YAML into a
MachineConfig object and return the config payload from Spec.Config.Raw instead
of string-scanning. Specifically, in decodeBase64Content: unmarshal the config
YAML into the machineconfigv1.MachineConfig type (or a minimal struct with
Spec.Config as a runtime.RawExtension), then return string(mc.Spec.Config.Raw)
(or the Raw field) so the test reads the structured Spec.Config.Raw payload; add
the necessary imports for the MachineConfig type and YAML unmarshalling.
In `@hypershift-operator/controllers/nodepool/kubevirt/network.go`:
- Around line 83-116: Duplicate logic that serializes an ignition config,
constructs a MachineConfig (including setting Name via
kubevirtNetworkMachineConfigName), calls ignition.SetMachineConfigLabels, sets
Spec.Config.Raw, APIVersion and Kind, and YAML-encodes it should be extracted
into a single helper (e.g., buildKubevirtNetworkMachineConfig or
encodeMachineConfigFromIgnition) that accepts the ignition.Config or the
serialized bytes and returns the encoded YAML string (or error). Replace the
duplicated blocks (the block using serializeIgnitionConfig,
mcfgv1.MachineConfig, ignition.SetMachineConfigLabels, and
api.CompatibleYAMLEncode) with calls to that helper in both places; ensure the
helper preserves setting mc.Spec.Config.Raw = serializedConfig,
mc.ObjectMeta.Name = kubevirtNetworkMachineConfigName, mc.APIVersion =
mcfgv1.SchemeGroupVersion.String(), mc.Kind = "MachineConfig", and forwards
errors from serializeIgnitionConfig and api.CompatibleYAMLEncode.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 79da125b-65ec-4797-9598-6989684bd0d1
📒 Files selected for processing (3)
hypershift-operator/controllers/nodepool/config.gohypershift-operator/controllers/nodepool/kubevirt/network.gohypershift-operator/controllers/nodepool/kubevirt/network_test.go
f46757b to
f3f31bf
Compare
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: qinqon The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d40cfff to
aa28671
Compare
b82d83d to
e103f0e
Compare
f1181d2 to
2d2cfd7
Compare
7b1c179 to
aebf849
Compare
aebf849 to
1947eb9
Compare
1947eb9 to
65c4090
Compare
…onally When a HyperShift KubeVirt NodePool uses the default pod network (AttachDefaultNetwork=true, the default), generate a MachineConfig with nmstate files that disable IPv6 autoconf and set the fe80::1 ARP proxy gateway route. This is required because OVN-Kubernetes assigns IPv6 via DHCPv6 stateful, not SLAAC. When the NodePool uses multus as the primary network (AttachDefaultNetwork=false), generate an override MachineConfig that replaces the MCO-rendered nmstate files with no-op content, allowing standard network auto-configuration (SLAAC) to work. This fixes dual-stack HCP on KubeVirt clusters using multus where nodes were configured with ipv6.method=dhcp instead of ipv6.method=auto, causing SLAAC to fail and nodes not getting IPv6 addresses. Signed-off-by: Enrique Llorente <[email protected]> Co-Authored-By: Claude Opus 4 (claude-opus-4-6) <[email protected]>
…net tests Extend KubeVirtAdvancedMultinetTest and KubeVirtMultinetTest to verify that nmstate network configuration is conditionally applied based on the AttachDefaultNetwork setting. When AttachDefaultNetwork=false (multus primary network), a privileged DaemonSet checks via nmstatectl that autoconf: false is NOT present, confirming the pod-network-specific nmstate config is not applied. When the default network is attached (normal cluster), the DaemonSet verifies that autoconf: false IS present, confirming the nmstate network configuration is correctly applied. Both tests reuse existing e2e infrastructure: CorrelateDaemonSet for node targeting and eventuallyDaemonSetRollsOut for readiness waiting. Commit-Message-Assisted-by: Claude (via Claude Code) Signed-off-by: Enrique Llorente <[email protected]> Co-Authored-By: Claude Opus 4 (claude-opus-4-6) <[email protected]>
…tack On KubeVirt, CNO requires worker nodes to probe the network MTU before deploying its operands (ovnkube-control-plane, network-node-identity, multus-admission-controller). Without at least one worker node, these deployments are never created, causing the CNO RolloutComplete condition to stay False and controlPlaneVersion to remain Partial indefinitely. This is the same issue OpenStack already works around by setting NodePoolReplicas=1. Apply the same workaround for KubeVirt. Co-Authored-By: Claude Opus 4 (claude-opus-4-6) <[email protected]> Signed-off-by: Enrique Llorente <[email protected]>
65c4090 to
2885536
Compare
|
/test e2e-kubevirt-aws-ovn |
|
/retest |
|
/test e2e-kubevirt-aws-ovn |
|
@qinqon: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
|
The PR's changes to Now I have enough evidence to produce the final report: Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryTwo pre-existing KubeVirt platform flaky tests failed due to infrastructure resource constraints — neither failure is related to the PR #8365 changes. The PR modifies KubeVirt nmstate network config generation ( Root CauseBoth failures stem from KubeVirt VM scheduling resource exhaustion on the management cluster, not from any code change in PR #8365: Failure 1 — TestNodePoolReplaceUpgrade: During a replace upgrade, the test creates a new NodePool ( Failure 2 — TestAdditionalTrustBundlePropagation: After updating the hosted cluster with an additional trust bundle, the NodePool entered Why these are unrelated to PR #8365:
Recommendations
Evidence
|
What this PR does / why we need it:
When a HyperShift KubeVirt NodePool uses the default pod network (
AttachDefaultNetwork=true, the default), the MCO templates unconditionally generate nmstate configuration files that disable IPv6 autoconf and set up thefe80::1ARP proxy gateway route. This is correct for the default pod network where OVN-Kubernetes assigns IPv6 via DHCPv6 stateful.However, when the NodePool uses multus as the primary network (
AttachDefaultNetwork=false), these configurations break SLAAC and prevent nodes from getting IPv6 addresses in dual-stack setups.This PR moves KubeVirt nmstate network configuration ownership from the MCO templates into the HyperShift nodepool controller, making it conditional:
01-kubevirt-network) with the nmstate files that disable IPv6 autoconf and configure the ARP proxy gateway (same behavior as current MCO templates).The override approach ensures the fix works immediately even before the corresponding MCO cleanup PR (which removes the unconditional templates) is merged.
Depends-On:
Which issue(s) this PR fixes:
Fixes https://issues.redhat.com/browse/OCPBUGS-77307
Special notes for your reviewer:
This PR is paired with an MCO PR that removes the unconditional kubevirt nmstate templates: the MCO PR depends on this one being merged first. Once both land:
GenerateNetworkOverrideMachineConfigfunction can be removed in a follow-upChecklist: