OCPBUGS-88531: Restart cloud-network-config-controller on restart-date annotation#8738
OCPBUGS-88531: Restart cloud-network-config-controller on restart-date annotation#8738bryan-cox wants to merge 2 commits into
Conversation
📝 WalkthroughWalkthroughThe change extends the restart annotation reconciliation in 🚥 Pre-merge checks | ✅ 10 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@bryan-cox: This pull request references Jira Issue OCPBUGS-88531, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #8738 +/- ##
=======================================
Coverage 41.66% 41.67%
=======================================
Files 758 758
Lines 93929 93939 +10
=======================================
+ Hits 39135 39147 +12
+ Misses 52046 52043 -3
- Partials 2748 2749 +1
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
…otation cloud-network-config-controller is a CNO operand deployed on cloud platforms (AWS/Azure/GCP/OpenStack) that uses rotatable cloud API credentials and a kubeconfig. It was omitted from the restart-date annotation handling in cleanupClusterNetworkOperatorResources, leaving it running with stale credentials after rotation. This follows the same pattern as the ovnkube-control-plane fix in 9e1e73e. SetRestartAnnotationAndPatch returns nil for not-found deployments, so no platform-specific conditional is needed. Also fixes a pre-existing value-copy bug in SetRestartAnnotationAndPatch where ObjectMeta was copied by value, causing the annotation assignment to be lost when pod template annotations were nil. Co-Authored-By: Claude Opus 4.6 <[email protected]>
Add cloud-network-config-controller, multus-admission-controller, network-node-identity, and ovnkube-control-plane to the documented list of components restarted by the restart-date annotation. These were already restarted but missing from the documentation. Co-Authored-By: Claude Opus 4.6 <[email protected]>
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: bryan-cox The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go (1)
42-53: ⚡ Quick winAssert existing annotations are preserved, not just the restart key.
The “existing annotations” case currently only checks the new restart annotation at Line 97. Add an assertion that
existing-keyis still present to protect against map replacement regressions.Suggested assertion addition
g.Expect(err).ToNot(HaveOccurred()) g.Expect(updated.Spec.Template.ObjectMeta.Annotations).To(HaveKeyWithValue(hyperv1.RestartDateAnnotation, tt.restartAnnotation)) + if tt.name == "When deployment exists with existing annotations it should set the restart annotation" { + g.Expect(updated.Spec.Template.ObjectMeta.Annotations).To(HaveKeyWithValue("existing-key", "existing-value")) + } }As per coding guidelines, “Unit test any code changes and additions, and include e2e tests when changes impact consumer behavior.”
Also applies to: 97-97
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go` around lines 42 - 53, The test case "When deployment exists with existing annotations it should set the restart annotation" only verifies that the restart annotation is set (at line 97), but does not verify that the pre-existing annotations are preserved. Add an assertion after the restart annotation check to confirm that the existing-key annotation with value existing-value is still present in the deployment's template annotations. This protects against regressions where the annotation map might be replaced instead of updated.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go`:
- Line 21: The error returned by appsv1.AddToScheme(scheme) on line 21 is being
discarded with a blank identifier, which can mask failures during test setup.
Instead of discarding the error, check if AddToScheme returns a non-nil error
and fail the test immediately using the appropriate test failure method (such as
t.Fatalf or Expect/Require from your testing library) to provide clear
diagnostics if scheme registration fails during test initialization.
---
Nitpick comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go`:
- Around line 42-53: The test case "When deployment exists with existing
annotations it should set the restart annotation" only verifies that the restart
annotation is set (at line 97), but does not verify that the pre-existing
annotations are preserved. Add an assertion after the restart annotation check
to confirm that the existing-key annotation with value existing-value is still
present in the deployment's template annotations. This protects against
regressions where the annotation map might be replaced instead of updated.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 6cfedae9-454e-40f0-b68d-10847e8bc145
⛔ Files ignored due to path filters (1)
docs/content/reference/aggregated-docs.mdis excluded by!docs/content/reference/aggregated-docs.md
📒 Files selected for processing (5)
control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.gocontrol-plane-operator/controllers/hostedcontrolplane/manifests/cno.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/cno/component.gocontrol-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.godocs/content/how-to/restart-control-plane-components.md
✅ Files skipped from review due to trivial changes (1)
- docs/content/how-to/restart-control-plane-components.md
🚧 Files skipped from review as they are similar to previous changes (3)
- control-plane-operator/controllers/hostedcontrolplane/v2/cno/component.go
- control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
- control-plane-operator/controllers/hostedcontrolplane/manifests/cno.go
|
|
||
| func TestSetRestartAnnotationAndPatch(t *testing.T) { | ||
| scheme := runtime.NewScheme() | ||
| _ = appsv1.AddToScheme(scheme) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# Find and examine the target test file
fd -a 'component_test.go' control-plane-operator/controllers/hostedcontrolplane/v2/cno/Repository: openshift/hypershift
Length of output: 162
🏁 Script executed:
# Read the specific file to see the context around line 21
wc -l control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go && \
head -30 control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.goRepository: openshift/hypershift
Length of output: 816
🏁 Script executed:
# Check if there are other AddToScheme calls in similar test files and how they're handled
rg -A 2 -B 2 "AddToScheme" control-plane-operator/controllers/hostedcontrolplane/v2/ --type goRepository: openshift/hypershift
Length of output: 17118
Handle AddToScheme error in test setup instead of discarding it.
Line 21 discards the error return from appsv1.AddToScheme(scheme). If scheme registration fails, subsequent test assertions may fail for the wrong reason, making debugging harder.
Suggested fix
scheme := runtime.NewScheme()
- _ = appsv1.AddToScheme(scheme)
+ if err := appsv1.AddToScheme(scheme); err != nil {
+ t.Fatalf("failed to add apps/v1 to scheme: %v", err)
+ }📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| _ = appsv1.AddToScheme(scheme) | |
| scheme := runtime.NewScheme() | |
| if err := appsv1.AddToScheme(scheme); err != nil { | |
| t.Fatalf("failed to add apps/v1 to scheme: %v", err) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/cno/component_test.go`
at line 21, The error returned by appsv1.AddToScheme(scheme) on line 21 is being
discarded with a blank identifier, which can mask failures during test setup.
Instead of discarding the error, check if AddToScheme returns a non-nil error
and fail the test immediately using the appropriate test failure method (such as
t.Fatalf or Expect/Require from your testing library) to provide clear
diagnostics if scheme registration fails during test initialization.
Source: Coding guidelines
|
Closing in favor of the CNO fix: openshift/cluster-network-operator#3030 CNO is the correct owner for restarting its operands when the restart-date annotation changes. |
|
@bryan-cox: This pull request references Jira Issue OCPBUGS-88531. The bug has been updated to no longer refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Just a note. I'm adding this test and the fix that is very similar to what was in this PR: #8733 |
|
I think some parts of this PR would still be useful:
|
What this PR does / why we need it:
Note: The primary fix for this bug is in CNO: openshift/cluster-network-operator#3030
This PR is a companion fix in hypershift that:
Fixes a pre-existing value-copy bug in
SetRestartAnnotationAndPatch—podMeta := patch.Spec.Template.ObjectMetacreates a value copy, so whenpodMeta.Annotationsis nil, the new map is lost. Fixed by working directly onpatch.Spec.Template.ObjectMeta.Annotations.Adds cloud-network-config-controller restart as a stopgap in CPO's
cleanupClusterNetworkOperatorResources— this follows the existing pattern for multus-admission-controller, network-node-identity, and ovnkube-control-plane. Once the CNO fix lands, all CPO restart calls for CNO operands can be removed (tracked by the existing TODO comment at line 2026).Documents the four CNO-managed components that were missing from the restart documentation.
Which issue(s) this PR fixes:
Fixes https://issues.redhat.com/browse/OCPBUGS-88531
Special notes for your reviewer:
The architecturally correct fix is in CNO (PR #3030), where CNO reads the
restart-dateannotation from the HCP CR and injects it into all rendered operand pod templates. This hypershift PR provides the stopgap and bug fix.The
cleanupClusterNetworkOperatorResourcesfunction in CPO has a long-standing TODO asking "why is this not done in CNO?" — the CNO PR addresses that for the restart-date case.Checklist:
🤖 Generated with Claude Code via
/jira:solve OCPBUGS-88531