OCPBUGS-88325: fix(cpo) delete terminated MCD pods to retry in-place upgrades #8729

Open
PoornimaSingour wants to merge 2 commits into openshift:release-4.22 from PoornimaSingour:OCPBUGS-88325

@PoornimaSingour

Contributor

Backport of #8434 to release-4.22.

Cherry-pick of 3e8737e and 9092cca.

PoornimaSingour and others added 2 commits June 12, 2026 11:33
…grades

When an in-place MCD upgrade pod terminates (Failed/Succeeded) but the
node still needs an upgrade, the controller now deletes the terminated
pod so a fresh one can be recreated on the next reconcile loop. A
periodic requeue (upgradeRequeueInterval = 30s) ensures the controller
re-evaluates nodes that still need upgrades rather than waiting for an
external event.

Additionally:
- Extract deleteUpgradePodIfExists helper to reduce duplication across
  reconcileUpgradePods and deleteUpgradeManifests
- Add test coverage for PodPending phase, multi-node mixed states,
  NotFound on Delete, RequeueAfter assertion, and Delete failure
  scenarios

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
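
The retry behaviour described in this commit message can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the code in this PR: `nodeState`, `plan`, and `planUpgradePods` are hypothetical names, and the real controller works against Kubernetes API objects rather than these simplified structs. Only the pod phases and the 30s `upgradeRequeueInterval` come from the commit message.

```go
package main

import (
	"fmt"
	"time"
)

// PodPhase mirrors the Kubernetes pod phase strings named in the commit message.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
)

// upgradeRequeueInterval is the 30s value quoted in the commit message.
const upgradeRequeueInterval = 30 * time.Second

// nodeState is a hypothetical summary of one node and its upgrade pod.
type nodeState struct {
	Name         string
	NeedsUpgrade bool
	PodExists    bool
	PodPhase     PodPhase
}

// plan is a hypothetical result type: what to delete, what to create,
// and how long to wait before reconciling again (zero means no requeue).
type plan struct {
	DeletePods   []string
	CreatePods   []string
	RequeueAfter time.Duration
}

// planUpgradePods decides, per node, whether the upgrade pod should be
// created, deleted so it can be recreated later, or left alone.
// Cleanup of pods on nodes that no longer need an upgrade is left out
// of this sketch.
func planUpgradePods(nodes []nodeState) plan {
	var p plan
	for _, n := range nodes {
		if !n.NeedsUpgrade {
			continue
		}
		// At least one node still needs work, so re-evaluate periodically
		// instead of waiting for an external event.
		p.RequeueAfter = upgradeRequeueInterval
		switch {
		case !n.PodExists:
			p.CreatePods = append(p.CreatePods, n.Name)
		case n.PodPhase == PodFailed || n.PodPhase == PodSucceeded:
			// Terminated pod on a node that still needs the upgrade:
			// delete it so a fresh pod is created on a later reconcile.
			p.DeletePods = append(p.DeletePods, n.Name)
		}
		// Pending or Running pods are left untouched.
	}
	return p
}

func main() {
	p := planUpgradePods([]nodeState{
		{Name: "node-a", NeedsUpgrade: true, PodExists: true, PodPhase: PodFailed},
		{Name: "node-b", NeedsUpgrade: true, PodExists: true, PodPhase: PodPending},
		{Name: "node-c", NeedsUpgrade: true},
		{Name: "node-d", NeedsUpgrade: false},
	})
	fmt.Printf("delete=%v create=%v requeue=%s\n", p.DeletePods, p.CreatePods, p.RequeueAfter)
}
```

In this sketch the terminated pod is only deleted, not recreated in the same pass, which matches the "recreated on the next reconcile loop" wording above.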
Replace the local deleteUpgradePodIfExists helper with the shared
k8sutil.DeleteIfNeeded utility to reduce duplication and improve
consistency across the codebase.

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
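
For readers unfamiliar with the pattern, a delete-if-needed helper usually treats "already gone" as success so that callers can invoke it on every reconcile. The sketch below illustrates that idea with the standard library only. It does not reproduce the signature or behaviour of `k8sutil.DeleteIfNeeded`, which is not shown in this PR page: `deleter`, `deleteIfPresent`, `fakeStore`, and `errNotFound` are hypothetical names standing in for a Kubernetes client and its NotFound error.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for an API server NotFound error.
var errNotFound = errors.New("not found")

// deleter is a hypothetical minimal client interface.
type deleter interface {
	Delete(name string) error
}

// deleteIfPresent deletes the named object and tolerates it already being
// absent. It reports whether a deletion actually happened.
func deleteIfPresent(d deleter, name string) (bool, error) {
	err := d.Delete(name)
	switch {
	case err == nil:
		return true, nil
	case errors.Is(err, errNotFound):
		// Someone else removed it first; nothing left to do.
		return false, nil
	default:
		return false, fmt.Errorf("failed to delete %q: %w", name, err)
	}
}

// fakeStore is an in-memory deleter used only for this example.
type fakeStore map[string]bool

func (s fakeStore) Delete(name string) error {
	if !s[name] {
		return errNotFound
	}
	delete(s, name)
	return nil
}

func main() {
	store := fakeStore{"upgrade-pod": true}
	// The second call finds nothing to delete and still succeeds.
	for i := 0; i < 2; i++ {
		deleted, err := deleteIfPresent(store, "upgrade-pod")
		fmt.Println(deleted, err)
	}
}
```

Keeping the NotFound handling in one shared helper is what lets call sites such as the two functions named in the first commit avoid repeating the same error check.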
@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will use /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot added the jira/severity-moderate, jira/valid-reference, and jira/invalid-bug labels Jun 12, 2026
@openshift-ci-robot


@PoornimaSingour: This pull request references Jira Issue OCPBUGS-88325, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-84308 to target a version in 5.0.0, but it targets "5.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai

coderabbitai Bot commented Jun 12, 2026

Contributor

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: e79bece9-2650-4047-b73d-dc92c9885720

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: PoornimaSingour
Once this PR has been reviewed and has the lgtm label, please assign jparrill for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci Bot added the area/control-plane-operator label and removed the do-not-merge/needs-area label Jun 12, 2026
@openshift-ci

openshift-ci Bot commented Jun 12, 2026

Contributor

@PoornimaSingour: The following tests failed. Say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/images | 0f1134e | link | true | /test images |
| ci/prow/verify-deps | 0f1134e | link | true | /test verify-deps |
| ci/prow/security | 0f1134e | link | true | /test security |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@PoornimaSingour

Contributor Author

/jira refresh

@openshift-ci-robot


@PoornimaSingour: This pull request references Jira Issue OCPBUGS-88325, which is invalid:

  • release note text must be set and not match the template OR release note type must be set to "Release Note Not Required". For more information you can reference the OpenShift Bug Process.
  • expected dependent Jira Issue OCPBUGS-84308 to target a version in 5.0.0, but it targets "5.0" instead

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

Labels

  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
