OCPBUGS-86798: spot MHC not created when autoRepair=true and ignition endpoint not reached #8645

Open
dpateriya wants to merge 1 commit into openshift:main from dpateriya:fix/spot-mhc-autorepair-deadlock
@dpateriya

@dpateriya dpateriya commented May 31, 2026

Contributor

Summary

  • Fixes a deadlock where the spot-specific MachineHealthCheck is never created when autoRepair: true and ReachedIgnitionEndpoint is False
  • Moves the spot MHC reconciliation before the autoRepair/ignition gate so it is always created when spot instances are enabled
  • Adds dedicated test covering all combinations of spot/autoRepair/ignition state

Problem

The spot MHC block was placed after the autoRepair gate in CAPI.Reconcile(). When autoRepair: true and the ignition endpoint hasn't been reached yet, the function returns early via return nil at line 158, preventing the spot MHC from being created.

This creates a permanent deadlock when a spot instance fails to provision:

  1. Spot instance fails → Machine stuck in Pending with InstanceProvisionFailed
  2. No node joins → ReachedIgnitionEndpoint stays False forever
  3. autoRepair: true + ignition not reached → return nil → spot MHC never created
  4. Without spot MHC → no NodeStartupTimeout → no remediation → Machine stuck forever

Fix

Move the spot MHC reconciliation block before the autoRepair gate. The spot MHC is independent of autoRepair — it serves as a safety net for spot instance failures using maxUnhealthy: 100% and a 20-minute NodeStartupTimeout.
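
The ordering change can be sketched with a small model. This is an illustration only, assuming the gate behaves as described above; `nodePoolState`, `mhcResult`, `reconcileBefore`, and `reconcileAfter` are hypothetical names and do not appear in capi.go.

```go
package main

import "fmt"

// nodePoolState is a hypothetical, simplified stand-in for the inputs that
// CAPI.Reconcile() consults. It is not the real HyperShift type.
type nodePoolState struct {
	SpotEnabled     bool
	AutoRepair      bool
	IgnitionReached bool
}

// mhcResult records which MachineHealthChecks one reconcile pass leaves in place.
type mhcResult struct {
	SpotMHC    bool
	RegularMHC bool
}

// reconcileBefore models the old ordering: the gate returns early, so the
// spot MHC step that follows it is never reached.
func reconcileBefore(s nodePoolState) mhcResult {
	var r mhcResult
	if s.AutoRepair {
		if !s.IgnitionReached {
			return r // early return: spot MHC step below is skipped
		}
		r.RegularMHC = true
	}
	r.SpotMHC = s.SpotEnabled
	return r
}

// reconcileAfter models the new ordering: the spot MHC step runs first, so
// the gate can only affect the regular MHC.
func reconcileAfter(s nodePoolState) mhcResult {
	var r mhcResult
	r.SpotMHC = s.SpotEnabled
	if s.AutoRepair {
		if !s.IgnitionReached {
			return r
		}
		r.RegularMHC = true
	}
	return r
}

func main() {
	// The deadlock scenario: spot enabled, autoRepair on, ignition never reached.
	stuck := nodePoolState{SpotEnabled: true, AutoRepair: true, IgnitionReached: false}
	fmt.Printf("before: %+v\n", reconcileBefore(stuck))
	fmt.Printf("after:  %+v\n", reconcileAfter(stuck))
}
```

In this model the stuck state yields no spot MHC under the old ordering and a spot MHC under the new one, while the regular MHC stays gated in both.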

Reproduction

apiVersion: hypershift.openshift.io/v1beta1
kind: NodePool
spec:
  management:
    autoRepair: true
  platform:
    aws:
      placement:
        marketType: Spot

With spot capacity unavailable, the Machine stays Pending indefinitely and no <nodepool>-spot MHC is created.

Test plan

  • New test TestSpotMHCCreatedIndependentlyOfAutoRepairIgnitionGate with 4 cases:
    • spot + autoRepair=true + ignition NOT reached → spot MHC exists, regular MHC does not
    • spot + autoRepair=true + ignition reached → both MHCs exist
    • spot + autoRepair=false → spot MHC exists, regular MHC does not
    • no spot + autoRepair=true + ignition NOT reached → no MHCs
  • All existing TestCAPIReconcile tests pass (no regression)
  • All TestReconcileMachineHealthCheck tests pass
  • Full nodepool package tests pass
  • go vet clean

Summary by CodeRabbit

  • Bug Fixes

    • Spot instance health checks are now reconciled independently of auto-repair and ignition-gate state, ensuring spot-enabled node pools consistently have their own MHC created or removed as appropriate.
  • Tests

    • Added table-driven tests covering spot MHC creation/removal and regular MHC behavior across combinations of spot, auto-repair, and ignition-endpoint conditions.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@coderabbitai

coderabbitai Bot commented May 31, 2026

Contributor

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: fc647cd4-e064-4801-bcc6-ffc4f7d41380

📥 Commits

Reviewing files that changed from the base of the PR and between a0f1b6d and 1a9eec3.

📒 Files selected for processing (2)
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go

📝 Walkthrough

Walkthrough

This PR decouples spot instance health checks from the auto-repair gating logic. The Reconcile method now creates or updates a spot-specific MachineHealthCheck early based solely on whether spot instances are enabled, and deletes it if spot is disabled. The previous conditional block tied to auto-repair and ignition-endpoint status was removed. A new table-driven test verifies spot MHC behavior across combinations of spot enablement, auto-repair setting, and ignition endpoint readiness.

Suggested reviewers

  • muraee
  • sdminonne
🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

Check name | Status | Explanation | Resolution
Test Structure And Quality | ⚠️ Warning | Test lacks assertion messages on 6 assertions (lines 3383, 3388, 3389, 3419, 3423, 3424) and has no cleanup code. Most assertions are missing meaningful failure diagnostic messages per requirement #4. | Add diagnostic messages to all assertions; add t.Cleanup() or proper cleanup for fake client resources per requirements #2 and #4.
✅ Passed checks (10 passed)
Check name | Status | Explanation
Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled.
Title check | ✅ Passed | The title accurately describes the main fix: moving spot MHC reconciliation before the autoRepair/ignition gate to resolve the deadlock condition.
Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names | ✅ Passed | The test uses standard Go table-driven tests (not Ginkgo). All test names are static, descriptive, and contain no dynamic information like pod names, timestamps, UUIDs, or node names.
Topology-Aware Scheduling Compatibility | ✅ Passed | PR reconciles Cluster API MachineHealthCheck resources (infrastructure Machines, not Pods). No pod-level scheduling constraints introduced; changes are topology-agnostic.
Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | The new test is a standard Go unit test using testing.T, not a Ginkgo e2e test, making this check inapplicable to the PR.
No-Weak-Crypto | ✅ Passed | No weak cryptography, custom crypto implementations, or insecure secret comparisons found in the PR's code changes to spot MachineHealthCheck reconciliation logic.
Container-Privileges | ✅ Passed | PR modifies only Go source code files (capi.go and capi_test.go) with no container/K8s manifests, privileged settings, or security context changes. Check is not applicable.
No-Sensitive-Data-In-Logs | ✅ Passed | Spot MHC logging adds only enum results and K8s object keys (namespace/name), consistent with existing patterns; no passwords, tokens, PII, or customer data exposed.

@openshift-ci openshift-ci Bot added area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels May 31, 2026
@openshift-ci

openshift-ci Bot commented May 31, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: dpateriya
Once this PR has been reviewed and has the lgtm label, please assign cblecker for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot requested review from csrwng and jparrill May 31, 2026 18:46
@dpateriya dpateriya changed the title fix: spot MHC not created when autoRepair=true and ignition endpoint not reached OCPBUGS-86798: spot MHC not created when autoRepair=true and ignition endpoint not reached May 31, 2026
@openshift-ci-robot openshift-ci-robot added jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 31, 2026
@openshift-ci-robot


@dpateriya: This pull request references Jira Issue OCPBUGS-86798, which is invalid:

  • expected the bug to target the "5.0.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@dpateriya dpateriya force-pushed the fix/spot-mhc-autorepair-deadlock branch from be91e33 to a0f1b6d Compare May 31, 2026 18:48
@dpateriya

Contributor Author

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 31, 2026
@openshift-ci-robot


@dpateriya: This pull request references Jira Issue OCPBUGS-86798, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state New, which is one of the valid states (NEW, ASSIGNED, POST)
@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/nodepool/capi_test.go`:
- Around line 3301-3334: The test case name strings in the table of cases (the
struct entries that include fields like name, autoRepair, ignitionReached,
spotEnabled, expectSpotMHC, expectRegularMHC, expectAutoRepairStatus) do not
follow the repo policy; update each name to the required format starting with
"When ..." and then "it should ..." (e.g. "When spot enabled and autoRepair true
and ignition NOT reached it should create spot MHC and not create regular MHC"),
preserving the rest of the case fields and semantics; change only the name
values in the test cases inside capi_test.go so they conform to the "When ... it
should ..." pattern.
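
A minimal sketch of what the renamed table could look like, using the four cases from the test plan. The field names come from the comment above; the exact name wording, the `spotMHCCase` type, and the `followsNamingPolicy` helper are assumptions for illustration, and the real table in capi_test.go may carry more fields (for example expectAutoRepairStatus).

```go
package main

import (
	"fmt"
	"strings"
)

// spotMHCCase mirrors a subset of the fields named in the review comment.
// It is a hypothetical type, not the struct used in capi_test.go.
type spotMHCCase struct {
	name             string
	spotEnabled      bool
	autoRepair       bool
	ignitionReached  bool
	expectSpotMHC    bool
	expectRegularMHC bool
}

// cases restates the four test-plan scenarios with "When ... it should ..." names.
var cases = []spotMHCCase{
	{
		name:        "When spot is enabled and autoRepair is true and ignition is NOT reached it should create the spot MHC and not the regular MHC",
		spotEnabled: true, autoRepair: true, ignitionReached: false,
		expectSpotMHC: true, expectRegularMHC: false,
	},
	{
		name:        "When spot is enabled and autoRepair is true and ignition is reached it should create both MHCs",
		spotEnabled: true, autoRepair: true, ignitionReached: true,
		expectSpotMHC: true, expectRegularMHC: true,
	},
	{
		name:        "When spot is enabled and autoRepair is false it should create the spot MHC and not the regular MHC",
		spotEnabled: true, autoRepair: false, ignitionReached: false,
		expectSpotMHC: true, expectRegularMHC: false,
	},
	{
		name:        "When spot is disabled and autoRepair is true and ignition is NOT reached it should create no MHCs",
		spotEnabled: false, autoRepair: true, ignitionReached: false,
		expectSpotMHC: false, expectRegularMHC: false,
	},
}

// followsNamingPolicy is an illustrative check: the name must start with
// "When " and contain " it should ".
func followsNamingPolicy(name string) bool {
	return strings.HasPrefix(name, "When ") && strings.Contains(name, " it should ")
}

func main() {
	for _, c := range cases {
		fmt.Printf("%t\t%s\n", followsNamingPolicy(c.name), c.name)
	}
}
```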
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: f9c051ee-dacb-4f42-a86c-f7ad5906603b

📥 Commits

Reviewing files that changed from the base of the PR and between ab1e63b and be91e33.

📒 Files selected for processing (2)
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go

Comment thread hypershift-operator/controllers/nodepool/capi_test.go Outdated
@openshift-ci-robot


@dpateriya: This pull request references Jira Issue OCPBUGS-86798, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (5.0.0) matches configured target version for branch (5.0.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)
@hypershift-jira-solve-ci

Test Failure Analysis Complete

Job Information

  • Prow Jobs: pull-ci-openshift-hypershift-main-okd-scos-images, pull-ci-openshift-hypershift-main-images, plus 3 GitHub Actions (Lint, Gitlint, Verify)
  • Build IDs: 2061180913437380608 (okd-scos-images), 2061180913391243264 (images)
  • PR: #8645, OCPBUGS-86798: spot MHC not created when autoRepair=true and ignition endpoint not reached
  • Files Changed: hypershift-operator/controllers/nodepool/capi.go, hypershift-operator/controllers/nodepool/capi_test.go

Test Failure Analysis

Error

# Lint (GitHub Actions)
hypershift-operator/controllers/nodepool/capi_test.go:3467:1: File is not properly formatted (gci)
hypershift-operator/controllers/nodepool/capi_test.go:3486:14: SA1019: spotMHC.Spec.MaxUnhealthy is deprecated

# Gitlint (GitHub Actions)
1: CT1 Title does not start with one of fix, feat, chore, docs, style, refactor, perf, test, revert, ci, build:
"OCPBUGS-86798: spot MHC not created when autoRepair=true and ignition endpoint not reached"

# Verify (GitHub Actions)
hypershift-operator/controllers/nodepool/capi_test.go: needs update

# okd-scos-images (Prow)
could not resolve source imagestream origin/scos-4.21 for release latest: imagestreams.image.openshift.io "scos-4.21" not found

# images (Prow)
could not resolve source imagestream ocp/5.0 for release latest: imagestreams.image.openshift.io "5.0" not found

Summary

Five CI jobs failed on PR #8645. Three are caused by the PR itself (Lint, Gitlint, Verify) and two are pre-existing CI infrastructure issues unrelated to the PR (okd-scos-images, images). The PR-caused failures stem from: (1) the new test code in capi_test.go not being properly gofmt/gci-formatted, (2) use of the deprecated spotMHC.Spec.MaxUnhealthy field, flagged by staticcheck, and (3) the commit title not following the project's Conventional Commits format. The Prow image-build jobs fail because the CI cluster is missing the origin/scos-4.21 and ocp/5.0 ImageStreams; these jobs fail identically on other PRs and are not caused by this change.

Root Cause

PR-caused failures (3 jobs — Lint, Gitlint, Verify):

  1. Go import formatting (gci) violation — The new test function TestSpotMHCCreatedIndependentlyOfAutoRepairIgnitionGate in capi_test.go has improperly ordered/grouped imports. The gci linter enforces a specific import grouping (stdlib → external → internal) and the file violates this at line 3467. This same formatting issue also causes the Verify job to fail, since make fmt reformats the file and git diff --exit-code detects uncommitted changes.

  2. Deprecated API usage (staticcheck SA1019) — At line 3486, the test asserts on spotMHC.Spec.MaxUnhealthy which is deprecated per cluster-api#10722. The linter flags this as SA1019.

  3. Commit title format violation (gitlint CT1) — The commit title "OCPBUGS-86798: spot MHC not created when autoRepair=true and ignition endpoint not reached" does not begin with a Conventional Commits prefix (fix, feat, chore, etc.). The project requires titles like fix: OCPBUGS-86798 spot MHC not created when autoRepair=true....

Infrastructure failures (2 jobs — okd-scos-images, images):

  1. Missing ImageStreams in CI cluster: The okd-scos-images job fails resolving origin/scos-4.21 and the images job fails resolving ocp/5.0. Neither ImageStream exists in the CI cluster. All image builds succeeded (hypershift, hypershift-operator, hypershift-cli, hypershift-tests); the failure occurs only at the [release-inputs:latest] step when assembling the release payload. These are CI configuration issues unrelated to this PR.

Recommendations

To fix the PR-caused failures:

  1. Run make fmt to fix the import ordering and Go formatting in capi_test.go. This will resolve both the Lint (gci) and Verify failures.

  2. Fix the deprecated field usage — Replace spotMHC.Spec.MaxUnhealthy with the non-deprecated alternative. If the production code in capi.go also sets MaxUnhealthy, consider using UnhealthyRange or the replacement field per the CAPI deprecation notice. Alternatively, if the project accepts it, add a //nolint:staticcheck directive on the test assertion line.

  3. Update the commit title to follow Conventional Commits format:

    fix: OCPBUGS-86798 spot MHC not created when autoRepair=true and ignition endpoint not reached
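
For recommendation 2, the suppression route could look like the sketch below. `healthCheckSpec` and `maxUnhealthyOf` are hypothetical stand-ins, not the Cluster API types; only the placement and form of the `//nolint:staticcheck` directive is the point. In a single-file sketch like this the linter would likely not fire at all, since SA1019 targets deprecated identifiers used from another package.

```go
package main

import "fmt"

// healthCheckSpec is a hypothetical stand-in for a MachineHealthCheck spec.
type healthCheckSpec struct {
	// Deprecated: illustrative marker only, mimicking how a deprecated
	// field is documented so that staticcheck can report SA1019.
	MaxUnhealthy string
}

// maxUnhealthyOf reads the deprecated field. The trailing directive asks
// golangci-lint to skip staticcheck on this line, and the second comment
// records why the suppression is acceptable.
func maxUnhealthyOf(spec healthCheckSpec) string {
	return spec.MaxUnhealthy //nolint:staticcheck // SA1019: production code still sets this field
}

func main() {
	spec := healthCheckSpec{MaxUnhealthy: "100%"}
	fmt.Println(maxUnhealthyOf(spec))
}
```

Keeping the justification on the same line as the directive makes the suppression easy to find and remove once the production code moves off the deprecated field.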
    

No action needed for the Prow image jobs — the okd-scos-images and images failures are infrastructure issues (missing ImageStreams scos-4.21 and ocp/5.0) and will resolve when the CI team provisions them. These are not blocking for the PR's correctness.

Evidence
Evidence | Detail
Lint failure (gci) | capi_test.go:3467:1: File is not properly formatted (gci); import grouping violation in new test code
Lint failure (staticcheck) | capi_test.go:3486:14: SA1019: spotMHC.Spec.MaxUnhealthy is deprecated; uses deprecated CAPI field
Gitlint failure | CT1 Title does not start with one of fix, feat, chore, docs, style, refactor, perf, test, revert, ci, build
Verify failure | capi_test.go: needs update; make fmt produces a diff, so the file was not properly formatted before commit
okd-scos-images failure | imagestreams.image.openshift.io "scos-4.21" not found; CI infrastructure, not PR-related
images failure | imagestreams.image.openshift.io "5.0" not found; CI infrastructure, not PR-related
Changed files | capi.go (moved spot MHC reconciliation before autoRepair gate), capi_test.go (added 233-line test)
All image builds | Succeeded in both Prow jobs; failure is only at release-input resolution step

@jparrill jparrill left a comment

Contributor

Dropped some comments. Thanks!


Note (not in the diff): While reviewing the spot MHC lifecycle I noticed that delete() in nodepool_controller.go:527 only cleans up the regular MHC (capi.machineHealthCheck()). The spot MHC (capi.spotMachineHealthCheck()) is not deleted — and there are no owner references on it either, so it gets orphaned in the control plane namespace when a spot-enabled NodePool is deleted.

It's harmless in practice (the namespace is GC'd with the HostedCluster), but could you file a follow-up to add the cleanup there?

Comment thread hypershift-operator/controllers/nodepool/capi.go
Comment thread hypershift-operator/controllers/nodepool/capi_test.go
Comment thread hypershift-operator/controllers/nodepool/capi_test.go Outdated
Comment thread hypershift-operator/controllers/nodepool/capi_test.go
… endpoint not reached

OCPBUGS-86798

The spot-specific MachineHealthCheck was blocked by the autoRepair/ignition
gate because it was reconciled after the gate's early return. Move spot MHC
reconciliation before the gate so it always runs when spot is enabled,
regardless of autoRepair status or ignition endpoint reachability.

Co-authored-by: Cursor <[email protected]>
@dpateriya dpateriya force-pushed the fix/spot-mhc-autorepair-deadlock branch from 1a9eec3 to 3b7b333 Compare June 15, 2026 11:24
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 15, 2026
@dpateriya

Contributor Author

Addressed all feedback:

  • Rebased on latest main and resolved conflicts
  • Fixed gitlint CT1: commit title now uses conventional format (fix(nodepool): ...) with OCPBUGS-86798 in the body
  • Fixed gci import ordering: ran make fmt
  • Fixed SA1019 deprecated MaxUnhealthy: added //nolint:staticcheck since production code also uses the deprecated field
  • Added 5th test case: spot enabled via Placement.MarketType: Spot (API spec path) instead of annotation
  • Changed expectAutoRepairStatus from string to *corev1.ConditionStatus (nil = don't check)
  • Added comment on autoRepair=false test noting that ignitionReached is irrelevant
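
The "nil = don't check" convention for expectAutoRepairStatus can be sketched as follows. To keep the example dependency-free it uses a local `conditionStatus` type in place of corev1.ConditionStatus; `statusMatches` and `ptr` are hypothetical helpers and are not taken from capi_test.go.

```go
package main

import "fmt"

// conditionStatus is a local stand-in for corev1.ConditionStatus so the
// sketch compiles without external modules.
type conditionStatus string

const (
	conditionTrue  conditionStatus = "True"
	conditionFalse conditionStatus = "False"
)

// statusMatches treats a nil expectation as "don't check" and otherwise
// compares the expected status against the actual one.
func statusMatches(expected *conditionStatus, actual conditionStatus) bool {
	if expected == nil {
		return true
	}
	return *expected == actual
}

// ptr returns a pointer to v, which keeps table entries short.
func ptr[T any](v T) *T { return &v }

func main() {
	fmt.Println(statusMatches(nil, conditionFalse))                 // no expectation set
	fmt.Println(statusMatches(ptr(conditionTrue), conditionFalse))  // mismatch
	fmt.Println(statusMatches(ptr(conditionFalse), conditionFalse)) // match
}
```

A pointer lets a case distinguish "no expectation" from an explicit expectation, which an empty string sentinel only does by convention.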

Follow-up issues filed:

  1. fix(nodepool): scope autoRepair/ignition gate to regular MHC only instead of returning from entire CAPI Reconcile #8735 — Refactor return nil gate to only affect regular MHC creation
  2. fix(nodepool): add spot MHC cleanup on NodePool deletion #8736 — Add spot MHC cleanup on NodePool deletion

@codecov

codecov Bot commented Jun 15, 2026


Codecov Report

❌ Patch coverage is 42.10526% with 11 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.66%. Comparing base (712ba58) to head (3b7b333).

Files with missing lines | Patch % | Lines
hypershift-operator/controllers/nodepool/capi.go | 42.10% | 7 Missing and 4 partials ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main    #8645   +/-   ##
=======================================
  Coverage   41.66%   41.66%           
=======================================
  Files         758      758           
  Lines       93929    93928    -1     
=======================================
+ Hits        39135    39138    +3     
+ Misses      52046    52043    -3     
+ Partials     2748     2747    -1     
Files with missing lines | Coverage Δ
hypershift-operator/controllers/nodepool/capi.go | 72.12% <42.10%> (+0.34%) ⬆️
Flag | Coverage Δ
cmd-support | 34.96% <ø> (ø)
cpo-hostedcontrolplane | 44.00% <ø> (ø)
cpo-other | 43.45% <ø> (ø)
hypershift-operator | 51.67% <42.10%> (+0.01%) ⬆️
other | 31.56% <ø> (ø)

@openshift-ci

openshift-ci Bot commented Jun 15, 2026

Contributor

@dpateriya: all tests passed!

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release jira/severity-moderate Referenced Jira bug's severity is moderate for the branch this PR is targeting. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.
