DNM: CNTRLPLANE-3612: Add dual-stream RHEL support in HCP by hypershift-jira-solve-ci[bot] · Pull Request #8734 · openshift/hypershift

hypershift-jira-solve-ci · 2026-06-15T09:37:42Z

What this PR does / why we need it:

Adds dual-stream RHEL (9/10) support to the NodePool and ignition server pipelines so that HCP clusters running OCP 5.0+ can resolve, validate, and propagate the correct OS stream for worker nodes.

Key changes:

OS stream resolution (getRHELStream): resolves the RHEL stream from the user's explicit spec.osImageStream.name, the release version, and whether the NodePool uses runc. Defaults to rhel-10 on OCP ≥5.0; falls back to rhel-9 when runc is configured (RHEL 10 drops runc support). Pre-5.0 releases retain legacy behavior (empty stream).
Runc detection: ConfigGenerator inspects ContainerRuntimeConfig manifests during config parsing to detect runc usage, informing stream resolution.
Token secret plumbing: the resolved osStream is stored in the token secret under a new os-stream data key. TokenSecretReconciler reads it and passes it to the ignition provider via an updated GetPayload() interface.
Ignition payload generation: writes a 99_osimagestream.yaml manifest (an OSImageStream CR) into the MCC input directory when osStream is non-empty, enabling MCC bootstrap to discover available OS streams via OCI label inspection.
Validation and status: validMachineConfigCondition validates the stream selection early (e.g., rhel-10 on pre-5.0, runc + rhel-10) and surfaces errors as NodePool conditions. OS stream inference from CAPI Machine NodeInfo writes to status.osImageStream for observability.
Config hash integration: rhelStream is included in HashWithoutVersion() so stream-only changes trigger config update detection and CAPI reconciliation.

Which issue(s) this PR fixes:

Fixes https://redhat.atlassian.net/browse/CNTRLPLANE-3612

Special notes for your reviewer:

The rhelStream field in rolloutConfig uses the spec value (not the resolved value) to prevent fleet-wide rollouts when the default stream changes across releases.
The inferStreamFromOSImage() pattern is anchored to require dot-separated version format to reduce false-positive risk.
When no explicit stream is set and the release is ≥5.0 with runc, a status condition message surfaces the implicit rhel-9 fallback.

Checklist:

Subject and description added to both, commit and PR.
Relevant issues have been referenced.
This change includes docs.
This change includes unit tests.

Always review AI generated responses prior to use.
Generated with Claude Code via /jira:solve [CNTRLPLANE-3612](https://redhat.atlassian.net/browse/CNTRLPLANE-3612)

Summary by CodeRabbit

Release Notes

New Features
- Added dual-stream RHEL OS stream resolution for NodePools (RHEL 9 / RHEL 10), including container-runtime-aware behavior.
- Token ignition payload generation now carries the resolved OS stream, enabling dual-stream rendering.
- Enhanced NodePool status by inferring the OS image stream from existing machine nodes.
Bug Fixes
- Improved validation and rollout behavior so OS stream changes correctly trigger rollouts and misconfigurations are rejected with clear conditions.
Documentation
- Added a guide covering dual-stream RHEL OS provisioning, rules, and troubleshooting.

…stream support Implements the core getRHELStream() function that resolves the RHEL OS stream based on the user's explicit spec.osImageStream.name selection, the release version, and whether the NodePool configuration uses runc. Resolution rules: - Explicit rhel-10 on pre-5.0 releases returns an error - Explicit rhel-10 with runc config returns an error (RHEL 10 drops runc) - No explicit stream + pre-5.0 returns empty (legacy behavior) - No explicit stream + >=5.0 defaults to rhel-10 - No explicit stream + >=5.0 + runc falls back to rhel-9 Also adds usesRunc detection to ConfigGenerator by inspecting ContainerRuntimeConfig manifests during config parsing, and adds rhelStream to rolloutConfig for config hash computation (populated from spec only, not resolved value, to prevent fleet-wide rollouts when the default stream changes across releases). CNTRLPLANE-3612 Co-Authored-By: Claude Opus 4.6 <[email protected]>

…vider Adds osStream field to Token struct, resolved via getRHELStream() during NewToken(). The resolved stream is stored in the token secret under a new "os-stream" data key so the ignition server can use it during payload generation. Updates the IgnitionProvider.GetPayload() interface to accept an osStream parameter. The TokenSecretReconciler reads the os-stream key from the token secret and passes it through to the ignition provider. This is the plumbing layer that connects the NodePool controller's stream resolution to the ignition server's payload generation pipeline. CNTRLPLANE-3612 Co-Authored-By: Claude Opus 4.6 <[email protected]>

Writes a 99_osimagestream.yaml manifest into the MCC input directory when the osStream parameter is non-empty. This manifest contains an OSImageStream CR with the resolved defaultStream, which MCC bootstrap uses to discover available OS streams via OCI label inspection and render MachineConfigs with the correct osImageURL. The CR is written at the end of runMCO (after copyMCOOutputToMCC) so it is available when runMCC processes the MCC input directory. When osStream is empty (pre-5.0 legacy behavior), no CR is written. CNTRLPLANE-3612 Co-Authored-By: Claude Opus 4.6 <[email protected]>

Extends validMachineConfigCondition to validate the OS stream selection before proceeding with token creation. This catches invalid configurations early (e.g., rhel-10 on pre-5.0 releases, runc + rhel-10) and surfaces them as NodePool conditions. Adds OS stream inference from CAPI Machine NodeInfo: RHCOS version strings starting with "4" indicate RHEL 9, while versions starting with "5" indicate RHEL 10. The inferred stream is written to status.osImageStream to give users visibility into which RHEL version their nodes are actually running. CNTRLPLANE-3612 Co-Authored-By: Claude Opus 4.6 <[email protected]>

Covers: - getRHELStream(): all resolution paths including defaults, explicit selections, runc fallback, and error cases - inferStreamFromOSImage(): RHCOS version string parsing for RHEL 9/10 - inferOSStreamFromMachines(): aggregation across multiple CAPI Machines - writeOSImageStreamCR(): OSImageStream manifest generation - fakeIgnitionProvider: updated to match new GetPayload signature CNTRLPLANE-3612 Co-Authored-By: Claude Opus 4.6 <[email protected]>

- Add rhelStream to HashWithoutVersion() so pure OS stream changes trigger config update detection and CAPI reconciliation - Fix %q to %s in writeOSImageStreamCR() to produce idiomatic YAML - Deduplicate TokenSecretOSStreamKey constant (define in ignserver only) - Make error handling consistent in validMachineConfigCondition (both version parse and stream validation now return Result{}, nil) - Surface implicit rhel-9 fallback as a status condition message - Anchor inferStreamFromOSImage() pattern to require dot-separated version format, reducing false-positive risk - Fix tie-break doc comment to say "plurality" instead of "majority" - Fix inconsistent NewGomegaWithT usage in ignition-server tests - Add tests: rhelStream in Hash/HashWithoutVersion, usesRunc detection, TokenSecretOSStreamKey in token secret, empty-string no-op for writeOSImageStreamCR, mixed-stream tie-break for inferOSStreamFromMachines Co-Authored-By: Claude Opus 4.6 <[email protected]>

openshift-merge-bot · 2026-06-15T09:37:45Z

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

openshift-ci-robot · 2026-06-15T09:37:46Z

@hypershift-jira-solve-ci[bot]: This pull request references CNTRLPLANE-3612 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

What this PR does / why we need it:

Adds dual-stream RHEL (9/10) support to the NodePool and ignition server pipelines so that HCP clusters running OCP 5.0+ can resolve, validate, and propagate the correct OS stream for worker nodes.

Key changes:

OS stream resolution (getRHELStream): resolves the RHEL stream from the user's explicit spec.osImageStream.name, the release version, and whether the NodePool uses runc. Defaults to rhel-10 on OCP ≥5.0; falls back to rhel-9 when runc is configured (RHEL 10 drops runc support). Pre-5.0 releases retain legacy behavior (empty stream).

Runc detection: ConfigGenerator inspects ContainerRuntimeConfig manifests during config parsing to detect runc usage, informing stream resolution.

Token secret plumbing: the resolved osStream is stored in the token secret under a new os-stream data key. TokenSecretReconciler reads it and passes it to the ignition provider via an updated GetPayload() interface.

Ignition payload generation: writes a 99_osimagestream.yaml manifest (an OSImageStream CR) into the MCC input directory when osStream is non-empty, enabling MCC bootstrap to discover available OS streams via OCI label inspection.

Validation and status: validMachineConfigCondition validates the stream selection early (e.g., rhel-10 on pre-5.0, runc + rhel-10) and surfaces errors as NodePool conditions. OS stream inference from CAPI Machine NodeInfo writes to status.osImageStream for observability.

Config hash integration: rhelStream is included in HashWithoutVersion() so stream-only changes trigger config update detection and CAPI reconciliation.

Which issue(s) this PR fixes:

Fixes https://redhat.atlassian.net/browse/CNTRLPLANE-3612

Special notes for your reviewer:

The rhelStream field in rolloutConfig uses the spec value (not the resolved value) to prevent fleet-wide rollouts when the default stream changes across releases.

The inferStreamFromOSImage() pattern is anchored to require dot-separated version format to reduce false-positive risk.

When no explicit stream is set and the release is ≥5.0 with runc, a status condition message surfaces the implicit rhel-9 fallback.

Checklist:

Subject and description added to both, commit and PR.

Relevant issues have been referenced.

This change includes docs.

This change includes unit tests.

Always review AI generated responses prior to use.
Generated with Claude Code via /jira:solve [CNTRLPLANE-3612](https://redhat.atlassian.net/browse/CNTRLPLANE-3612)

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-06-15T09:37:57Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 5a924ab8-54a4-4ecc-bde4-2eeb891b4d71

📥 Commits

Reviewing files that changed from the base of the PR and between 93f8b05 and 677760c.

⛔ Files ignored due to path filters (1)

docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md

📒 Files selected for processing (3)

docs/content/how-to/automated-machine-management/dual-stream-rhel.md
docs/mkdocs.yml
hypershift-operator/controllers/nodepool/token_test.go

✅ Files skipped from review due to trivial changes (1)

docs/mkdocs.yml

📝 Walkthrough

Walkthrough

This PR implements dual-stream RHEL OS support (RHEL 9/10) for HyperShift NodePools. A new getRHELStream function resolves the appropriate RHEL stream from the NodePool spec, OCP release semver version, and whether the node pool's container runtime is runc (which is incompatible with RHEL 10). The resolved stream is incorporated into rollout hashing, validated during NodePool condition checks, stored in the token Secret under a new TokenSecretOSStreamKey, and passed through to the ignition server where it triggers generation of an OSImageStream CR (99_osimagestream.yaml) for MCC to consume. Additionally, setNodesInfoStatus now infers and sets nodePool.Status.OSImageStream by parsing running machines' OSImage strings. Supporting documentation describes the feature, resolution matrix, transition rules, and common upgrade scenarios.

Suggested reviewers

muraee
Nirshal

🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Test Structure And Quality	⚠️ Warning	Assertion messages are missing from most Ginkgo Expect statements (e.g., g.Expect(err).ToNot(HaveOccurred()) lacks diagnostic context), making failures harder to diagnose.	Add meaningful failure messages to Expect assertions: use Expect(...).To(..., "message") or Expect(...).ToNot(..., "message") to clarify what went wrong, especially for non-error assertions.

✅ Passed checks (10 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The PR title 'DNM: CNTRLPLANE-3612: Add dual-stream RHEL support in HCP' is directly related to the core changeset, which implements dual-stream RHEL (9/10) support. However, the 'DNM' prefix indicates Do-Not-Merge status, which is a meta-concern rather than a clarity issue about the actual changes.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	The PR contains only standard Go table-driven unit tests (using t.Run() and test case structs), not Ginkgo tests. No Ginkgo imports or It()/Describe()/Context()/When() declarations found. All test...
Topology-Aware Scheduling Compatibility	✅ Passed	PR modifies only controller logic and configuration generation, introducing no pod scheduling constraints, topology assumptions, or operator deployments affecting SNO/TNF/TNA/HyperShift topologies.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	No Ginkgo e2e tests are added in this PR. All new tests are standard Go unit tests (TestGetRHELStream, TestUsesRuncDetection, TestInferStreamFromOSImage, TestInferOSStreamFromMachines, TestWriteOSI...
No-Weak-Crypto	✅ Passed	No weak crypto detected: no MD5/SHA1/DES/RC4/3DES/Blowfish/ECB usage, no custom crypto implementations, no non-constant-time secret comparisons. Hashing uses FNV-1a algorithm; tokens use safe map l...
Container-Privileges	✅ Passed	PR introduces no Kubernetes manifests with privileged settings (privileged: true, hostPID/Network/IPC, SYS_ADMIN, allowPrivilegeEscalation). Changes are OS stream resolution logic and non-privilege...
No-Sensitive-Data-In-Logs	✅ Passed	No sensitive data (passwords, tokens, API keys, PII, session IDs, hostnames, or customer data) is exposed in logs. The osStream parameter (e.g., "rhel-9", "rhel-10") is not sensitive and is never l...

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2026-06-15T09:39:22Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: hypershift-jira-solve-ci[bot]
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

codecov · 2026-06-15T09:45:57Z

Codecov Report

❌ Patch coverage is 58.82353% with 70 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.69%. Comparing base (712ba58) to head (677760c).

Files with missing lines	Patch %	Lines
...rshift-operator/controllers/nodepool/conditions.go	0.00%	43 Missing ⚠️
...ypershift-operator/controllers/nodepool/version.go	80.39%	7 Missing and 3 partials ⚠️
...ition-server/controllers/local_ignitionprovider.go	66.66%	6 Missing and 1 partial ⚠️
hypershift-operator/controllers/nodepool/token.go	60.00%	4 Missing and 2 partials ⚠️
hypershift-operator/controllers/nodepool/config.go	72.72%	2 Missing and 1 partial ⚠️
ignition-server/cmd/run_local_ignitionprovider.go	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #8734      +/-   ##
==========================================
+ Coverage   41.66%   41.69%   +0.03%     
==========================================
  Files         758      759       +1     
  Lines       93929    94083     +154     
==========================================
+ Hits        39135    39232      +97     
- Misses      52046    52096      +50     
- Partials     2748     2755       +7

Files with missing lines	Coverage Δ
...shift-operator/controllers/nodepool/rhel_stream.go	`100.00% <100.00%> (ø)`
...ition-server/controllers/tokensecret_controller.go	`59.18% <100.00%> (+0.16%)`	⬆️
ignition-server/cmd/run_local_ignitionprovider.go	`0.00% <0.00%> (ø)`
hypershift-operator/controllers/nodepool/config.go	`84.78% <72.72%> (-0.74%)`	⬇️
hypershift-operator/controllers/nodepool/token.go	`81.51% <60.00%> (-1.03%)`	⬇️
...ition-server/controllers/local_ignitionprovider.go	`39.17% <66.66%> (+1.02%)`	⬆️
...ypershift-operator/controllers/nodepool/version.go	`88.39% <80.39%> (-6.69%)`	⬇️
...rshift-operator/controllers/nodepool/conditions.go	`51.70% <0.00%> (-2.23%)`	⬇️

Flag	Coverage Δ
cmd-support	`34.96% <ø> (ø)`
cpo-hostedcontrolplane	`44.00% <ø> (ø)`
cpo-other	`43.45% <ø> (ø)`
hypershift-operator	`51.70% <57.53%> (+0.04%)`	⬆️
other	`31.65% <66.66%> (+0.09%)`	⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

coderabbitai

Actionable comments posted: 2

🧹 Nitpick comments (3)

hypershift-operator/controllers/nodepool/version_test.go (2)

329-365: 💤 Low value

Use Gomega for test assertions to maintain consistency.

This test uses t.Errorf for assertions, while the rest of the file uses Gomega (g.Expect). For consistency and better error messages, consider switching to Gomega:

♻️ Suggested refactor

 func TestInferStreamFromOSImage(t *testing.T) {
+	g := NewWithT(t)
+
 	tests := []struct {
 		name    string
 		osImage string
 		want    string
 	}{
 		// ... test cases ...
 	}
 
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			got := inferStreamFromOSImage(tt.osImage)
-			if got != tt.want {
-				t.Errorf("inferStreamFromOSImage(%q) = %v, want %v", tt.osImage, got, tt.want)
-			}
+			g.Expect(got).To(Equal(tt.want))
 		})
 	}
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/version_test.go` around lines 329 -
365, The TestInferStreamFromOSImage function uses manual t.Errorf assertions
instead of Gomega for consistency with the rest of the test file. Replace the if
statement that checks if got != tt.want with a Gomega g.Expect assertion to
match the testing style used elsewhere in the file. The assertion should compare
the result of inferStreamFromOSImage(tt.osImage) against tt.want using Gomega's
assertion methods.

367-427: 💤 Low value

Use Gomega for test assertions to maintain consistency.

This test uses t.Errorf for assertions, while the rest of the file uses Gomega (g.Expect). For consistency and better error messages, consider switching to Gomega:

♻️ Suggested refactor

 func TestInferOSStreamFromMachines(t *testing.T) {
+	g := NewWithT(t)
+
 	tests := []struct {
 		name     string
 		machines []*v1beta1.Machine
 		want     string
 	}{
 		// ... test cases ...
 	}
 
 	for _, tt := range tests {
 		t.Run(tt.name, func(t *testing.T) {
 			got := inferOSStreamFromMachines(tt.machines)
-			if got != tt.want {
-				t.Errorf("inferOSStreamFromMachines() = %v, want %v", got, tt.want)
-			}
+			g.Expect(got).To(Equal(tt.want))
 		})
 	}
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/version_test.go` around lines 367 -
427, The TestInferOSStreamFromMachines function currently uses t.Errorf for
assertions instead of Gomeka assertions like the rest of the file. Replace the
t.Errorf assertion inside the test's inner function with a Gomega assertion
using g.Expect and To(Equal()) pattern to maintain consistency with the rest of
the test file and provide better error messages when assertions fail.

ignition-server/controllers/local_ignitionprovider.go (1)

797-819: ⚡ Quick win

Consider validating osStream before YAML interpolation.

The osStream value is directly interpolated into the YAML string without validation. While in practice it's constrained to constants like "rhel-9" or "rhel-10" from upstream getRHELStream(), adding defensive validation would make the code more robust against future changes:

🛡️ Suggested validation

 func writeOSImageStreamCR(mccDir, osStream string) error {
 	if osStream == "" {
 		return nil
 	}
+	// Validate osStream to prevent YAML injection
+	if osStream != "rhel-9" && osStream != "rhel-10" {
+		return fmt.Errorf("invalid osStream value: %q (expected rhel-9 or rhel-10)", osStream)
+	}
+
 	osImageStreamCR := fmt.Sprintf(`apiVersion: machineconfiguration.openshift.io/v1alpha1
 kind: OSImageStream
 metadata:
   name: cluster
 spec:
   defaultStream: %s
 `, osStream)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@ignition-server/controllers/local_ignitionprovider.go` around lines 797 -
819, The writeOSImageStreamCR function directly interpolates the osStream
parameter into YAML without validating its contents, which could be vulnerable
if the function is called with unexpected values in the future. Add defensive
validation before the fmt.Sprintf call to ensure osStream contains only expected
characters (such as alphanumeric characters, hyphens, and underscores). If
validation fails, return an error with a descriptive message indicating that the
osStream value is invalid. This makes the code more robust against future
changes while maintaining the current behavior for valid inputs like "rhel-9"
and "rhel-10".

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/nodepool/token.go`:
- Around line 120-129: Add a nil check for configGenerator.releaseImage before
calling the Version() method on it in the NewToken function. If
configGenerator.releaseImage is nil, return an error with a descriptive message
(similar to the existing error handling pattern used for parseErr) rather than
allowing a panic to occur when dereferencing the nil pointer.

In `@hypershift-operator/controllers/nodepool/version.go`:
- Around line 102-107: The OSImageStream status field is only updated when
inferOSStreamFromMachines returns a non-empty stream, causing stale data to
persist when the function returns an empty string. Remove the conditional check
and always set nodePool.Status.OSImageStream to ensure the status accurately
reflects the current inferred stream state, whether it contains a valid stream
name or is empty.

---

Nitpick comments:
In `@hypershift-operator/controllers/nodepool/version_test.go`:
- Around line 329-365: The TestInferStreamFromOSImage function uses manual
t.Errorf assertions instead of Gomega for consistency with the rest of the test
file. Replace the if statement that checks if got != tt.want with a Gomega
g.Expect assertion to match the testing style used elsewhere in the file. The
assertion should compare the result of inferStreamFromOSImage(tt.osImage)
against tt.want using Gomega's assertion methods.
- Around line 367-427: The TestInferOSStreamFromMachines function currently uses
t.Errorf for assertions instead of Gomeka assertions like the rest of the file.
Replace the t.Errorf assertion inside the test's inner function with a Gomega
assertion using g.Expect and To(Equal()) pattern to maintain consistency with
the rest of the test file and provide better error messages when assertions
fail.

In `@ignition-server/controllers/local_ignitionprovider.go`:
- Around line 797-819: The writeOSImageStreamCR function directly interpolates
the osStream parameter into YAML without validating its contents, which could be
vulnerable if the function is called with unexpected values in the future. Add
defensive validation before the fmt.Sprintf call to ensure osStream contains
only expected characters (such as alphanumeric characters, hyphens, and
underscores). If validation fails, return an error with a descriptive message
indicating that the osStream value is invalid. This makes the code more robust
against future changes while maintaining the current behavior for valid inputs
like "rhel-9" and "rhel-10".

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 37db3fde-e3f9-426c-94f0-818e787d8440

📥 Commits

Reviewing files that changed from the base of the PR and between 712ba58 and 93f8b05.

📒 Files selected for processing (14)

hypershift-operator/controllers/nodepool/conditions.go
hypershift-operator/controllers/nodepool/config.go
hypershift-operator/controllers/nodepool/config_test.go
hypershift-operator/controllers/nodepool/rhel_stream.go
hypershift-operator/controllers/nodepool/rhel_stream_test.go
hypershift-operator/controllers/nodepool/token.go
hypershift-operator/controllers/nodepool/token_test.go
hypershift-operator/controllers/nodepool/version.go
hypershift-operator/controllers/nodepool/version_test.go
ignition-server/cmd/run_local_ignitionprovider.go
ignition-server/controllers/local_ignitionprovider.go
ignition-server/controllers/local_ignitionprovider_test.go
ignition-server/controllers/tokensecret_controller.go
ignition-server/controllers/tokensecret_controller_test.go

coderabbitai · 2026-06-15T09:49:12Z

+	// Resolve OS stream based on spec, release version, and runc usage.
+	var osStream string
+	releaseVersion, parseErr := semver.Parse(configGenerator.releaseImage.Version())
+	if parseErr != nil {
+		return nil, fmt.Errorf("failed to parse release version for OS stream resolution: %w", parseErr)
+	}
+	osStream, err = getRHELStream(configGenerator.nodePool.Spec.OSImageStream.Name, releaseVersion, configGenerator.usesRunc)
+	if err != nil {
+		return nil, fmt.Errorf("failed to resolve OS stream: %w", err)
+	}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard NewToken against nil embedded rollout data before resolving OS stream.

NewToken dereferences configGenerator.releaseImage.Version() unconditionally. Because ConfigGenerator embeds *rolloutConfig, a nil rolloutConfig (or nil releaseImage) will panic instead of returning a validation error.

🐛 Proposed fix

func NewToken(ctx context.Context, configGenerator *ConfigGenerator, cpoCapabilities *CPOCapabilities) (*Token, error) { if configGenerator == nil { return nil, fmt.Errorf("configGenerator can't be nil") } + if configGenerator.rolloutConfig == nil || configGenerator.releaseImage == nil { + return nil, fmt.Errorf("configGenerator release image can't be nil") + } + if configGenerator.nodePool == nil { + return nil, fmt.Errorf("configGenerator nodePool can't be nil") + } if cpoCapabilities == nil { return nil, fmt.Errorf("cpoCapabilities can't be nil") }

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

// Resolve OS stream based on spec, release version, and runc usage.

var osStream string

releaseVersion, parseErr := semver.Parse(configGenerator.releaseImage.Version())

if parseErr != nil {

return nil, fmt.Errorf("failed to parse release version for OS stream resolution: %w", parseErr)

}

osStream, err = getRHELStream(configGenerator.nodePool.Spec.OSImageStream.Name, releaseVersion, configGenerator.usesRunc)

if err != nil {

return nil, fmt.Errorf("failed to resolve OS stream: %w", err)

}

func NewToken(ctx context.Context, configGenerator *ConfigGenerator, cpoCapabilities *CPOCapabilities) (*Token, error) {

if configGenerator == nil {

return nil, fmt.Errorf("configGenerator can't be nil")

}

if configGenerator.rolloutConfig == nil || configGenerator.releaseImage == nil {

return nil, fmt.Errorf("configGenerator release image can't be nil")

}

if configGenerator.nodePool == nil {

return nil, fmt.Errorf("configGenerator nodePool can't be nil")

}

if cpoCapabilities == nil {

return nil, fmt.Errorf("cpoCapabilities can't be nil")

}

// Resolve OS stream based on spec, release version, and runc usage.

var osStream string

releaseVersion, parseErr := semver.Parse(configGenerator.releaseImage.Version())

if parseErr != nil {

return nil, fmt.Errorf("failed to parse release version for OS stream resolution: %w", parseErr)

}

osStream, err = getRHELStream(configGenerator.nodePool.Spec.OSImageStream.Name, releaseVersion, configGenerator.usesRunc)

if err != nil {

return nil, fmt.Errorf("failed to resolve OS stream: %w", err)

}

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/nodepool/token.go` around lines 120 - 129, Add a nil check for configGenerator.releaseImage before calling the Version() method on it in the NewToken function. If configGenerator.releaseImage is nil, return an error with a descriptive message (similar to the existing error handling pattern used for parseErr) rather than allowing a panic to occur when dereferencing the nil pointer.

The concern is valid in principle, but these nil cases are already guarded by NewConfigGenerator — the only constructor for ConfigGenerator:

releaseImage != nil is validated at config.go:80-82

rolloutConfig is always initialized as part of the struct (it embeds releaseImage), so it cannot be nil when NewConfigGenerator succeeds

nodePool is a required constructor parameter and all three call sites (nodepool_controller.go:371, nodepool_controller.go:465, secret_janitor.go:122) provide a non-nil value obtained from the reconcile loop

Adding redundant nil checks in NewToken would duplicate existing validation without increasing safety, since NewToken is not called with a manually-constructed ConfigGenerator outside of tests (and tests already set these fields). Skipping this change to keep the validation in one place.

AI-assisted response via Claude Code

Skipped: comment is from another GitHub bot.

coderabbitai · 2026-06-15T09:49:12Z

+	// Infer OS stream from node OSImage and set status.osImageStream.
+	if stream := inferOSStreamFromMachines(machines); stream != "" {
+		nodePool.Status.OSImageStream = hyperv1.OSImageStreamReference{
+			Name: stream,
+		}
+	}


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Status field is not cleared when OS stream inference returns empty, potentially leaving stale data.

When inferOSStreamFromMachines returns an empty string (e.g., no machines with NodeInfo, or all machines scaled to zero), the code skips updating nodePool.Status.OSImageStream. This leaves the previous value in place even though it no longer reflects the current machine state.

Consider always setting the field to ensure status accurately reflects the current inferred stream (or lack thereof):

- // Infer OS stream from node OSImage and set status.osImageStream. - if stream := inferOSStreamFromMachines(machines); stream != "" { - nodePool.Status.OSImageStream = hyperv1.OSImageStreamReference{ - Name: stream, - } - } + // Infer OS stream from node OSImage and set status.osImageStream. + stream := inferOSStreamFromMachines(machines) + nodePool.Status.OSImageStream = hyperv1.OSImageStreamReference{ + Name: stream, + }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/nodepool/version.go` around lines 102 - 107, The OSImageStream status field is only updated when inferOSStreamFromMachines returns a non-empty stream, causing stale data to persist when the function returns an empty string. Remove the conditional check and always set nodePool.Status.OSImageStream to ensure the status accurately reflects the current inferred stream state, whether it contains a valid stream name or is empty.

The current behavior is intentional. Preserving the last-known OS stream during transient states is the correct design choice here:

During initial provisioning: Before machines register NodeInfo, inferOSStreamFromMachines returns empty. Clearing the status would remove useful observability data that was already set from the spec-driven resolution path.

During scale-to-zero: When all machines are scaled down, there's no NodeInfo to inspect. Retaining the last observed stream is more useful for operators than showing an empty field.

The status field documents what was observed: The status.osImageStream comment says "reports the OS stream observed on the nodes" — when no observation is possible, preserving the last known value is consistent with how other Kubernetes status fields behave (e.g., status.version isn't cleared during rollouts).

An empty inferOSStreamFromMachines result means "no data available," not "the stream changed to nothing." Unconditionally setting it would introduce misleading status transitions.

AI-assisted response via Claude Code

Skipped: comment is from another GitHub bot.

enxebre · 2026-06-15T10:01:44Z

/hold

enxebre · 2026-06-15T10:12:38Z

add a new section in the automated-machine-management docs capturing the UX for the different dualstream permutations captured in the enhacement

hypershift-jira-solve-ci · 2026-06-15T10:20:59Z

Confirmed. The base branch (before this PR) did not access configGenerator.releaseImage in NewToken(). The PR inserts new code at lines 120–128 that accesses configGenerator.releaseImage.Version() — but the test's ConfigGenerator does not set the embedded *rolloutConfig, so releaseImage is nil, causing the SIGSEGV.

Now I have everything needed for the report.

Test Failure Analysis Complete

Job Information

Prow Job: test / Unit Tests (hypershift-operator)
Build ID: 27537405518 / job 81395274701
PR: #8734 — DNM: CNTRLPLANE-3612: Add dual-stream RHEL support in HCP
Test: TestNewToken/when_all_input_is_given_it_should_create_token_successfully
Package: github.com/openshift/hypershift/hypershift-operator/controllers/nodepool

Test Failure Analysis

Error

--- FAIL: TestNewToken (0.02s)
    --- FAIL: TestNewToken/when_all_input_is_given_it_should_create_token_successfully (0.02s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered, repanicked]
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x266e1a8]

goroutine 454 [running]:
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool.NewToken({0x346ab58, 0xc000c86140}, 0xc0011c2600, 0xc000dcf30e)
	/home/runner/_work/hypershift/hypershift/hypershift-operator/controllers/nodepool/token.go:122 +0x378

Summary

The PR adds OS stream resolution logic to NewToken() in token.go that calls configGenerator.releaseImage.Version() at line 122. However, the existing test TestNewToken/when_all_input_is_given_it_should_create_token_successfully constructs a ConfigGenerator struct without setting the embedded *rolloutConfig field (which contains releaseImage). Since rolloutConfig is an embedded pointer (*rolloutConfig) that defaults to nil, accessing configGenerator.releaseImage dereferences the nil pointer, causing a SIGSEGV panic.

Root Cause

The PR inserts new code at token.go:120-128 inside NewToken():

// Resolve OS stream based on spec, release version, and runc usage.
var osStream string
releaseVersion, parseErr := semver.Parse(configGenerator.releaseImage.Version())  // LINE 122

The ConfigGenerator struct embeds *rolloutConfig as a pointer:

type ConfigGenerator struct {
    client.Client
    hostedCluster         *hyperv1.HostedCluster
    nodePool              *hyperv1.NodePool
    controlplaneNamespace string
    usesRunc              bool
    *rolloutConfig  // <-- embedded pointer, nil by default
}

And rolloutConfig contains releaseImage:

type rolloutConfig struct {
    releaseImage              *releaseinfo.ReleaseImage
    // ... other fields
}

The test at token_test.go:77 constructs the ConfigGenerator directly without setting rolloutConfig:

configGenerator: &ConfigGenerator{
    hostedCluster: &hyperv1.HostedCluster{...},
    nodePool:      &hyperv1.NodePool{},
    controlplaneNamespace: controlplaneNamespace,
},

Before this PR, NewToken() never accessed configGenerator.releaseImage, so the nil rolloutConfig was harmless. The PR adds configGenerator.releaseImage.Version() which dereferences the nil embedded *rolloutConfig, triggering SIGSEGV.

The fix requires updating the test to include a rolloutConfig with a valid releaseImage, for example:

configGenerator: &ConfigGenerator{
    hostedCluster: &hyperv1.HostedCluster{...},
    nodePool:      &hyperv1.NodePool{},
    controlplaneNamespace: controlplaneNamespace,
    rolloutConfig: &rolloutConfig{
        releaseImage: &releaseinfo.ReleaseImage{
            ImageStream: &imageapi.ImageStream{
                ObjectMeta: metav1.ObjectMeta{Name: "4.17.0"},
            },
        },
    },
},

Recommendations

Fix the test: Add a rolloutConfig with a valid releaseImage to the "when all input is given it should create token successfully" test case in token_test.go:77. The releaseImage must have an ImageStream with a valid semver Name (e.g., "4.17.0" or "5.0.0") since NewToken() now calls semver.Parse(configGenerator.releaseImage.Version()).
Add a nil-guard: Consider adding a nil check for configGenerator.rolloutConfig or configGenerator.releaseImage at the start of NewToken() (alongside the existing configGenerator == nil check) to produce a clear error message instead of a panic.
Audit other tests: Check whether other TestNewToken sub-tests that construct ConfigGenerator without rolloutConfig also need updating — the error-path tests (missing pull secret, missing trust bundle, etc.) may hit the nil dereference before reaching their expected error if the new OS stream code runs first. Review test execution order to confirm.

Evidence

Evidence	Detail
Panic location	`token.go:122` — `configGenerator.releaseImage.Version()`
Signal	`SIGSEGV: segmentation violation code=0x1 addr=0x0` (nil pointer dereference)
Failing test	`TestNewToken/when_all_input_is_given_it_should_create_token_successfully` (`token_test.go:77`)
Root cause	Test constructs `ConfigGenerator` without embedded `*rolloutConfig`, leaving `releaseImage` nil
PR change	Lines added at `token.go:120-128` access `configGenerator.releaseImage.Version()` for OS stream resolution
Base branch behavior	Pre-PR `NewToken()` never accessed `configGenerator.releaseImage` — nil `rolloutConfig` was safe
Package	`github.com/openshift/hypershift/hypershift-operator/controllers/nodepool` (exit code 1, FAIL)
All other packages	Passed (34+ packages OK)

Document the UX for all dual-stream RHEL permutations including the OS stream resolution matrix, stream transitions, runc compatibility, observability via status fields and conditions, and common upgrade scenarios. Co-Authored-By: Claude Opus 4.6 <[email protected]>

hypershift-jira-solve-ci · 2026-06-15T10:27:03Z

Re: #8734 (comment)

Done. Added docs/content/how-to/automated-machine-management/dual-stream-rhel.md documenting the UX for all dual-stream permutations:

OS stream resolution matrix — every combination of spec.osImageStream.name, release version, and runc usage with the resolved outcome
Stream transitions — allowed forward upgrades (rhel-9 → rhel-10) and blocked downgrades (CEL validation)
runc compatibility — error vs. automatic fallback behavior
Observability — status.osImageStream inference and ValidMachineConfig condition messages
Common scenarios — OCP 4.x→5.0 upgrade, mixed OS pools, and RHEL 9→10 migration steps

Also added the page to the mkdocs nav under "Automated Machine Management".

AI-assisted response via Claude Code

…stTokenReconcile The dual-stream RHEL support added OS stream resolution to NewToken() which requires configGenerator.releaseImage to be set. Three TestNewToken cases (success, missing ignition endpoint, and missing CA cert) reached the new code path without a rolloutConfig, causing a nil pointer dereference on configGenerator.releaseImage.Version(). Additionally, TestTokenReconcile used ImageStream Name "4.17" which is not valid semver — semver.Parse requires Major.Minor.Patch format. Fix by adding rolloutConfig with a valid releaseImage to the affected TestNewToken cases and updating TestTokenReconcile to use "4.17.0". Co-Authored-By: Claude Opus 4.6 <[email protected]>

Co-Authored-By: Claude Opus 4.6 <[email protected]>

openshift-ci · 2026-06-15T11:03:45Z

@hypershift-jira-solve-ci[bot]: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

cblecker · 2026-06-16T14:38:38Z

/uncc

OpenShift CI Bot and others added 6 commits June 15, 2026 09:03

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 15, 2026

openshift-ci Bot added the do-not-merge/needs-area label Jun 15, 2026

openshift-ci Bot requested review from Nirshal and cblecker June 15, 2026 09:39

openshift-ci Bot added the area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release label Jun 15, 2026

openshift-ci Bot added area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels Jun 15, 2026

coderabbitai Bot reviewed Jun 15, 2026

View reviewed changes

enxebre changed the title ~~CNTRLPLANE-3612: Add dual-stream RHEL support in HCP~~ DNM: CNTRLPLANE-3612: Add dual-stream RHEL support in HCP Jun 15, 2026

openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2026

OpenShift CI Bot and others added 2 commits June 15, 2026 10:31

docs: regenerate aggregated-docs.md

677760c

Co-Authored-By: Claude Opus 4.6 <[email protected]>

openshift-ci Bot added the area/documentation Indicates the PR includes changes for documentation label Jun 15, 2026

github-actions Bot deployed to docs-preview/pr-8734 June 15, 2026 10:49 View deployment

openshift-ci Bot removed the request for review from cblecker June 16, 2026 14:38

Conversation

hypershift-jira-solve-ci Bot commented Jun 15, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Checklist:

Summary by CodeRabbit

Release Notes

Uh oh!

openshift-merge-bot Bot commented Jun 15, 2026

Uh oh!

openshift-ci-robot commented Jun 15, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

What this PR does / why we need it:

Which issue(s) this PR fixes:

Special notes for your reviewer:

Checklist:

Uh oh!

coderabbitai Bot commented Jun 15, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Suggested reviewers

❌ Failed checks (1 warning)

Uh oh!

openshift-ci Bot commented Jun 15, 2026

Uh oh!

codecov Bot commented Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hypershift-jira-solve-ci Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 15, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

hypershift-jira-solve-ci Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 15, 2026

Choose a reason for hiding this comment

Uh oh!

enxebre commented Jun 15, 2026

Uh oh!

enxebre commented Jun 15, 2026

Uh oh!

hypershift-jira-solve-ci Bot commented Jun 15, 2026 • edited by openshift-ci Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

Summary

Uh oh!

hypershift-jira-solve-ci Bot commented Jun 15, 2026

Uh oh!

openshift-ci Bot commented Jun 15, 2026

Uh oh!

cblecker commented Jun 16, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

hypershift-jira-solve-ci Bot commented Jun 15, 2026 •

edited by coderabbitai Bot

Loading

openshift-ci-robot commented Jun 15, 2026 •

edited by openshift-ci Bot

Loading

coderabbitai Bot commented Jun 15, 2026 •

edited by openshift-ci Bot

Loading

codecov Bot commented Jun 15, 2026 •

edited

Loading

coderabbitai Bot Jun 15, 2026 •

edited

Loading

coderabbitai Bot Jun 15, 2026 •

edited

Loading

hypershift-jira-solve-ci Bot commented Jun 15, 2026 •

edited by openshift-ci Bot

Loading