CNTRLPLANE-3026: Decouple AWS AMI resolution for dual-stream support#8699
CNTRLPLANE-3026: Decouple AWS AMI resolution for dual-stream support#8699jparrill wants to merge 1 commit into
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
@jparrill: This pull request references CNTRLPLANE-3026 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThis PR adds multi-stream RHEL CoreOS support to HyperShift's NodePool AMI resolution. The ReleaseImage type gains an OSStreams field and StreamForName method to resolve stream metadata by name. Deserialization now handles both single-stream (4.x) and multi-stream (5.x) ConfigMap payloads with automatic fallback to rhel-9 when only multi-stream data is present. A new GetRHELStream helper selects between rhel-9 and rhel-10 based on OCP version and runc usage, enforcing version constraints (OCP >= 5 for rhel-10, incompatible with runc). The defaultNodePoolAMI function is refactored to accept resolved *stream.Stream instead of *ReleaseImage. AWS and Karpenter AMI resolution paths now call StreamForName before AMI computation. Provider implementations (registry client and mirror decorator) are updated to populate and propagate OSStreams. A comprehensive 5.0 fixture with both single-stream and multi-stream payloads validates deserialization and AMI selection across architectures and providers. Possibly related PRs
🚥 Pre-merge checks | ✅ 11✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jparrill The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8699 +/- ##
==========================================
+ Coverage 41.79% 41.83% +0.03%
==========================================
Files 759 759
Lines 94037 94043 +6
==========================================
+ Hits 39304 39342 +38
+ Misses 51983 51946 -37
- Partials 2750 2755 +5
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
|
/test e2e-aws |
|
/test verify |
Test Resultse2e-aws
|
b59120e to
ee50909
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
hypershift-operator/controllers/nodepool/aws.go (1)
121-125:⚠️ Potential issue | 🟠 Major | ⚡ Quick winGuard exported
ResolveAWSAMIinputs to prevent nil-pointer panics.
ResolveAWSAMIdereferenceshostedCluster,nodePool, andreleaseImagewithout validating them first. Since this helper is now exported, bad caller input can crash reconciliation instead of returning a typed error.Suggested fix
func ResolveAWSAMI(hostedCluster *hyperv1.HostedCluster, nodePool *hyperv1.NodePool, releaseImage *releaseinfo.ReleaseImage) (string, error) { + if hostedCluster == nil || hostedCluster.Spec.Platform.AWS == nil { + return "", fmt.Errorf("hostedCluster or hostedCluster.Spec.Platform.AWS is nil") + } + if nodePool == nil || nodePool.Spec.Platform.AWS == nil { + return "", fmt.Errorf("nodePool or nodePool.Spec.Platform.AWS is nil") + } + if releaseImage == nil { + return "", fmt.Errorf("release image is nil") + } + // TODO: Should the region be included in the NodePool platform information? region := hostedCluster.Spec.Platform.AWS.RegionAlso applies to: 137-140
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@hypershift-operator/controllers/nodepool/aws.go` around lines 121 - 125, ResolveAWSAMI currently dereferences hostedCluster, nodePool, and releaseImage unguarded; add nil checks at the start of ResolveAWSAMI to validate those inputs and return a clear error instead of panicking (e.g., if hostedCluster == nil || hostedCluster.Spec.Platform.AWS == nil/empty region, or nodePool == nil, or releaseImage == nil return formatted errors). Also guard accesses to nodePool.Spec.Platform.AWS and nodePool.Spec.Arch before using them and ensure any downstream uses (the code around the later AMI resolution logic referenced in lines ~137-140) similarly check for nil platform fields and return typed errors. Ensure the returned error messages include which parameter was nil to aid callers and reconciliation error handling.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@hypershift-operator/controllers/nodepool/aws.go`:
- Around line 121-125: ResolveAWSAMI currently dereferences hostedCluster,
nodePool, and releaseImage unguarded; add nil checks at the start of
ResolveAWSAMI to validate those inputs and return a clear error instead of
panicking (e.g., if hostedCluster == nil || hostedCluster.Spec.Platform.AWS ==
nil/empty region, or nodePool == nil, or releaseImage == nil return formatted
errors). Also guard accesses to nodePool.Spec.Platform.AWS and
nodePool.Spec.Arch before using them and ensure any downstream uses (the code
around the later AMI resolution logic referenced in lines ~137-140) similarly
check for nil platform fields and return typed errors. Ensure the returned error
messages include which parameter was nil to aid callers and reconciliation error
handling.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: dcad894a-d232-448c-a323-5c53e80f5f67
📒 Files selected for processing (16)
hypershift-operator/controllers/nodepool/aws.gohypershift-operator/controllers/nodepool/aws_test.gohypershift-operator/controllers/nodepool/nodepool_controller.gohypershift-operator/controllers/nodepool/nodepool_controller_test.gohypershift-operator/controllers/nodepool/stream.gohypershift-operator/controllers/nodepool/stream_test.gohypershift-operator/controllers/nodepool/token.gosupport/releaseinfo/deserialize.gosupport/releaseinfo/deserialize_test.gosupport/releaseinfo/fixtures/5.0-installer-coreos-bootimages.yamlsupport/releaseinfo/fixtures/fixtures.gosupport/releaseinfo/registry_mirror_provider.gosupport/releaseinfo/registryclient_provider.gosupport/releaseinfo/releaseinfo.gosupport/releaseinfo/releaseinfo_test.gotest/integration/osstreams/osstreams_test.go
🚧 Files skipped from review as they are similar to previous changes (10)
- support/releaseinfo/fixtures/fixtures.go
- support/releaseinfo/registry_mirror_provider.go
- hypershift-operator/controllers/nodepool/stream.go
- hypershift-operator/controllers/nodepool/token.go
- support/releaseinfo/registryclient_provider.go
- hypershift-operator/controllers/nodepool/stream_test.go
- support/releaseinfo/releaseinfo_test.go
- hypershift-operator/controllers/nodepool/nodepool_controller.go
- test/integration/osstreams/osstreams_test.go
- support/releaseinfo/deserialize.go
| "github.com/openshift/hypershift/support/releaseinfo/fixtures" | ||
| ) | ||
|
|
||
| func TestOSStreamsEndToEnd(t *testing.T) { |
There was a problem hiding this comment.
please let's don't use test/integration. It's a legacy folder.
Also let's name unit tests after the function being tested
There was a problem hiding this comment.
Agreed, removed the integration test entirely and dropped the commit. Same as #8669.
| if err != nil { | ||
| return "", fmt.Errorf("couldn't resolve stream metadata: %w", err) | ||
| } | ||
| ami, err := defaultNodePoolAMI(region, arch, streamMeta) |
There was a problem hiding this comment.
what's the value of changing this sigature? seems cosmetic
There was a problem hiding this comment.
It's not cosmetic — it's the core decoupling this PR does. Today defaultNodePoolAMI receives *releaseinfo.ReleaseImage and can only resolve from the default stream. By changing it to accept *stream.Stream, we enable the caller to pass any stream — which is what CNTRLPLANE-3553 needs when wiring spec.osImageStream into the resolution path.
Right now all callers pass StreamForName("") (backward compatible), but when the NodePool API field lands, they'll pass StreamForName(nodePool.Spec.OSImageStream.Name) and defaultNodePoolAMI doesn't need to change at all — only the caller does.
| } | ||
|
|
||
| func resolveAWSAMI(hostedCluster *hyperv1.HostedCluster, nodePool *hyperv1.NodePool, releaseImage *releaseinfo.ReleaseImage) (string, error) { | ||
| func ResolveAWSAMI(hostedCluster *hyperv1.HostedCluster, nodePool *hyperv1.NodePool, releaseImage *releaseinfo.ReleaseImage) (string, error) { |
There was a problem hiding this comment.
why are we changing the scope of these funcs? if it's to invoke from another test package that's not a good reason
There was a problem hiding this comment.
You're right — the only reason was to call it from the integration test package. Since we removed the integration test (per your feedback on #8669), there's no reason to export it anymore. Reverted to resolveAWSAMI.
5e7f425 to
b386e97
Compare
|
lgtm |
ee4628b to
079d913
Compare
Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryBoth codecov checks failed because the PR introduces new executable code paths in Root CauseThe PR refactors The unit tests for
The Recommendations
Evidence
|
| } | ||
| // Default behavior for Linux/RHCOS AMIs | ||
| ami, err := defaultNodePoolAMI(region, arch, releaseImage) | ||
| streamMeta, err := releaseImage.StreamForName("") |
There was a problem hiding this comment.
why is this value not coming from GetRHELStream?
There was a problem hiding this comment.
GetRHELStream in resolveAWSAMI — GetRHELStream is the policy layer (deciding which stream), this PR is the mechanism layer (making the code able to use any stream). Wiring GetRHELStream here means also threading releaseVersion and usesRunc through the call chain, which is exactly what CNTRLPLANE-3553 does when it adds the osImageStream API field. Pulling that wiring into this PR conflates two concerns. For now, "" gives us the default stream — same behavior as before, just through the new decoupled path.
There was a problem hiding this comment.
please add TODOs where relevant if you want to follow up in different PRs
079d913 to
783b470
Compare
| } | ||
| supported := 0 | ||
| for _, arch := range supportedArchitectures { | ||
| ami, err := defaultNodePoolAMI(region, arch, releaseImage) |
There was a problem hiding this comment.
why don't we pass the streamName to defaultNodePoolAMI and let the lookup happen inside so it doesn't rely on the caller?
There was a problem hiding this comment.
Moving the lookup inside defaultNodePoolAMI — I considered it, but it re-couples the function to ReleaseImage, which is exactly what this PR undoes. With *stream.Stream the function is pure and testable without building a full ReleaseImage.
There was a problem hiding this comment.
Thinking out loud it's a fair point. If every caller always does the same StreamForName + defaultNodePoolAMI dance, keeping them separate just spreads boilerplate. I'll fold the lookup back into defaultNodePoolAMI so it takes (region, arch, streamName, releaseImage) and does the resolution internally. That way callers just pass the stream name and the function handles the rest atomically.
d1ba395 to
f9c9326
Compare
|
/lgtm |
|
Scheduling tests matching the |
|
/hold |
Refactor defaultNodePoolAMI() to accept a streamName parameter and
resolve stream metadata via ReleaseImage.StreamForName() instead of
reading ReleaseImage.StreamMetadata directly.
For single-stream payloads (OCP < 5.0), StreamForName("") falls back
to StreamMetadata. For multi-stream payloads (>= 5.0), it looks up
the named stream in OSStreams.
Callers pass "" for now (backward compatible). When CNTRLPLANE-3553
wires spec.osImageStream, only the "" argument changes to the resolved
stream name.
Signed-off-by: Juan Manuel Parrilla Madrid <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
f9c9326 to
b228747
Compare
|
New changes are detected. LGTM label has been removed. |
|
/hold cancel Commit msg + PR desc updated |
|
/verified by unit tests |
|
@jparrill: This PR has been marked as verified by DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
What this PR does / why we need it
Phase 1 step 3 (AWS) of the Dual-Stream RHEL 9/10 NodePool enhancement (OCPSTRAT-3014). Decouples
defaultNodePoolAMI()fromReleaseImage.StreamMetadataso it resolves stream metadata viaStreamForName(), enabling stream-aware boot image resolution.Changes:
defaultNodePoolAMI()— accepts astreamNameparameter, resolves metadata viaReleaseImage.StreamForName(streamName)instead of readingStreamMetadatadirectlyresolveAWSAMI()— passes""as streamName (backward compatible)setAWSConditions()— passes""as streamName (backward compatible)setKarpenterAMILabels()— passes""as streamName (backward compatible)StreamForName("")returnsStreamMetadata(the legacy default). For multi-stream payloads (>= 5.0), callers will pass a resolved stream name to look up the correct entry inOSStreams. When CNTRLPLANE-3553 wires the NodePool API field, only the""arguments change —defaultNodePoolAMIitself needs no modification.This establishes the pattern for other platform resolvers (GCP, Azure, KubeVirt) in CNTRLPLANE-3027.
Which issue(s) this PR fixes
Fixes CNTRLPLANE-3026
Special notes for your reviewer
defaultNodePoolAMIsignature change is not cosmetic — it decouples AMI resolution fromReleaseImage.StreamMetadataso callers can resolve any stream viaStreamForName(). For single-stream payloads (OCP < 5.0),StreamForName("")falls back toStreamMetadata. For multi-stream payloads (>= 5.0), it looks up the named stream inOSStreams.resolveAWSAMIstays unexported — no external callers need it.Checklist
Test plan
🤖 Generated with Claude Code