[release-4.22] OCPBUGS-88356: fix(cpo): deduplicate VPC endpoint subnets by AZ #8724
reedcort wants to merge 1 commit into
…SubnetsInSameZone

When a HCP cluster has multiple NodePools with subnets in the same AWS availability zone, the CPO's VPC endpoint reconciliation fails with DuplicateSubnetsInSameZone because AWS allows at most one subnet per AZ per endpoint.

Add a deduplicateSubnetsByAZ method on the reconciler that calls DescribeSubnets to resolve AZ membership, groups subnets by AZ, and picks one per AZ (lexicographically first for determinism). The subnet-to-AZ mapping is cached in an in-memory map on the reconciler to avoid redundant AWS API calls across reconcile loops. On DescribeSubnets failure the controller gracefully degrades by proceeding with the original subnet list, preserving existing behavior.

Also adds ec2:DescribeSubnets to the three CPO IAM policies that lacked it. The ROSA-managed ROSAControlPlaneOperatorPolicy requires a separate update with AWS (tracked in ROSAENG-57993).

Signed-off-by: Cortney Reed <[email protected]>
Commit-Message-Assisted-by: Claude (via Claude Code)
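The deduplication described in the commit message can be sketched roughly as follows. This is an illustrative sketch only, not the code from the PR: the `subnetDeduper` type, the `azLookup` function type (a stand-in for the real DescribeSubnets call), and `errUnknownSubnet` are hypothetical names introduced here, and the sketch additionally assumes that a subnet missing from the lookup result is handled like a lookup failure.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errUnknownSubnet is a hypothetical sentinel error used by the demo lookup.
var errUnknownSubnet = errors.New("unknown subnet")

// azLookup is a hypothetical stand-in for an EC2 DescribeSubnets call: it
// returns the availability zone for each requested subnet ID.
type azLookup func(subnetIDs []string) (map[string]string, error)

// subnetDeduper caches subnet-to-AZ results in memory so that repeated
// reconcile loops do not look up subnets that were already resolved.
type subnetDeduper struct {
	lookup  azLookup
	azCache map[string]string // subnet ID -> availability zone
}

func newSubnetDeduper(lookup azLookup) *subnetDeduper {
	return &subnetDeduper{lookup: lookup, azCache: map[string]string{}}
}

// deduplicateSubnetsByAZ keeps one subnet per AZ (the lexicographically first
// ID). If AZ membership cannot be resolved, the input is returned unchanged.
func (d *subnetDeduper) deduplicateSubnetsByAZ(subnetIDs []string) []string {
	// Only query subnets that are not in the cache yet.
	var missing []string
	for _, id := range subnetIDs {
		if _, ok := d.azCache[id]; !ok {
			missing = append(missing, id)
		}
	}
	if len(missing) > 0 {
		resolved, err := d.lookup(missing)
		if err != nil {
			return subnetIDs // degrade gracefully: keep the original list
		}
		for id, az := range resolved {
			d.azCache[id] = az
		}
	}

	// Track the smallest subnet ID seen for each AZ.
	firstByAZ := map[string]string{}
	for _, id := range subnetIDs {
		az, ok := d.azCache[id]
		if !ok {
			return subnetIDs // assumption: treat an unresolved subnet like a failure
		}
		if cur, seen := firstByAZ[az]; !seen || id < cur {
			firstByAZ[az] = id
		}
	}

	// Sort the survivors so the result does not depend on map iteration order.
	result := make([]string, 0, len(firstByAZ))
	for _, id := range firstByAZ {
		result = append(result, id)
	}
	sort.Strings(result)
	return result
}

func main() {
	azs := map[string]string{
		"subnet-a": "us-east-1a",
		"subnet-b": "us-east-1a",
		"subnet-c": "us-east-1b",
	}
	d := newSubnetDeduper(func(ids []string) (map[string]string, error) {
		out := map[string]string{}
		for _, id := range ids {
			az, ok := azs[id]
			if !ok {
				return nil, errUnknownSubnet
			}
			out[id] = az
		}
		return out, nil
	})
	// subnet-a and subnet-b share an AZ, so only subnet-a survives.
	fmt.Println(d.deduplicateSubnetsByAZ([]string{"subnet-b", "subnet-a", "subnet-c"}))
	// prints: [subnet-a subnet-c]
}
```

Picking the lexicographically first ID and sorting the result keeps the output stable between reconciles, which should avoid needless endpoint updates when the NodePool ordering changes.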
@reedcort: This pull request references Jira Issue OCPBUGS-82443, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: reedcort
/jira refresh
Codecov Report
❌ Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##           release-4.22    #8724      +/-   ##
================================================
+ Coverage         35.45%   35.49%   +0.04%
================================================
  Files               767      767
  Lines             93724    93798      +74
================================================
+ Hits              33226    33291      +65
- Misses            57785    57794       +9
  Partials           2713     2713
@reedcort: This pull request references Jira Issue OCPBUGS-88356, which is invalid. The bug has been updated to refer to the pull request using the external bug tracker.
@reedcort: This pull request references Jira Issue OCPBUGS-88356, which is valid. The bug has been moved to the POST state. 7 validation(s) were run on this bug.
/lgtm
/retest
Fix included in release 5.0.0-0.nightly-2026-06-12-141614
/test e2e-aws
@reedcort: The following tests failed, say
What this PR does / why we need it:
Backport of #8651 to release-4.22.

When a HCP cluster has multiple NodePools with subnets in the same AWS availability zone, the CPO's VPC endpoint reconciliation fails indefinitely with DuplicateSubnetsInSameZone. This PR adds AZ-aware subnet deduplication in the CPO using an in-memory cache on the reconciler.

Manually cherry-picked due to code structure differences between main and release-4.22 (endpoint logic is inline in reconcileAWSEndpointService on 4.22 vs extracted into helper functions on main).

Which issue(s) this PR fixes:
Fixes OCPBUGS-88356
Special notes for your reviewer:
Checklist: