CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding #8626
muraee wants to merge 4 commits into
@muraee: This pull request explicitly references no jira issue.
Skipping CI for Draft Pull Request.
📝 Walkthrough
This PR migrates metrics forwarding configuration from annotation-based to spec-based control. It adds MonitoringSpec, MetricsForwardingSpec, and enums (MetricsForwardingMode, MetricsSet), copies HostedCluster.Spec.Monitoring to HostedControlPlane.Spec.Monitoring with a backward-compatibility shim for the deprecated annotation, computes an effective metrics set for SRE reconciliation, updates controller predicates and gating to use the spec field, and updates unit and e2e tests to drive behavior via the new spec.
Sequence Diagram(s)
sequenceDiagram
participant User
participant HostedClusterController
participant HostedCluster
participant HostedControlPlane
participant ControlPlaneOperator
participant MetricsForwarder
User->>HostedCluster: set spec.monitoring.metricsForwarding.mode=Enabled
HostedClusterController->>HostedCluster: read Spec.Monitoring
HostedClusterController->>HostedControlPlane: copy Spec.Monitoring (apply deprecated-annotation shim if unset)
ControlPlaneOperator->>HostedControlPlane: read Spec.Monitoring
ControlPlaneOperator->>MetricsForwarder: enable/disable based on MetricsForwarding.Mode and MetricsSet
MetricsForwarder-->>ControlPlaneOperator: status
🚥 Pre-merge checks: 10 passed, 1 failed (1 warning)
@muraee: This pull request references CNTRLPLANE-3526 which is a valid jira issue.
Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.
@coderabbitai review |
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@api/hypershift/v1beta1/hosted_controlplane.go`:
- Around line 190-195: The Monitoring field in HostedControlPlane currently has
an inconsistent JSON tag; locate the Monitoring declaration (Monitoring
MonitoringSpec) and replace the tag string that contains both
"omitempty,omitzero" with the single "omitzero" form (i.e.,
json:"monitoring,omitzero") so it matches the style used by HostedClusterSpec
and other fields like AutoNode.
In
`@control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go`:
- Around line 1189-1192: The code reads
hcp.Spec.Monitoring.MetricsForwarding.MetricsSet without nil checks which can
panic; update the logic around effectiveMetricsSet (and keep r.MetricsSet
fallback) to first verify hcp.Spec.Monitoring != nil and
hcp.Spec.Monitoring.MetricsForwarding != nil before accessing MetricsSet, and
only override effectiveMetricsSet when those pointers are non-nil and MetricsSet
is non-empty (preserve existing behavior of using metrics.MetricsSet(...) when
present).
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go`:
- Around line 46-49: The predicate function dereferences
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil checks which
can panic when Monitoring or MetricsForwarding are nil; update
predicate(cpContext component.WorkloadContext) to first check cpContext.HCP,
cpContext.HCP.Spec, cpContext.HCP.Spec.Monitoring and
cpContext.HCP.Spec.Monitoring.MetricsForwarding for nil before reading Mode and
combine that guarded check with the existing DisableMonitoringServices
annotation check (hyperv1.DisableMonitoringServices) so the function returns
false (no metrics forwarding) when any of the intermediate structs are nil and
only true when Mode == hyperv1.MetricsForwardingModeEnabled and monitoring is
not disabled.
In
`@control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go`:
- Around line 69-72: The predicate function reads
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil guards; add
checks in predicate to ensure cpContext.HCP, cpContext.HCP.Spec, and
cpContext.HCP.Spec.Monitoring are non-nil (and that Monitoring.MetricsForwarding
is present) before accessing Mode, and return false (no reconcile) if any are
nil; keep the existing DisableMonitoringServices annotation check
(hyperv1.DisableMonitoringServices) and only evaluate Mode when the monitoring
structs exist to avoid nil-pointer panics.
In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2508-2511: The current fallback flips any non-Enabled mode
(including an explicit Disabled) to Enabled when the deprecated annotation
exists; change the condition to only apply the annotation fallback when the HCP
mode is not explicitly set (e.g., empty/unspecified) rather than any mode other
than Enabled. Concretely, update the check around
hcp.Spec.Monitoring.MetricsForwarding.Mode so it only sets
hyperv1.MetricsForwardingModeEnabled from the deprecated
hcluster.Annotations[hyperv1.EnableMetricsForwarding] when the existing mode is
the unset/zero value (not when it equals hyperv1.MetricsForwardingModeDisabled
or any explicit value).
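The last finding above (apply the deprecated-annotation fallback only when the mode is unset) can be sketched as follows. This is a minimal illustration, not the PR's code: the type, the constant names, and the annotation key are placeholders, the mode names follow this review comment (the later commits in this thread describe the values as Forward/None), and treating mere presence of the annotation as "enabled" is an assumption.

```go
package main

import "fmt"

// MetricsForwardingMode is a stand-in for the enum discussed in the review.
// The concrete values are assumptions for illustration.
type MetricsForwardingMode string

const (
	ModeUnset    MetricsForwardingMode = ""
	ModeEnabled  MetricsForwardingMode = "Enabled"
	ModeDisabled MetricsForwardingMode = "Disabled"
)

// enableMetricsForwardingAnnotation is a placeholder key, not the real value
// of hyperv1.EnableMetricsForwarding.
const enableMetricsForwardingAnnotation = "example.io/enable-metrics-forwarding"

// resolveMode applies the deprecated-annotation fallback only when the spec
// leaves the mode unset. An explicit value, including Disabled, always wins.
func resolveMode(specMode MetricsForwardingMode, annotations map[string]string) MetricsForwardingMode {
	if specMode != ModeUnset {
		return specMode
	}
	// Assumption: presence of the annotation is enough to enable forwarding.
	if _, ok := annotations[enableMetricsForwardingAnnotation]; ok {
		return ModeEnabled
	}
	return specMode
}

func main() {
	legacy := map[string]string{enableMetricsForwardingAnnotation: "true"}
	fmt.Println(resolveMode(ModeUnset, legacy) == ModeEnabled)     // annotation fallback applies
	fmt.Println(resolveMode(ModeDisabled, legacy) == ModeDisabled) // explicit value is preserved
	fmt.Println(resolveMode(ModeUnset, nil) == ModeUnset)          // nothing set anywhere
}
```

Keeping the fallback in one small function makes the precedence (explicit spec value first, annotation second) testable without a running controller.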
📒 Files selected for processing (11)
- api/hypershift/v1beta1/hosted_controlplane.go
- api/hypershift/v1beta1/hostedcluster_types.go
- control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
- control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
- control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
- control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go
- control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
- control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
- hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
- test/e2e/util/util_metrics_proxy.go
- test/e2e/v2/tests/hosted_cluster_metrics_test.go
996709e to 89814ff
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 951-952: The code directly reads
hcp.Spec.Monitoring.MetricsForwarding.Mode which can panic when Monitoring or
MetricsForwarding is nil; update the conditional to first nil-check
hcp.Spec.Monitoring and hcp.Spec.Monitoring.MetricsForwarding and only compare
Mode to hyperv1.MetricsForwardingModeEnabled when both are non-nil, otherwise
treat it as not enabled and call return k8sutil.DeleteAllIfNeeded(ctx, r.client,
deployment, cm, servingCA, podMonitor); ensure you reference the same symbols
(hcp, Spec, Monitoring, MetricsForwarding, Mode,
hyperv1.MetricsForwardingModeEnabled) so the branch exactly mirrors the intended
behavior.
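The guarded check this finding asks for might look like the sketch below. It assumes, as the review comment does, that Monitoring and MetricsForwarding are pointers; the struct definitions here are simplified stand-ins, not the real hyperv1 types. If the real fields are value structs (the test snippet later in this thread constructs them as values), the nil guards are unnecessary and only the Mode comparison remains.

```go
package main

import "fmt"

// Simplified stand-ins for the API types. The pointer fields are an
// assumption taken from the review comment, not from the real API.
type MetricsForwardingMode string

const ModeEnabled MetricsForwardingMode = "Enabled"

type MetricsForwardingSpec struct {
	Mode MetricsForwardingMode
}

type MonitoringSpec struct {
	MetricsForwarding *MetricsForwardingSpec
}

type HostedControlPlaneSpec struct {
	Monitoring *MonitoringSpec
}

// forwardingEnabled reports whether the forwarder should be reconciled.
// Any missing intermediate struct is treated as "not enabled", which is the
// branch where the caller would delete the forwarder resources.
func forwardingEnabled(spec HostedControlPlaneSpec) bool {
	if spec.Monitoring == nil || spec.Monitoring.MetricsForwarding == nil {
		return false
	}
	return spec.Monitoring.MetricsForwarding.Mode == ModeEnabled
}

func main() {
	enabled := HostedControlPlaneSpec{
		Monitoring: &MonitoringSpec{
			MetricsForwarding: &MetricsForwardingSpec{Mode: ModeEnabled},
		},
	}
	fmt.Println(forwardingEnabled(HostedControlPlaneSpec{})) // no Monitoring block at all
	fmt.Println(forwardingEnabled(enabled))
}
```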
🚧 Files skipped from review as they are similar to previous changes (7)
- control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
- test/e2e/v2/tests/hosted_cluster_metrics_test.go
- api/hypershift/v1beta1/hosted_controlplane.go
- control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
- test/e2e/util/util_metrics_proxy.go
- control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
- control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8626      +/-   ##
==========================================
+ Coverage   41.79%   41.83%   +0.03%
==========================================
  Files         759      759
  Lines       94037    94051      +14
==========================================
+ Hits        39304    39346      +42
+ Misses      51983    51950      -33
- Partials     2750     2755       +5
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: everettraven, muraee
/retest
/verified by e2e-test |
|
@muraee: This PR has been marked as verified by DetailsIn response to this:
/hold |
|
New changes are detected. LGTM label has been removed. |
|
@muraee: The following tests failed, say
All four jobs fail with the identical root cause. Here is the report:

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

Summary
All four failing Prow jobs (verify-deps, security, okd-scos-images, images) fail identically during the initial git merge step, before any build, test, or verification logic runs. The CI system attempted to merge PR #8626 (commit

Root Cause
The root cause is a git merge conflict, not a code or test failure. After PR #8626 was branched from

The CI system's merge attempt fails at line 97 of every build log with:

Since the merge fails, no CI step (build, test, verification) ever executes. All four jobs exit immediately with status 1. The

Recommendations

Evidence
Add MonitoringSpec and MetricsForwardingSpec types to HostedCluster and HostedControlPlane specs, replacing the annotation-based EnableMetricsForwarding mechanism with a proper API field.

The new API adds:
- monitoring.metricsForwarding.mode (Forward/None) to control metrics forwarding per cluster
- monitoring.metricsSet (Telemetry/SRE/All) to override the global METRICS_SET environment variable per cluster

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Generated by: make update

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Update all consumers to check the new spec field:
- HC controller copies Monitoring from HC to HCP with backward compat for the deprecated EnableMetricsForwarding annotation (only when mode is unset; explicit Disabled takes precedence)
- CPO resolves per-cluster MetricsSet override before SRE config loading and passes it through ControlPlaneContext
- metrics-proxy and endpoint-resolver predicates check Mode enum
- HCCO reconcileMetricsForwarder checks spec instead of annotation
- HO SRE ConfigMap sync supports per-cluster MetricsSet=SRE

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Allow the forwarded metrics set (guest-side, via metrics-proxy) to be configured independently from the MC-side ServiceMonitor/PodMonitor relabel configs. When MetricsForwardingSpec.MetricsSet is not set, it falls back to MonitoringSpec.MetricsSet (then to the global METRICS_SET env var).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
8480bf3 to f092dcd

jparrill left a comment
Dropped some comments. Thanks!
Two additional items on code that's not in the diff but is affected by this change:
enqueueHostedClustersFunc (hostedcluster_controller.go:3602): The SRE ConfigMap watch only fires when the operator-global metricsSet == SRE. With per-cluster monitoring.metricsSet override, a cluster with metricsSet: SRE won't be re-enqueued when the SRE ConfigMap changes if the operator global is Telemetry. Either broaden the condition to also check individual HostedClusters, or simplify by always watching the SRE ConfigMap (the reconciler already short-circuits when metricsSet != SRE, so the extra enqueue is cheap).
CPO reconcileSREMetricsConfig (hostedcontrolplane_controller.go:~3157): This function still uses r.MetricsSet without checking hcp.Spec.Monitoring.MetricsSet. If the operator global is Telemetry but a specific cluster has monitoring.metricsSet: SRE, the HO will sync SRE config to the control plane namespace, but the CPO will skip loading it. Should mirror the override logic from the HO, otherwise the per-cluster metricsSet feature is only half-working.
        manifests.MetricsForwarderServingCA(),
        manifests.MetricsForwarderPodMonitor(),
    },
    expectCleanup: true,
All 4 test cases have expectCleanup: true — there's no positive test for Mode=Forward verifying that resources are preserved. A regression that unconditionally deletes resources would pass all these tests.
Something like:
{
    name: "When metrics forwarding mode is Forward, it should not delete resources",
    monitoring: hyperv1.MonitoringSpec{
        MetricsForwarding: hyperv1.MetricsForwardingSpec{
            Mode: hyperv1.MetricsForwardingModeForward,
        },
    },
    existingObjects: []client.Object{...},
    expectCleanup: false,
},

    }
}

metricsSet := r.MetricsSet
This override pattern is duplicated in reconcileSREMetricsConfig (~line 5181). If the fallback logic ever changes, both sites need updating in lockstep. Consider extracting a small helper:
func (r *HostedClusterReconciler) effectiveMetricsSet(monitoring hyperv1.MonitoringSpec) metrics.MetricsSet {
if monitoring.MetricsSet != "" {
return metrics.MetricsSet(monitoring.MetricsSet)
}
return r.MetricsSet
}
| hcp.Spec.Monitoring = hcluster.Spec.Monitoring | ||
| // Backward compat: if the deprecated annotation is present and the spec mode is not explicitly set, | ||
| // enable metrics forwarding on the HCP so that downstream consumers (CPO, HCCO) use the spec field. | ||
| // An explicit Disabled mode takes precedence over the annotation. |
nit: the comment says "Disabled mode" but the enum value is MetricsForwardingModeNone ("None"). Should read "An explicit None mode takes precedence over the annotation" to match the API.
| // +required | ||
| Mode MetricsForwardingMode `json:"mode,omitempty"` | ||
|
|
||
| // metricsSet specifies which set of metrics to forward to the hosted |
Having metricsSet at two nesting levels (monitoring.metricsSet and monitoring.metricsForwarding.metricsSet) with the same field name but different scopes makes the precedence hard to discover — you need to read 3 files across HO and CPO to understand the chain: forwarding.metricsSet > monitoring.metricsSet > operator env var.
It might be worth evaluating whether renaming one of the two fields would make the intent clearer — e.g. calling the inner one forwardingMetricsSet or the outer one defaultMetricsSet. That way users can tell from the field name alone which scope each one controls, without needing to trace the resolution chain through the controller code.
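To make the resolution chain concrete, here is a minimal sketch of the precedence as described above (forwarding.metricsSet, then monitoring.metricsSet, then the operator env var). `resolveMetricsSet` and the local `MetricsSet` type are hypothetical stand-ins, not names from the PR; in the actual code the steps are split across the HO and CPO.

```go
package main

// MetricsSet is a local stand-in; an empty value means "not set".
type MetricsSet string

// resolveMetricsSet returns the first non-empty value in precedence order:
// the forwarding-level field, then the monitoring-level field, then the
// operator-wide default (assumed here to come from the METRICS_SET env var).
func resolveMetricsSet(forwarding, monitoring, operatorDefault MetricsSet) MetricsSet {
	for _, candidate := range []MetricsSet{forwarding, monitoring} {
		if candidate != "" {
			return candidate
		}
	}
	return operatorDefault
}
```

Having the chain in one function like this is also a cheap way to document the precedence if the field names stay as they are.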
| } | ||
|
|
||
| func predicate(cpContext component.WorkloadContext) (bool, error) { | ||
| func Predicate(cpContext component.WorkloadContext) (bool, error) { |
nit: now that this is exported and shared with metrics_proxy, a short godoc would help:
// Predicate returns true when metrics forwarding components should be deployed.
func Predicate(cpContext component.WorkloadContext) (bool, error) {
| name: "When only EnableMetricsForwarding is present, it should return true", | ||
| annotations: map[string]string{ | ||
| hyperv1.EnableMetricsForwarding: "true", | ||
| name: "When metrics forwarding mode is Enabled, it should return true", |
nit: test names say "Enabled" / "Disabled" but the actual enum values are Forward / None. Using the enum terminology in test names makes it easier to grep for coverage of specific values — e.g. "When metrics forwarding mode is Forward, it should return true".
|
|
||
| func adaptDeployment(cpContext component.WorkloadContext, deployment *appsv1.Deployment) error { | ||
| metricsSet := cpContext.MetricsSet | ||
| if fwdSet := cpContext.HCP.Spec.Monitoring.MetricsForwarding.MetricsSet; fwdSet != "" { |
This casts hyperv1.MetricsSet → metrics.MetricsSet — two distinct string types defined in separate packages with identical values. The CRD enum validation prevents invalid values at admission, so this is safe at runtime. But if someone adds a value to one type without the other, the cast silently produces an unrecognized value.
Worth either a sync-warning comment on one of the two type definitions, or a small unit test asserting the constant sets match.
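A sketch of what such a sync check could look like, using local stand-ins instead of the real packages: `apiMetricsSet` and `operatorMetricsSet` are hypothetical placeholders for hyperv1.MetricsSet and metrics.MetricsSet, and the value lists assume the Telemetry/SRE/All set mentioned in the PR description. A real test would list the actual constants from each package.

```go
package main

import "sort"

// Stand-ins for the two distinct string types defined in separate packages.
type apiMetricsSet string
type operatorMetricsSet string

// Assumed value sets; in a real test these would be the exported constants.
var apiValues = []apiMetricsSet{"Telemetry", "SRE", "All"}
var operatorValues = []operatorMetricsSet{"Telemetry", "SRE", "All"}

// toStrings converts a slice of any string-based type to plain strings.
func toStrings[T ~string](in []T) []string {
	out := make([]string, 0, len(in))
	for _, v := range in {
		out = append(out, string(v))
	}
	return out
}

// missingFrom returns the values present in want but absent from have, sorted.
func missingFrom(want, have []string) []string {
	seen := make(map[string]bool, len(have))
	for _, v := range have {
		seen[v] = true
	}
	var missing []string
	for _, v := range want {
		if !seen[v] {
			missing = append(missing, v)
		}
	}
	sort.Strings(missing)
	return missing
}
```

Calling `missingFrom` in both directions and asserting that each result is empty would catch a value added to one type but not the other.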
| servicePublishingStrategy: | ||
| type: Route | ||
| route: {} | ||
| expectedError: "spec.monitoring.metricsForwarding.metricsSet" |
Missing a couple of negative tests to exercise the structural constraints:
monitoring: {} should fail with minProperties: 1
monitoring: { metricsForwarding: { metricsSet: SRE } } should fail because mode is required when metricsForwarding is present
The current suite validates enum values well but doesn't cover these structural rejection cases.
Summary
spec.monitoring.metricsForwarding API on HostedCluster and HostedControlPlane, replacing the annotation-based hypershift.openshift.io/enable-metrics-forwarding mechanism
metricsSet field (Telemetry/SRE/All) that overrides the global METRICS_SET env var on the HyperShift Operator
Test plan
TestPredicate)
TestReconcileMetricsForwarder)
make verify passes (0 lint issues)
make api-lint-fix passes (0 issues)
spec.monitoring.metricsForwarding.mode: Enabled
🤖 Generated with Claude Code