CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding #8626

Open
muraee wants to merge 4 commits into openshift:main from muraee:metrics-forwarding-api

Conversation

@muraee

@muraee muraee commented May 28, 2026

Contributor

Summary

  • Introduces spec.monitoring.metricsForwarding API on HostedCluster and HostedControlPlane, replacing the annotation-based hypershift.openshift.io/enable-metrics-forwarding mechanism
  • Adds per-cluster metricsSet field (Telemetry/SRE/All) that overrides the global METRICS_SET env var on the HyperShift Operator
  • Updates all consumers (CPO predicates, HCCO, HO SRE config sync) to use the new spec field
  • Maintains backward compatibility: the deprecated annotation is honored when the spec field is not set
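
The override described in the second bullet can be sketched as a small resolution function. This is only an illustration under stated assumptions: the constant names and the `effectiveMetricsSet` helper are hypothetical, and the global value is passed in as a parameter rather than read from the `METRICS_SET` env var.

```go
package main

import "fmt"

// MetricsSet mirrors the three values named in the summary.
// The constant names below are assumptions made for this sketch.
type MetricsSet string

const (
	MetricsSetTelemetry MetricsSet = "Telemetry"
	MetricsSetSRE       MetricsSet = "SRE"
	MetricsSetAll       MetricsSet = "All"
)

// effectiveMetricsSet (hypothetical helper) returns the per-cluster value
// when it is set and otherwise falls back to the operator-wide default.
func effectiveMetricsSet(global, perCluster MetricsSet) MetricsSet {
	if perCluster != "" {
		return perCluster
	}
	return global
}

func main() {
	// No per-cluster value: the global default applies.
	fmt.Println(effectiveMetricsSet(MetricsSetTelemetry, ""))
	// Per-cluster value set: it overrides the global default.
	fmt.Println(effectiveMetricsSet(MetricsSetTelemetry, MetricsSetSRE))
}
```

An empty per-cluster value is treated as "not set" here, which keeps existing clusters on the operator-wide default.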

Test plan

  • Unit tests pass for endpoint-resolver predicate (TestPredicate)
  • Unit tests pass for HCCO metrics forwarder (TestReconcileMetricsForwarder)
  • make verify passes (0 lint issues)
  • make api-lint-fix passes (0 issues)
  • Envtest validation YAML added for monitoring field enum validation
  • E2E: verify metrics forwarding works with spec.monitoring.metricsForwarding.mode: Enabled
  • E2E: verify backward compat with annotation still works

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added structured monitoring configuration on HostedCluster and HostedControlPlane to configure metrics forwarding (Enabled/Disabled) and to select metrics set (Telemetry, SRE, All). Deprecated metrics-forwarding annotation is preserved for backward compatibility when the spec is unset.
  • Bug Fixes

    • Controllers and components now consistently honor spec-driven monitoring settings for enabling/disabling forwarding and SRE metrics selection.
  • Tests

    • Updated unit and e2e tests and helpers to exercise the new spec-driven monitoring behavior.

@openshift-merge-bot

Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after the lgtm label is added, depending on the repository configuration. The pipeline controller automatically detects which contexts are required and uses /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 28, 2026
@openshift-ci-robot

@muraee: This pull request explicitly references no jira issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2026
@openshift-ci

openshift-ci Bot commented May 28, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@coderabbitai

coderabbitai Bot commented May 28, 2026

Contributor

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Walkthrough

This PR migrates metrics forwarding configuration from annotation-based to spec-based control. It adds MonitoringSpec, MetricsForwardingSpec, and enums (MetricsForwardingMode, MetricsSet), copies HostedCluster.Spec.Monitoring to HostedControlPlane.Spec.Monitoring with a backward-compatibility shim for the deprecated annotation, computes an effective metrics set for SRE reconciliation, updates controller predicates and gating to use the spec field, and updates unit and e2e tests to drive behavior via the new spec.
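
As a rough picture of the API shape the walkthrough describes, the sketch below models the named types and the HostedCluster to HostedControlPlane copy. The type names, enum values, and JSON paths come from this PR's description; the exact struct tags, the trimmed `clusterSpec`/`controlPlaneSpec` stand-ins, and both helper functions are assumptions made for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type MetricsForwardingMode string
type MetricsSet string

const (
	MetricsForwardingModeEnabled  MetricsForwardingMode = "Enabled"
	MetricsForwardingModeDisabled MetricsForwardingMode = "Disabled"
)

// MetricsForwardingSpec and MonitoringSpec follow the names in the
// walkthrough; the field tags are assumed, not copied from the PR.
type MetricsForwardingSpec struct {
	Mode       MetricsForwardingMode `json:"mode,omitempty"`
	MetricsSet MetricsSet            `json:"metricsSet,omitempty"`
}

type MonitoringSpec struct {
	MetricsForwarding MetricsForwardingSpec `json:"metricsForwarding"`
}

// Trimmed, hypothetical stand-ins for HostedClusterSpec and
// HostedControlPlaneSpec that carry only the monitoring field.
type clusterSpec struct {
	Monitoring MonitoringSpec `json:"monitoring"`
}

type controlPlaneSpec struct {
	Monitoring MonitoringSpec `json:"monitoring"`
}

// parseClusterSpec decodes a JSON rendering of the spec.
func parseClusterSpec(raw string) (clusterSpec, error) {
	var s clusterSpec
	err := json.Unmarshal([]byte(raw), &s)
	return s, err
}

// copyMonitoring mirrors the "copy Spec.Monitoring" step; the
// deprecated-annotation shim is intentionally left out here.
func copyMonitoring(hc clusterSpec) controlPlaneSpec {
	return controlPlaneSpec{Monitoring: hc.Monitoring}
}

func main() {
	hc, err := parseClusterSpec(`{"monitoring":{"metricsForwarding":{"mode":"Enabled","metricsSet":"SRE"}}}`)
	if err != nil {
		panic(err)
	}
	hcp := copyMonitoring(hc)
	fmt.Println(hcp.Monitoring.MetricsForwarding.Mode, hcp.Monitoring.MetricsForwarding.MetricsSet)
}
```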

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant HostedClusterController
  participant HostedCluster
  participant HostedControlPlane
  participant ControlPlaneOperator
  participant MetricsForwarder

  User->>HostedCluster: set spec.monitoring.metricsForwarding.mode=Enabled
  HostedClusterController->>HostedCluster: read Spec.Monitoring
  HostedClusterController->>HostedControlPlane: copy Spec.Monitoring (apply deprecated-annotation shim if unset)
  ControlPlaneOperator->>HostedControlPlane: read Spec.Monitoring
  ControlPlaneOperator->>MetricsForwarder: enable/disable based on MetricsForwarding.Mode and MetricsSet
  MetricsForwarder-->>ControlPlaneOperator: status

Suggested reviewers

  • jparrill
  • sdminonne
🚥 Pre-merge checks | ✅ 10 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Test Structure And Quality | ⚠️ Warning | Unit tests lack meaningful failure messages on assertions: TestPredicate, TestReconcileMetricsForwarder, and TestReconcileHostedControlPlaneMonitoring are missing diagnostic context. | Add descriptive failure messages to Expect() assertions, e.g. `Expect(err).NotTo(HaveOccurred(), "should enable metrics forwarding on HCP")`, to help diagnose failures. |
✅ Passed checks (10 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding a new spec.monitoring API field to enable metrics forwarding configuration. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names are stable and deterministic. No dynamic content found in test case names (no fmt.Sprintf, string concatenation, UUIDs, timestamps, or generated identifiers detected). |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR only adds API schema and controller logic for metrics forwarding; no deployment manifests, affinity rules, topology constraints, or topology-unaware replica logic. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | New e2e tests contain no hardcoded IPv4 addresses, IPv4-specific parsing, or external connectivity requirements. All connections use DNS service names and cluster-internal services. |
| No-Weak-Crypto | ✅ Passed | No weak cryptography usage found in PR: no MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB mode, custom crypto implementations, or non-constant-time secret comparisons detected in modified code. |
| Container-Privileges | ✅ Passed | PR adds monitoring API fields; no privileged container settings (privileged: true, hostPID, hostNetwork, hostIPC, SYS_ADMIN, allowPrivilegeEscalation: true, runAsUser: 0) introduced. |
| No-Sensitive-Data-In-Logs | ✅ Passed | PR adds monitoring API fields (Enabled/Disabled enums and metrics sets) with no logging of spec values; code reads these fields for feature gating but never logs them. |

Comment @coderabbitai help to get the list of available commands and usage tips.

@muraee muraee marked this pull request as ready for review May 28, 2026 14:53
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 28, 2026
@muraee muraee changed the title from "NO-JIRA: Add spec.monitoring API for metrics forwarding" to "CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding" May 28, 2026
@openshift-ci-robot

openshift-ci-robot commented May 28, 2026

@muraee: This pull request references CNTRLPLANE-3526 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added area/api Indicates the PR includes changes for the API area/cli Indicates the PR includes changes for CLI area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/documentation Indicates the PR includes changes for documentation area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/testing Indicates the PR includes changes for e2e testing and removed do-not-merge/needs-area labels May 28, 2026
@openshift-ci openshift-ci Bot requested review from csrwng and jparrill May 28, 2026 14:53
@muraee

muraee commented May 28, 2026

Contributor Author

@coderabbitai review

@coderabbitai

coderabbitai Bot commented May 28, 2026

Contributor
✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8626 May 28, 2026 14:58 Inactive

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@api/hypershift/v1beta1/hosted_controlplane.go`:
- Around line 190-195: The Monitoring field in HostedControlPlane currently has
an inconsistent JSON tag; locate the Monitoring declaration (Monitoring
MonitoringSpec) and replace the tag string that contains both
"omitempty,omitzero" with the single "omitzero" form (i.e.,
json:"monitoring,omitzero") so it matches the style used by HostedClusterSpec
and other fields like AutoNode.
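
For context on why the single `omitzero` form is enough, here is a small sketch of how the two tag options behave on a struct-valued field with the standard `encoding/json` package. It assumes Go 1.24 or newer, where `omitzero` was introduced; the `MonitoringSpec` here is a flattened stand-in, and `specOmitEmpty`, `specOmitZero`, and `marshal` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MonitoringSpec is a simplified stand-in for the real type.
type MonitoringSpec struct {
	Mode string `json:"mode,omitempty"`
}

// specOmitEmpty uses omitempty, which never omits a struct value.
type specOmitEmpty struct {
	Monitoring MonitoringSpec `json:"monitoring,omitempty"`
}

// specOmitZero uses omitzero, which omits the field when it is the zero value.
type specOmitZero struct {
	Monitoring MonitoringSpec `json:"monitoring,omitzero"`
}

// marshal is a small helper that panics on encoding errors.
func marshal(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(marshal(specOmitEmpty{})) // {"monitoring":{}}
	fmt.Println(marshal(specOmitZero{}))  // {} on Go 1.24+
}
```

On a struct-typed field `omitempty` has no effect, so combining it with `omitzero` adds nothing beyond what `omitzero` already does.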

In
`@control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go`:
- Around line 1189-1192: The code reads
hcp.Spec.Monitoring.MetricsForwarding.MetricsSet without nil checks which can
panic; update the logic around effectiveMetricsSet (and keep r.MetricsSet
fallback) to first verify hcp.Spec.Monitoring != nil and
hcp.Spec.Monitoring.MetricsForwarding != nil before accessing MetricsSet, and
only override effectiveMetricsSet when those pointers are non-nil and MetricsSet
is non-empty (preserve existing behavior of using metrics.MetricsSet(...) when
present).

In
`@control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go`:
- Around line 46-49: The predicate function dereferences
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil checks which
can panic when Monitoring or MetricsForwarding are nil; update
predicate(cpContext component.WorkloadContext) to first check cpContext.HCP,
cpContext.HCP.Spec, cpContext.HCP.Spec.Monitoring and
cpContext.HCP.Spec.Monitoring.MetricsForwarding for nil before reading Mode and
combine that guarded check with the existing DisableMonitoringServices
annotation check (hyperv1.DisableMonitoringServices) so the function returns
false (no metrics forwarding) when any of the intermediate structs are nil and
only true when Mode == hyperv1.MetricsForwardingModeEnabled and monitoring is
not disabled.
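
A sketch of the guarded predicate this comment describes. It assumes `Monitoring` and `MetricsForwarding` are pointer fields, which is what the comment presumes; the declaration quoted earlier in this review (`Monitoring MonitoringSpec`) suggests a value type, in which case the nil guards would be unnecessary. All types are trimmed stand-ins, and the annotation key is a placeholder rather than the real value of `hyperv1.DisableMonitoringServices`.

```go
package main

import "fmt"

const (
	MetricsForwardingModeEnabled = "Enabled"
	// Placeholder key for this sketch only; not the real annotation name.
	disableMonitoringServices = "example.invalid/disable-monitoring-services"
)

// Trimmed stand-ins; pointer fields are an assumption of this sketch.
type MetricsForwardingSpec struct{ Mode string }
type MonitoringSpec struct{ MetricsForwarding *MetricsForwardingSpec }
type HostedControlPlaneSpec struct{ Monitoring *MonitoringSpec }
type HostedControlPlane struct {
	Annotations map[string]string
	Spec        HostedControlPlaneSpec
}

// predicate reports whether metrics forwarding should be reconciled.
// It returns false when any intermediate pointer is nil or when
// monitoring services are disabled through the annotation.
func predicate(hcp *HostedControlPlane) bool {
	if hcp == nil || hcp.Spec.Monitoring == nil || hcp.Spec.Monitoring.MetricsForwarding == nil {
		return false
	}
	// Reading from a nil map is safe in Go, so no extra guard is needed.
	if _, disabled := hcp.Annotations[disableMonitoringServices]; disabled {
		return false
	}
	return hcp.Spec.Monitoring.MetricsForwarding.Mode == MetricsForwardingModeEnabled
}

func main() {
	enabled := &HostedControlPlane{Spec: HostedControlPlaneSpec{
		Monitoring: &MonitoringSpec{MetricsForwarding: &MetricsForwardingSpec{Mode: MetricsForwardingModeEnabled}},
	}}
	fmt.Println(predicate(nil))                   // false
	fmt.Println(predicate(&HostedControlPlane{})) // false
	fmt.Println(predicate(enabled))               // true
}
```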

In
`@control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go`:
- Around line 69-72: The predicate function reads
cpContext.HCP.Spec.Monitoring.MetricsForwarding.Mode without nil guards; add
checks in predicate to ensure cpContext.HCP, cpContext.HCP.Spec, and
cpContext.HCP.Spec.Monitoring are non-nil (and that Monitoring.MetricsForwarding
is present) before accessing Mode, and return false (no reconcile) if any are
nil; keep the existing DisableMonitoringServices annotation check
(hyperv1.DisableMonitoringServices) and only evaluate Mode when the monitoring
structs exist to avoid nil-pointer panics.

In `@hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go`:
- Around line 2508-2511: The current fallback flips any non-Enabled mode
(including an explicit Disabled) to Enabled when the deprecated annotation
exists; change the condition to only apply the annotation fallback when the HCP
mode is not explicitly set (e.g., empty/unspecified) rather than any mode other
than Enabled. Concretely, update the check around
hcp.Spec.Monitoring.MetricsForwarding.Mode so it only sets
hyperv1.MetricsForwardingModeEnabled from the deprecated
hcluster.Annotations[hyperv1.EnableMetricsForwarding] when the existing mode is
the unset/zero value (not when it equals hyperv1.MetricsForwardingModeDisabled
or any explicit value).
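
The intended fallback can be sketched as follows. This is a minimal illustration, not the controller code: `resolveMode` is a hypothetical helper, the annotation key is the one named in the PR summary, and treating the mere presence of the annotation as "enabled" is an assumption.

```go
package main

import "fmt"

type MetricsForwardingMode string

const (
	MetricsForwardingModeEnabled  MetricsForwardingMode = "Enabled"
	MetricsForwardingModeDisabled MetricsForwardingMode = "Disabled"
)

// Deprecated annotation key, as named in the PR summary.
const enableMetricsForwardingAnnotation = "hypershift.openshift.io/enable-metrics-forwarding"

// resolveMode (hypothetical helper) applies the annotation fallback only
// when the spec leaves the mode unset. An explicit Disabled is preserved.
func resolveMode(specMode MetricsForwardingMode, annotations map[string]string) MetricsForwardingMode {
	if specMode != "" {
		return specMode
	}
	// Assumption: presence of the annotation is enough to enable forwarding.
	if _, ok := annotations[enableMetricsForwardingAnnotation]; ok {
		return MetricsForwardingModeEnabled
	}
	return specMode
}

func main() {
	legacy := map[string]string{enableMetricsForwardingAnnotation: "true"}
	fmt.Println(resolveMode(MetricsForwardingModeDisabled, legacy)) // Disabled
	fmt.Println(resolveMode("", legacy))                            // Enabled
}
```

Checking for the zero value rather than "anything other than Enabled" is what keeps an explicit Disabled from being flipped by a leftover annotation.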
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9bad19ca-0e30-40f1-88ee-d8aca6240665

📥 Commits

Reviewing files that changed from the base of the PR and between e8aa9bb and 996709e.

⛔ Files ignored due to path filters (44)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/hostedclusterspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/hostedcontrolplanespec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/metricsforwardingspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/monitoringspec.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.monitoring.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (11)
  • api/hypershift/v1beta1/hosted_controlplane.go
  • api/hypershift/v1beta1/hostedcluster_types.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go

Comment thread api/hypershift/v1beta1/hosted_controlplane.go
Comment thread hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go Outdated
Comment thread api/hypershift/v1beta1/hostedcluster_types.go
@muraee muraee force-pushed the metrics-forwarding-api branch from 996709e to 89814ff Compare May 28, 2026 15:15
@github-actions github-actions Bot temporarily deployed to docs-preview/pr-8626 May 28, 2026 15:21 Inactive

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go`:
- Around line 951-952: The code directly reads
hcp.Spec.Monitoring.MetricsForwarding.Mode which can panic when Monitoring or
MetricsForwarding is nil; update the conditional to first nil-check
hcp.Spec.Monitoring and hcp.Spec.Monitoring.MetricsForwarding and only compare
Mode to hyperv1.MetricsForwardingModeEnabled when both are non-nil, otherwise
treat it as not enabled and call return k8sutil.DeleteAllIfNeeded(ctx, r.client,
deployment, cm, servingCA, podMonitor); ensure you reference the same symbols
(hcp, Spec, Monitoring, MetricsForwarding, Mode,
hyperv1.MetricsForwardingModeEnabled) so the branch exactly mirrors the intended
behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9bc66ee1-87cc-4ca1-8f51-fda6d9f9d236

📥 Commits

Reviewing files that changed from the base of the PR and between 996709e and 89814ff.

⛔ Files ignored due to path filters (44)
  • api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !**/zz_generated*.go, !**/zz_generated*
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedclusters.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterUpdateAcceptRisks.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ClusterVersionOperatorConfiguration.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDC.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUIDAndExtraClaimMappings.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ExternalOIDCWithUpstreamParity.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HCPEtcdBackup.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/HyperShiftOnlyDynamicResourceAllocation.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/ImageStreamImportMode.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/KMSEncryptionProvider.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/hostedcontrolplanes.hypershift.openshift.io/TLSAdherence.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • client/applyconfiguration/hypershift/v1beta1/hostedclusterspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/hostedcontrolplanespec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/metricsforwardingspec.go is excluded by !client/**
  • client/applyconfiguration/hypershift/v1beta1/monitoringspec.go is excluded by !client/**
  • client/applyconfiguration/utils.go is excluded by !client/**
  • cmd/install/assets/crds/hypershift-operator/tests/hostedclusters.hypershift.openshift.io/stable.hostedclusters.monitoring.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedclusters-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/hostedcontrolplanes-Hypershift-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hosted_controlplane.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/hostedcluster_types.go is excluded by !vendor/**, !**/vendor/**
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/zz_generated.deepcopy.go is excluded by !vendor/**, !**/vendor/**, !**/zz_generated*.go, !**/zz_generated*
📒 Files selected for processing (11)
  • api/hypershift/v1beta1/hosted_controlplane.go
  • api/hypershift/v1beta1/hostedcluster_types.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/metrics_proxy/component.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go
🚧 Files skipped from review as they are similar to previous changes (7)
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component.go
  • test/e2e/v2/tests/hosted_cluster_metrics_test.go
  • api/hypershift/v1beta1/hosted_controlplane.go
  • control-plane-operator/controllers/hostedcontrolplane/hostedcontrolplane_controller.go
  • test/e2e/util/util_metrics_proxy.go
  • control-plane-operator/controllers/hostedcontrolplane/v2/endpoint_resolver/component_test.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources_test.go

@codecov

codecov Bot commented May 28, 2026

Codecov Report

❌ Patch coverage is 80.76923% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 41.83%. Comparing base (7507291) to head (f092dcd).

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...trollers/hostedcluster/hostedcluster_controller.go | 83.33% | 2 Missing and 1 partial ⚠️ |
| ...stedcontrolplane/v2/endpoint_resolver/component.go | 66.66% | 1 Missing ⚠️ |
| ...s/hostedcontrolplane/v2/metrics_proxy/component.go | 0.00% | 1 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8626      +/-   ##
==========================================
+ Coverage   41.79%   41.83%   +0.03%     
==========================================
  Files         759      759              
  Lines       94037    94051      +14     
==========================================
+ Hits        39304    39346      +42     
+ Misses      51983    51950      -33     
- Partials     2750     2755       +5     

| Files with missing lines | Coverage Δ |
| --- | --- |
| .../hostedcontrolplane/v2/metrics_proxy/deployment.go | 86.02% <100.00%> (+13.84%) ⬆️ |
| ...rconfigoperator/controllers/resources/resources.go | 56.70% <100.00%> (ø) |
| ...stedcontrolplane/v2/endpoint_resolver/component.go | 12.00% <66.66%> (-3.39%) ⬇️ |
| ...s/hostedcontrolplane/v2/metrics_proxy/component.go | 0.00% <0.00%> (ø) |
| ...trollers/hostedcluster/hostedcluster_controller.go | 46.24% <83.33%> (+0.35%) ⬆️ |

| Flag | Coverage Δ |
| --- | --- |
| cmd-support | 35.11% <ø> (ø) |
| cpo-hostedcontrolplane | 44.23% <71.42%> (+0.12%) ⬆️ |
| cpo-other | 43.45% <100.00%> (ø) |
| hypershift-operator | 51.93% <83.33%> (+0.05%) ⬆️ |
| other | 31.56% <ø> (ø) |

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Jun 11, 2026
@openshift-merge-bot

Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-v2-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws
/test e2e-v2-gke

@openshift-ci

openshift-ci Bot commented Jun 11, 2026

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: everettraven, muraee

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 11, 2026
@muraee

muraee commented Jun 12, 2026

Contributor Author

/retest

@muraee

muraee commented Jun 12, 2026

Contributor Author

/retest

@muraee

muraee commented Jun 15, 2026

Contributor Author

/retest

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aks | Build: 2066461913847435264 | Cost: $4.6096885 | Failed step: hypershift-azure-run-e2e

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@muraee

muraee commented Jun 15, 2026

Contributor Author

/verified by e2e-test

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Jun 15, 2026
@openshift-ci-robot

@muraee: This PR has been marked as verified by e2e-test.

Details

In response to this:

/verified by e2e-test

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-merge-bot

Contributor

/retest-required

Remaining retests: 0 against base HEAD 712ba58 and 2 for PR HEAD db1b535 in total

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-aws | Build: 2066512325115908096 | Cost: $1.87851825 | Failed step: hypershift-aws-run-e2e-nested

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@openshift-merge-bot

Contributor

/retest-required

Remaining retests: 0 against base HEAD 21c7aa8 and 1 for PR HEAD db1b535 in total

@muraee

muraee commented Jun 15, 2026

Contributor Author

/hold

@openshift-ci openshift-ci Bot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 15, 2026
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Jun 16, 2026
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 16, 2026
@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Contributor

New changes are detected. LGTM label has been removed.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 16, 2026
@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Contributor

@muraee: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/prow/e2e-aks-4-22 | db1b535 | link | true | /test e2e-aks-4-22 |
| ci/prow/e2e-aws-4-22 | db1b535 | link | true | /test e2e-aws-4-22 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented Jun 16, 2026

All four jobs fail with the identical root cause. Here is the report:

Test Failure Analysis Complete

Job Information

  • Prow Job: pull-ci-openshift-hypershift-main-verify-deps, pull-ci-openshift-hypershift-main-security, pull-ci-openshift-hypershift-main-okd-scos-images, pull-ci-openshift-hypershift-main-images
  • Build IDs: 2066923637239189504, 2066923637214023680, 2066923637180469248, 2066923637130137600
  • PR: #8626 (CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding)
  • All 4 jobs fail with the same root cause

Test Failure Analysis

Error

CONFLICT (content): Merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
Automatic merge failed; fix conflicts and then commit the result.
# Error: exit status 1

Summary

All four failing Prow jobs (verify-deps, security, okd-scos-images, images) fail identically during the initial git merge step — before any build, test, or verification logic runs. The CI system attempted to merge PR #8626 (commit 8480bf313a) onto the current main branch (commit 392fd5a13, from the merge of PR #8649 by yiraeChristineKim/ACM-34234), and git could not auto-resolve a content conflict in hostedcluster_controller_test.go. The tide error is a downstream consequence — Tide cannot merge the PR because the required CI checks are failing. No product bug or test flake is involved.

Root Cause

The root cause is a git merge conflict, not a code or test failure. After PR #8626 was branched from main, PR #8649 (ACM-34234 by yiraeChristineKim) was merged into main. Both PRs modified the same file — hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go — in incompatible ways that git cannot auto-resolve.

The CI system's merge attempt fails at line 97 of every build log with:

CONFLICT (content): Merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go
Automatic merge failed; fix conflicts and then commit the result.

Since the merge fails, no CI step (build, test, verification) ever executes. All four jobs exit immediately with status 1. The tide error state is simply Tide reporting that required status checks have not passed, which prevents automatic merge.

Recommendations

  1. Rebase PR #8626 (CNTRLPLANE-3526: Add spec.monitoring API for metrics forwarding) onto the latest main branch and resolve the merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go manually.
  2. Review the changes from PR #8649 (ACM-34234: build(cli): rename hcp archives to include OS and arch in filename) to understand what was modified in the test file, then integrate those changes with the PR #8626 modifications.
  3. Push the rebased branch — all four CI jobs will automatically re-trigger and should pass once the conflict is resolved.
  4. No code fix is needed in the product — this is purely a branch synchronization issue.

Evidence

| Evidence | Detail |
| --- | --- |
| Conflicting file | hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go |
| PR commit | 8480bf313a06747bd4dea70c6ac7082fff6235b0 |
| Main branch HEAD | 392fd5a1304f98bdfe5edf10dd2161986b6d5e9f (merge of PR #8649, ACM-34234) |
| Conflict | PR #8649 by yiraeChristineKim (ACM-34234), merged into main after #8626 was branched |
| verify-deps log (line 97) | CONFLICT (content): Merge conflict in hypershift-operator/controllers/hostedcluster/hostedcluster_controller_test.go |
| security log (line 97) | Same conflict, identical error |
| okd-scos-images log (line 97) | Same conflict, identical error |
| images log (line 97) | Same conflict, identical error |
| Failure phase | Git merge (pre-build); no CI step ever executed |
| tide error | Downstream consequence: required checks not passing blocks merge |

muraee and others added 4 commits June 17, 2026 12:39
Add MonitoringSpec and MetricsForwardingSpec types to HostedCluster
and HostedControlPlane specs, replacing the annotation-based
EnableMetricsForwarding mechanism with a proper API field.

The new API adds:
- monitoring.metricsForwarding.mode (Forward/None) to control
  metrics forwarding per cluster
- monitoring.metricsSet (Telemetry/SRE/All) to override the global
  METRICS_SET environment variable per cluster

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Generated by: make update

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Update all consumers to check the new spec field:

- HC controller copies Monitoring from HC to HCP with backward compat
  for the deprecated EnableMetricsForwarding annotation (only when
  mode is unset; explicit Disabled takes precedence)
- CPO resolves per-cluster MetricsSet override before SRE config
  loading and passes it through ControlPlaneContext
- metrics-proxy and endpoint-resolver predicates check Mode enum
- HCCO reconcileMetricsForwarder checks spec instead of annotation
- HO SRE ConfigMap sync supports per-cluster MetricsSet=SRE

Signed-off-by: Mulham Raee <[email protected]>
Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>

Allow the forwarded metrics set (guest-side, via metrics-proxy) to be
configured independently from the MC-side ServiceMonitor/PodMonitor
relabel configs. When MetricsForwardingSpec.MetricsSet is not set, it
falls back to MonitoringSpec.MetricsSet (then to the global METRICS_SET
env var).

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
@muraee muraee force-pushed the metrics-forwarding-api branch from 8480bf3 to f092dcd Compare June 17, 2026 10:44
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 17, 2026

@jparrill jparrill left a comment

Contributor

Dropped some comments. Thanks!

Two additional items on code that's not in the diff but is affected by this change:

enqueueHostedClustersFunc (hostedcluster_controller.go:3602): The SRE ConfigMap watch only fires when the operator-global metricsSet == SRE. With per-cluster monitoring.metricsSet override, a cluster with metricsSet: SRE won't be re-enqueued when the SRE ConfigMap changes if the operator global is Telemetry. Either broaden the condition to also check individual HostedClusters, or simplify by always watching the SRE ConfigMap (the reconciler already short-circuits when metricsSet != SRE, so the extra enqueue is cheap).

CPO reconcileSREMetricsConfig (hostedcontrolplane_controller.go:~3157): This function still uses r.MetricsSet without checking hcp.Spec.Monitoring.MetricsSet. If the operator global is Telemetry but a specific cluster has monitoring.metricsSet: SRE, the HO will sync SRE config to the control plane namespace, but the CPO will skip loading it. Should mirror the override logic from the HO, otherwise the per-cluster metricsSet feature is only half-working.
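
The first item above (broadening the SRE ConfigMap watch so it considers per-cluster overrides) could look roughly like the sketch below. It uses simplified stand-in types rather than the real HyperShift ones; `Cluster`, `clusterNeedsSREConfig`, and `clustersToEnqueue` are hypothetical names chosen for illustration, and the real `enqueueHostedClustersFunc` may be structured differently.

```go
package main

import "fmt"

// MetricsSet is a simplified stand-in for the operator's metrics set type.
type MetricsSet string

const (
	Telemetry MetricsSet = "Telemetry"
	SRE       MetricsSet = "SRE"
	All       MetricsSet = "All"
)

// Cluster is a hypothetical, trimmed-down view of a HostedCluster.
type Cluster struct {
	Name       string
	MetricsSet MetricsSet // per-cluster override; empty means "use the operator global"
}

// clusterNeedsSREConfig reports whether a cluster consumes the SRE ConfigMap,
// checking the per-cluster override before falling back to the operator global.
func clusterNeedsSREConfig(global MetricsSet, c Cluster) bool {
	effective := global
	if c.MetricsSet != "" {
		effective = c.MetricsSet
	}
	return effective == SRE
}

// clustersToEnqueue returns the names of the clusters that should be
// reconciled when the SRE ConfigMap changes.
func clustersToEnqueue(global MetricsSet, clusters []Cluster) []string {
	var names []string
	for _, c := range clusters {
		if clusterNeedsSREConfig(global, c) {
			names = append(names, c.Name)
		}
	}
	return names
}

func main() {
	clusters := []Cluster{
		{Name: "a"},                  // inherits the global value
		{Name: "b", MetricsSet: SRE}, // per-cluster override
		{Name: "c", MetricsSet: All},
	}
	// With a Telemetry global, only the cluster that overrides to SRE is selected.
	fmt.Println(clustersToEnqueue(Telemetry, clusters))
}
```

The alternative mentioned in the comment, always watching the ConfigMap, would drop the filter entirely and rely on the reconciler's existing short-circuit.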

manifests.MetricsForwarderServingCA(),
manifests.MetricsForwarderPodMonitor(),
},
expectCleanup: true,

Contributor

All 4 test cases have expectCleanup: true — there's no positive test for Mode=Forward verifying that resources are preserved. A regression that unconditionally deletes resources would pass all these tests.

Something like:

{
    name: "When metrics forwarding mode is Forward, it should not delete resources",
    monitoring: hyperv1.MonitoringSpec{
        MetricsForwarding: hyperv1.MetricsForwardingSpec{
            Mode: hyperv1.MetricsForwardingModeForward,
        },
    },
    existingObjects: []client.Object{...},
    expectCleanup: false,
},

}
}

metricsSet := r.MetricsSet

Contributor

This override pattern is duplicated in reconcileSREMetricsConfig (~line 5181). If the fallback logic ever changes, both sites need updating in lockstep. Consider extracting a small helper:

func (r *HostedClusterReconciler) effectiveMetricsSet(monitoring hyperv1.MonitoringSpec) metrics.MetricsSet {
    if monitoring.MetricsSet != "" {
        return metrics.MetricsSet(monitoring.MetricsSet)
    }
    return r.MetricsSet
}

hcp.Spec.Monitoring = hcluster.Spec.Monitoring
// Backward compat: if the deprecated annotation is present and the spec mode is not explicitly set,
// enable metrics forwarding on the HCP so that downstream consumers (CPO, HCCO) use the spec field.
// An explicit Disabled mode takes precedence over the annotation.

Contributor

nit: the comment says "Disabled mode" but the enum value is MetricsForwardingModeNone ("None"). Should read "An explicit None mode takes precedence over the annotation" to match the API.
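
For readers following the backward-compat rule that the code comment describes, here is a minimal sketch under stated assumptions. The type and function names (`MetricsForwardingMode`, `effectiveMode`) are stand-ins, the annotation is treated as "present or absent" as the comment words it, and the enum values follow the Forward/None naming used in this review; the actual controller code may differ.

```go
package main

import "fmt"

// MetricsForwardingMode is a stand-in for the API enum discussed above.
type MetricsForwardingMode string

const (
	ModeForward MetricsForwardingMode = "Forward"
	ModeNone    MetricsForwardingMode = "None"
)

// Annotation key as given in the PR summary.
const enableMetricsForwardingAnnotation = "hypershift.openshift.io/enable-metrics-forwarding"

// effectiveMode applies the rule from the comment: an explicit spec mode
// (including None) always wins; the deprecated annotation is only honored
// when the mode is unset.
func effectiveMode(specMode MetricsForwardingMode, annotations map[string]string) MetricsForwardingMode {
	if specMode != "" {
		return specMode
	}
	if _, ok := annotations[enableMetricsForwardingAnnotation]; ok {
		return ModeForward
	}
	return specMode
}

func main() {
	legacy := map[string]string{enableMetricsForwardingAnnotation: "true"}
	fmt.Println(effectiveMode("", legacy))       // annotation fills in the unset mode
	fmt.Println(effectiveMode(ModeNone, legacy)) // explicit None takes precedence
}
```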

// +required
Mode MetricsForwardingMode `json:"mode,omitempty"`

// metricsSet specifies which set of metrics to forward to the hosted

Contributor

Having metricsSet at two nesting levels (monitoring.metricsSet and monitoring.metricsForwarding.metricsSet) with the same field name but different scopes makes the precedence hard to discover — you need to read 3 files across HO and CPO to understand the chain: forwarding.metricsSet > monitoring.metricsSet > operator env var.

It might be worth evaluating whether renaming one of the two fields would make the intent clearer — e.g. calling the inner one forwardingMetricsSet or the outer one defaultMetricsSet. That way users can tell from the field name alone which scope each one controls, without needing to trace the resolution chain through the controller code.
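
The resolution chain named above (forwarding.metricsSet > monitoring.metricsSet > operator env var) can be written down in one place as a rough sketch. `MetricsSet`, `firstNonEmpty`, and `forwardedMetricsSet` are hypothetical names used only for illustration; in the PR the logic is spread across the HO and CPO.

```go
package main

import "fmt"

// MetricsSet is a simplified stand-in for the metrics set string types.
type MetricsSet string

// firstNonEmpty returns the first non-empty candidate, or "" if all are empty.
func firstNonEmpty(candidates ...MetricsSet) MetricsSet {
	for _, c := range candidates {
		if c != "" {
			return c
		}
	}
	return ""
}

// forwardedMetricsSet resolves the set used for guest-side forwarding:
// the forwarding-level field wins, then the monitoring-level field,
// then the operator-wide default (the METRICS_SET env var).
func forwardedMetricsSet(forwarding, monitoring, operatorDefault MetricsSet) MetricsSet {
	return firstNonEmpty(forwarding, monitoring, operatorDefault)
}

func main() {
	fmt.Println(forwardedMetricsSet("All", "SRE", "Telemetry")) // forwarding-level value wins
	fmt.Println(forwardedMetricsSet("", "SRE", "Telemetry"))    // falls back to monitoring level
	fmt.Println(forwardedMetricsSet("", "", "Telemetry"))       // falls back to operator default
}
```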

}

-func predicate(cpContext component.WorkloadContext) (bool, error) {
+func Predicate(cpContext component.WorkloadContext) (bool, error) {

Contributor

nit: now that this is exported and shared with metrics_proxy, a short godoc would help:

// Predicate returns true when metrics forwarding components should be deployed.
func Predicate(cpContext component.WorkloadContext) (bool, error) {

name: "When only EnableMetricsForwarding is present, it should return true",
annotations: map[string]string{
hyperv1.EnableMetricsForwarding: "true",
name: "When metrics forwarding mode is Enabled, it should return true",

Contributor

nit: test names say "Enabled" / "Disabled" but the actual enum values are Forward / None. Using the enum terminology in test names makes it easier to grep for coverage of specific values — e.g. "When metrics forwarding mode is Forward, it should return true".


func adaptDeployment(cpContext component.WorkloadContext, deployment *appsv1.Deployment) error {
metricsSet := cpContext.MetricsSet
if fwdSet := cpContext.HCP.Spec.Monitoring.MetricsForwarding.MetricsSet; fwdSet != "" {

Contributor

This casts hyperv1.MetricsSet to metrics.MetricsSet, two distinct string types defined in separate packages with identical values. The CRD enum validation prevents invalid values at admission, so this is safe at runtime. But if someone adds a value to one type without the other, the cast silently produces an unrecognized value.

Worth either a sync-warning comment on one of the two type definitions, or a small unit test asserting the constant sets match.
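
A drift check of the kind suggested here might be sketched as follows. The two types are local stand-ins for hyperv1.MetricsSet and metrics.MetricsSet, and `unmatched` is a hypothetical helper; a real test would list the constants exported by each package instead of string literals.

```go
package main

import (
	"fmt"
	"sort"
)

// APIMetricsSet stands in for hyperv1.MetricsSet (assumed to be a string type).
type APIMetricsSet string

// InternalMetricsSet stands in for metrics.MetricsSet (assumed to be a string type).
type InternalMetricsSet string

// unmatched returns, sorted, every value that appears in only one of the two sets.
func unmatched(api []APIMetricsSet, internal []InternalMetricsSet) []string {
	const inAPI, inInternal = 1, 2
	seen := map[string]int{}
	for _, v := range api {
		seen[string(v)] |= inAPI
	}
	for _, v := range internal {
		seen[string(v)] |= inInternal
	}
	var out []string
	for v, mask := range seen {
		if mask != inAPI|inInternal {
			out = append(out, v)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	api := []APIMetricsSet{"Telemetry", "SRE", "All"}
	internal := []InternalMetricsSet{"Telemetry", "SRE", "All"}
	// An empty result means the two constant sets agree.
	fmt.Println(unmatched(api, internal))
}
```

In a unit test, a non-empty result would be reported as a failure naming the values that drifted.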

servicePublishingStrategy:
type: Route
route: {}
expectedError: "spec.monitoring.metricsForwarding.metricsSet"

Contributor

Missing a couple of negative tests to exercise the structural constraints:

  • monitoring: {} — should fail with minProperties: 1
  • monitoring: { metricsForwarding: { metricsSet: SRE } } — should fail because mode is required when metricsForwarding is present

The current suite validates enum values well but doesn't cover these structural rejection cases.
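
To make the two rejection cases concrete, the sketch below restates them as plain Go over JSON input. It is an illustration only: the real enforcement happens in the CRD schema at admission, and `validateMonitoring` is a hypothetical helper, not part of the PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// validateMonitoring mimics the two structural rules named above:
// monitoring must have at least one property, and mode is required
// whenever metricsForwarding is present.
func validateMonitoring(raw string) error {
	var monitoring map[string]json.RawMessage
	if err := json.Unmarshal([]byte(raw), &monitoring); err != nil {
		return err
	}
	if len(monitoring) < 1 {
		return errors.New("monitoring: must have at least 1 property")
	}
	fwdRaw, ok := monitoring["metricsForwarding"]
	if !ok {
		return nil
	}
	var fwd map[string]json.RawMessage
	if err := json.Unmarshal(fwdRaw, &fwd); err != nil {
		return err
	}
	if _, ok := fwd["mode"]; !ok {
		return errors.New("monitoring.metricsForwarding.mode: required")
	}
	return nil
}

func main() {
	for _, in := range []string{
		`{}`,
		`{"metricsForwarding": {"metricsSet": "SRE"}}`,
		`{"metricsForwarding": {"mode": "Forward"}}`,
	} {
		fmt.Printf("%s -> %v\n", in, validateMonitoring(in))
	}
}
```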


Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/control-plane-operator: Indicates the PR includes changes for the control plane operator - in an OCP release
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/testing: Indicates the PR includes changes for e2e testing
  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
