NO-JIRA: deploymentcontroller: add topology-aware scheduling hooks by sanchezl · Pull Request #2276 · openshift/library-go

sanchezl · 2026-06-04T16:56:23Z

Summary

Add topology-aware scheduling helpers for operators that manage Deployments. These provide a consistent approach to replica scaling, pod anti-affinity, rolling update strategy, and control-plane node selection across all supported OpenShift topologies (SNO, DualReplica, TNA, HA, HyperShift).

Motivation: A survey of OpenShift operators found that each one reinvents topology-aware scheduling logic with inconsistent results. For example, the kube-storage-version-migrator-operator is the only HA operator using preferred (instead of required) anti-affinity, which led to a 74-second Available=False outage on a two-node fencing test (OCPBUGS-65984) when both pods landed on the same node during a bootstrap timing race. Every other HA operator (oauth-openshift, metrics-server, image-registry) uses required anti-affinity and survived the same fencing event.

New APIs

Pure functions (for any operator, no DeploymentController required)

TopologyAwareReplicas(topology, controlPlaneNodeCount, maxReplicas) — Returns the desired replica count for a given topology:

Topology	Replicas
SingleReplica	1
DualReplica / HighlyAvailableArbiter / External	2
HighlyAvailable	maxReplicas
Unknown	min(nodeCount, maxReplicas), floor 1

SetTopologyAwareScheduling(deployment, topology, controlPlaneNodeCount, appLabelValue, maxSurge, maxReplicas) — Applies all three topology adjustments (replicas, node selector, scheduling strategy) to a *appsv1.Deployment in one call. Designed for operators that manage Deployments outside DeploymentController, such as workload.Delegate implementations.

DeploymentHookFunc hooks (for DeploymentController users)

WithTopologyAwareReplicasHook(infrastructureLister, nodeLister, maxReplicas) — Sets replicas based on ControlPlaneTopology (delegates to TopologyAwareReplicas).

WithTopologyAwareSchedulingHook(appLabelValue, maxSurge) — Sets rolling update strategy and pod anti-affinity based on replica count (must run after the replicas hook):

replicas > 1: required anti-affinity on kubernetes.io/hostname, maxUnavailable = max(replicas-1, 1)
replicas == 1: nil affinity, maxUnavailable = 1

WithControlPlaneNodeSelectorHook(infrastructureLister) — Adds a control-plane node selector, except on External topology (HyperShift) where no control-plane nodes exist.

Convenience constructor

WithTopologyAwareSchedulingHooks(infrastructureLister, nodeLister, appLabelValue, maxSurge, maxReplicas) — Returns all three hooks in the correct registration order as []DeploymentHookFunc, eliminating the ordering requirement between replicas and scheduling hooks.

Usage examples

DeploymentController (individual hooks):

dc.WithDeploymentHooks(
    deploymentcontroller.WithTopologyAwareReplicasHook(infraLister, nodeLister, 3),
    deploymentcontroller.WithControlPlaneNodeSelectorHook(infraLister),
    deploymentcontroller.WithTopologyAwareSchedulingHook("migrator", 1),
)

DeploymentController (convenience):

dc.WithDeploymentHooks(
    deploymentcontroller.WithTopologyAwareSchedulingHooks(infraLister, nodeLister, "migrator", 1, 3)...,
)

workload.Delegate or custom controller:

deploymentcontroller.SetTopologyAwareScheduling(deployment, topology, len(cpNodes), "my-app", 1, 3)

Cross-operator survey

These hooks codify the pattern already used by most HA operators:

Operator	Replicas	Anti-affinity	maxUnavailable
oauth-openshift	=masterCount	Required	max(count-1, 1)
metrics-server	2	Required	1
image-registry	2	Required	1
console	2	Required	1
KSVMO (current)	count CP nodes	Preferred	0

The hooks are designed for operators using the DeploymentController pattern. The pure functions (TopologyAwareReplicas, SetTopologyAwareScheduling) extend this to any deployment management pattern. Operators with genuinely custom logic (ingress: hostNetwork; image-registry: platform-driven) are unaffected.

Summary by CodeRabbit

New Features
- Deployments now derive replica counts from cluster topology and visible control-plane nodes with sensible caps and fallbacks.
- Rolling-update limits (maxUnavailable/maxSurge) are computed automatically from replica counts.
- Pod anti-affinity is enforced for multi-replica control-plane workloads and removed for single-replica deployments.
- Node selector behavior is topology-aware and omits in-cluster control-plane selectors for external topology.
Tests
- Added comprehensive unit tests covering topology-aware replicas, scheduling, affinity, and node-selector behaviors.

openshift-ci-robot · 2026-06-04T16:56:26Z

@sanchezl: This pull request explicitly references no jira issue.

Details

In response to this:

Summary

Add three DeploymentHookFunc helpers for operators that manage Deployments via the library-go DeploymentController. These provide a consistent, topology-aware approach to replica scaling, pod anti-affinity, and rolling update strategy across all supported OpenShift topologies (SNO, DualReplica, TNA, HA, HyperShift).

Motivation: A survey of OpenShift operators found that each one reinvents topology-aware scheduling logic with inconsistent results. For example, the kube-storage-version-migrator-operator is the only HA operator using preferred (instead of required) anti-affinity, which led to a 74-second Available=False outage on a two-node fencing test (OCPBUGS-65984) when both pods landed on the same node during a bootstrap timing race. Every other HA operator (oauth-openshift, metrics-server, image-registry) uses required anti-affinity and survived the same fencing event.

New hooks

WithTopologyAwareReplicasHook(infrastructureLister, nodeLister, maxReplicas) — Sets replicas based on ControlPlaneTopology:

Topology Replicas

SingleReplica 1

DualReplica / HighlyAvailableArbiter / External 2

HighlyAvailable maxReplicas

Unknown min(nodeCount, maxReplicas), floor 1

WithTopologyAwareSchedulingHook(appLabelValue, maxSurge) — Sets rolling update strategy and pod anti-affinity based on replica count (must run after the replicas hook):

replicas > 1: required anti-affinity on kubernetes.io/hostname, maxUnavailable = max(replicas-1, 1)

replicas == 1: nil affinity, maxUnavailable = 1

WithControlPlaneNodeSelectorHook(infrastructureLister) — Adds a control-plane node selector, except on External topology (HyperShift) where no control-plane nodes exist.

Usage example (KSVMO)
deploymentcontroller.WithTopologyAwareReplicasHook(infraLister, nodeLister, 3),
deploymentcontroller.WithControlPlaneNodeSelectorHook(infraLister),
deploymentcontroller.WithTopologyAwareSchedulingHook("migrator", 1),
Cross-operator survey

These hooks codify the pattern already used by most HA operators:

Operator Replicas Anti-affinity maxUnavailable

oauth-openshift =masterCount Required max(count-1, 1)

metrics-server 2 Required 1

image-registry 2 Required 1

console 2 Required 1

KSVMO (current) count CP nodes Preferred 0

The hooks are designed for operators using the DeploymentController pattern. Operators with genuinely custom logic (ingress: hostNetwork; image-registry: platform-driven) are unaffected.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

coderabbitai · 2026-06-04T16:56:35Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: b67794d4-15f8-45fe-a9d2-ec53aaf425bc

📥 Commits

Reviewing files that changed from the base of the PR and between c6e8028 and 43b4ab6.

📒 Files selected for processing (2)

pkg/operator/deploymentcontroller/topology.go
pkg/operator/deploymentcontroller/topology_test.go

Walkthrough

Adds topology-aware deployment hooks: one computes replicas from Infrastructure topology or control-plane nodes, one applies or removes the control-plane nodeSelector based on topology, and one sets rolling-update limits and required hostname pod anti-affinity. Includes extensive unit tests and test helpers.

Changes

Topology-aware deployment hooks

Layer / File(s)	Summary
Topology-aware replicas `pkg/operator/deploymentcontroller/topology.go`, `pkg/operator/deploymentcontroller/topology_test.go`	`WithTopologyAwareReplicasHook` builds a control-plane selector, reads cluster `Infrastructure`, and sets `deployment.Spec.Replicas` deterministically for known topologies or from control-plane node count clamped to `[1,maxReplicas]`. Tests validate modes, external defaults, HA capping, and fallbacks.
Control-plane node selector enforcement `pkg/operator/deploymentcontroller/topology.go`, `pkg/operator/deploymentcontroller/topology_test.go`	`WithControlPlaneNodeSelectorHook` reads `Infrastructure` topology and sets or removes the `node-role.kubernetes.io/control-plane` nodeSelector (removes key and nils map for ExternalTopologyMode). Tests assert selector presence, removal, and preservation of unrelated selectors.
Rolling update and anti-affinity configuration `pkg/operator/deploymentcontroller/topology.go`, `pkg/operator/deploymentcontroller/topology_test.go`	`WithTopologyAwareSchedulingHook` and `SetTopologyAwareScheduling` ensure `Strategy.RollingUpdate` exists, set `MaxUnavailable = max(replicas-1,1)` and `MaxSurge = maxSurge`, and add required PodAntiAffinity on `kubernetes.io/hostname` with `app=<value>` when `replicas>1`, or clear affinity for `replicas==1`. Tests verify values and affinity preservation/clearing.
Test helpers `pkg/operator/deploymentcontroller/topology_test.go`	Fake `NodeLister`/`InfrastructureLister` builders and constructors for control-plane nodes and `Infrastructure` objects used across the test suites.

Sequence Diagram

sequenceDiagram
  participant Controller as DeploymentController
  participant Infra as InfrastructureLister
  participant Nodes as NodeLister
  participant Deployment as Deployment
  Controller->>Infra: Get cluster Infrastructure
  Infra-->>Controller: ControlPlaneTopology
  Controller->>Nodes: List nodes (control-plane selector)
  Nodes-->>Controller: node list/count
  Controller->>Deployment: set Spec.Replicas / update Strategy.RollingUpdate / adjust Affinity / modify NodeSelector

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

🚥 Pre-merge checks | ✅ 14 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 33.33% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (14 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title accurately summarizes the main change: adding topology-aware scheduling hooks to the deployment controller.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names	✅ Passed	All test case names are static, descriptive strings with no dynamic information. Standard Go testing format with stable test titles.
Test Structure And Quality	✅ Passed	The test file uses standard Go testing, not Ginkgo. The custom check specifically requires "Ginkgo test code," making it inapplicable to this PR.
Microshift Test Compatibility	✅ Passed	No Ginkgo e2e tests added. The PR adds 6 standard Go unit tests in topology_test.go; the check only applies to Ginkgo e2e tests (It, Describe, Context, When patterns).
Single Node Openshift (Sno) Test Compatibility	✅ Passed	No Ginkgo e2e tests were added in this PR. The new test file (topology_test.go) contains only standard Go unit tests using testing.T, not Ginkgo e2e tests. Custom check does not apply.
Topology-Aware Scheduling Compatibility	✅ Passed	Adds topology-aware hooks with safe anti-affinity+maxUnavailable≥1, correct replicas for all topologies, and proper External topology control-plane selector handling.
Ote Binary Stdout Contract	✅ Passed	No process-level stdout writes found. PR adds topology-aware scheduling hooks and tests with no klog, fmt.Print calls, or init/TestMain functions that could emit non-JSON stdout.
Ipv6 And Disconnected Network Test Compatibility	✅ Passed	Tests added are standard Go unit tests (not Ginkgo e2e), use fake Kubernetes listers, contain no IPv4 addresses/assumptions, and require no external connectivity.
No-Weak-Crypto	✅ Passed	No weak cryptography found. PR introduces topology-aware scheduling hooks with no crypto imports, no weak algorithms, no custom crypto implementations, and no insecure token comparisons.
Container-Privileges	✅ Passed	PR contains only Go source code (topology.go and topology_test.go), no K8s manifests or container specs; privilege check not applicable.
No-Sensitive-Data-In-Logs	✅ Passed	No sensitive data logged; only generic infrastructure error messages and test assertions with non-sensitive Kubernetes topology metadata.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

openshift-ci · 2026-06-04T16:57:23Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sanchezl
Once this PR has been reviewed and has the lgtm label, please assign p0lyn0mial for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

pkg/operator/OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/deploymentcontroller/topology_test.go`:
- Around line 20-22: Tests currently ignore errors returned by indexer.Add when
adding entries from nodes; update the test code in topology_test.go to check
those errors and fail the test if any occur (e.g., replace `_ = indexer.Add(n)`
with error capture and call t.Fatalf or use require.NoError/assert.NoError),
doing the same for the other occurrences around lines where nodes are added so
broken fixtures surface instead of being masked.

In `@pkg/operator/deploymentcontroller/topology.go`:
- Around line 93-96: Replace the hard required hostname anti-affinity with a
preferred (soft) rule: remove the RequiredDuringSchedulingIgnoredDuringExecution
entry that contains the corev1.PodAffinityTerm with TopologyKey
"kubernetes.io/hostname" and instead add a
PreferredDuringSchedulingIgnoredDuringExecution slice containing a
metav1.WeightedPodAffinityTerm (with an appropriate weight, e.g. 50) whose
PodAffinityTerm uses the same LabelSelector and TopologyKey; update the
surrounding struct where pod affinity is built so it uses
PreferredDuringSchedulingIgnoredDuringExecution and WeightedPodAffinityTerm
instead of RequiredDuringSchedulingIgnoredDuringExecution and
corev1.PodAffinityTerm.
- Around line 117-133: The hook WithControlPlaneNodeSelectorHook currently
always injects a control-plane nodeSelector
(deployment.Spec.Template.Spec.NodeSelector["node-role.kubernetes.io/control-plane"]=""),
which hard-pins workloads; change it to be opt-in: do not modify NodeSelector by
default—only add the control-plane nodeSelector when an explicit opt-in signal
is present (e.g., a deployment annotation like
"operator.openshift.io/enforce-control-plane-node-selector" == "true" or a
boolean flag in the passed OperatorSpec). Update
WithControlPlaneNodeSelectorHook to check that opt-in signal (and still skip for
ExternalTopologyMode) before creating or mutating
deployment.Spec.Template.Spec.NodeSelector, and document/flag this behavior so
callers must explicitly enable it.
- Around line 126-133: When infra.Status.ControlPlaneTopology equals
configv1.ExternalTopologyMode we currently return early but do not remove any
existing "node-role.kubernetes.io/control-plane" entry, which can leave pods
stuck Pending; update the conditional path in the same function to check
deployment.Spec.Template.Spec.NodeSelector and if non-nil (or initialize it)
delete the "node-role.kubernetes.io/control-plane" key (i.e., remove
deployment.Spec.Template.Spec.NodeSelector["node-role.kubernetes.io/control-plane"])
before returning so any pre-existing control-plane selector is cleared for
External topology.
- Around line 90-111: The code currently overwrites
deployment.Spec.Template.Spec.Affinity (and sets it to nil) which destroys any
pre-existing NodeAffinity/PodAffinity; instead, update only the
PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution field while
preserving other affinity subfields: ensure
deployment.Spec.Template.Spec.Affinity is initialized if nil, keep existing
Affinity.NodeAffinity and Affinity.PodAffinity intact, and set or clear
Affinity.PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution based on
the replicas check; reference the Affinity object on
deployment.Spec.Template.Spec.Affinity and its
PodAntiAffinity.RequiredDuringSchedulingIgnoredDuringExecution, PodAffinity, and
NodeAffinity when implementing the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 67005b98-452d-4979-9a0c-7438c7888843

📥 Commits

Reviewing files that changed from the base of the PR and between ba2b9a3 and 7acaff6.

📒 Files selected for processing (2)

pkg/operator/deploymentcontroller/topology.go
pkg/operator/deploymentcontroller/topology_test.go

coderabbitai · 2026-06-04T17:04:16Z

+					RequiredDuringSchedulingIgnoredDuringExecution: []corev1.PodAffinityTerm{
+						{
+							TopologyKey: "kubernetes.io/hostname",
+							LabelSelector: &metav1.LabelSelector{


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid required hostname pod anti-affinity in controller defaults.

Line 93-96 configures required anti-affinity with kubernetes.io/hostname, which can create strict scheduling failures in constrained/non-HA topologies. Prefer a soft rule (PreferredDuringSchedulingIgnoredDuringExecution) if this behavior is still needed.

As per coding guidelines: deployment/controller code must “flag required pod anti-affinity with hostname topology keys”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/deploymentcontroller/topology.go` around lines 93 - 96, Replace the hard required hostname anti-affinity with a preferred (soft) rule: remove the RequiredDuringSchedulingIgnoredDuringExecution entry that contains the corev1.PodAffinityTerm with TopologyKey "kubernetes.io/hostname" and instead add a PreferredDuringSchedulingIgnoredDuringExecution slice containing a metav1.WeightedPodAffinityTerm (with an appropriate weight, e.g. 50) whose PodAffinityTerm uses the same LabelSelector and TopologyKey; update the surrounding struct where pod affinity is built so it uses PreferredDuringSchedulingIgnoredDuringExecution and WeightedPodAffinityTerm instead of RequiredDuringSchedulingIgnoredDuringExecution and corev1.PodAffinityTerm.

Required anti-affinity is intentional here — it is the core purpose of this PR. The topology-aware replicas hook (which runs first in the chain) already adapts to constrained topologies: SNO gets 1 replica (so anti-affinity is removed entirely), DualReplica/TNA gets 2 replicas on 2 nodes, etc. The deadlock that required anti-affinity normally causes on constrained topologies is avoided because the companion scheduling hook sets maxUnavailable = max(replicas-1, 1), allowing rolling updates to proceed by deleting an old pod to free a node. This matches the pattern used by every other HA operator in OpenShift (oauth-openshift, openshift-apiserver, openshift-controller-manager, image-registry).

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Resolve the coderabbit review comment using the GitHub CLI # First, list comments on the PR to find the comment ID gh api repos/openshift/library-go/pulls/2276/comments --jq '.[] | select(.user.login == "coderabbitai[bot]") | {id: .id, body: .body[0:80]}'

Length of output: 978

🏁 Script executed:

#!/bin/bash # Resolve the review comment via GraphQL (minimize it) # First get the node ID for comment 3357674013 gh api graphql -f query=' { repository(owner: "openshift", name: "library-go") { pullRequest(number: 2276) { reviewThreads(first: 50) { nodes { id isResolved comments(first: 1) { nodes { databaseId body } } } } } } }' --jq '.data.repository.pullRequest.reviewThreads.nodes[] | select(.comments.nodes[0].databaseId == 3357674013) | {threadId: .id, resolved: .isResolved}'

Length of output: 716

🏁 Script executed:

#!/bin/bash # Resolve the review thread gh api graphql -f query=' mutation { resolveReviewThread(input: {threadId: "PRRT_kwDOB-wqks6HJQQ6"}) { thread { id isResolved } } }'

Length of output: 705

@sanchezl, understood — the design is intentional and the concern is fully mitigated. The two hooks work as a system:

Replicas hook (runs first): adapts replica count to topology — SNO gets 1 replica, so WithTopologyAwareSchedulingHook removes anti-affinity entirely for that case.

Scheduling hook (runs second): sets maxUnavailable = max(replicas-1, 1), which prevents the classic scheduling deadlock by allowing Kubernetes to evict an old pod to free a node before placing the replacement.

This is the established pattern across OpenShift HA operators (oauth-openshift, openshift-apiserver, openshift-controller-manager, image-registry), and required anti-affinity is the correct choice here. I'll mark the comment as resolved.

✏️ Learnings added

Learnt from: sanchezl Repo: openshift/library-go PR: 2276 File: pkg/operator/deploymentcontroller/topology.go:0-0 Timestamp: 2026-06-04T17:50:01.796Z Learning: In `pkg/operator/deploymentcontroller/topology.go` (openshift/library-go PR `#2276`), `RequiredDuringSchedulingIgnoredDuringExecution` pod anti-affinity on `kubernetes.io/hostname` is intentional and safe. The design relies on two hooks in tandem: (1) `WithTopologyAwareReplicasHook` sets replicas=1 for SNO (so `WithTopologyAwareSchedulingHook` removes anti-affinity entirely for SNO), and replicas=2 for DualReplica/TNA/External; (2) `WithTopologyAwareSchedulingHook` sets `maxUnavailable = max(replicas-1, 1)`, avoiding the typical scheduling deadlock by allowing Kubernetes to evict an old pod before scheduling the replacement. This matches the pattern used by oauth-openshift, openshift-apiserver, openshift-controller-manager, and image-registry. Do not flag required hostname anti-affinity as a violation in this file.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

🧠 Learnings used

Learnt from: CR Repo: openshift/library-go PR: 0 File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0 Timestamp: 2026-05-27T15:26:58.336Z Learning: Applies to **/*.{yaml,yml,go} : Deployment manifests and controller code must avoid scheduling constraints that assume standard HA topology. Flag required pod anti-affinity with hostname topology keys, pod topology spread constraints with DoNotSchedule, replica counts derived from node count without topology awareness, nodeSelector or affinity targeting control-plane nodes, broad tolerations that schedule to arbiter nodes, code targeting only worker nodes, and PodDisruptionBudgets designed for 3+ nodes

Learnt from: CR Repo: openshift/library-go PR: 0 File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0 Timestamp: 2026-05-27T15:26:58.336Z Learning: Applies to **/*_test.go : When new Ginkgo e2e tests are added, check whether they make multi-node or HA assumptions. Flag tests expecting multiple control-plane/master nodes, multiple worker nodes, pod anti-affinity, node-to-node communication, leader election failover, pod rescheduling, node scaling, separate infra/worker/master roles, rolling updates requiring other nodes, or multi-endpoint ingress/load balancing

coderabbitai · 2026-06-04T17:04:16Z

+// WithControlPlaneNodeSelectorHook adds a control-plane node selector to the
+// deployment, except on External topology (HyperShift) where no control-plane
+// nodes exist in-cluster and the selector would leave pods Pending.
+func WithControlPlaneNodeSelectorHook(infrastructureLister configv1listers.InfrastructureLister) DeploymentHookFunc {
+	return func(_ *opv1.OperatorSpec, deployment *appsv1.Deployment) error {
+		infra, err := infrastructureLister.Get("cluster")
+		if err != nil {
+			return fmt.Errorf("failed to get infrastructure resource: %w", err)
+		}
+		if infra.Status.ControlPlaneTopology == configv1.ExternalTopologyMode {
+			return nil
+		}
+		if deployment.Spec.Template.Spec.NodeSelector == nil {
+			deployment.Spec.Template.Spec.NodeSelector = make(map[string]string)
+		}
+		deployment.Spec.Template.Spec.NodeSelector["node-role.kubernetes.io/control-plane"] = ""
+		return nil


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Avoid enforcing control-plane nodeSelector in shared hook defaults.

Line 132 hard-pins workloads to control-plane nodes. This violates the topology-safety policy and risks portability issues across non-standard topologies. Consider making this opt-in at the caller layer instead of default behavior in a shared hook.

As per coding guidelines: deployment/controller code must “flag nodeSelector or affinity targeting control-plane nodes”.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/operator/deploymentcontroller/topology.go` around lines 117 - 133, The hook WithControlPlaneNodeSelectorHook currently always injects a control-plane nodeSelector (deployment.Spec.Template.Spec.NodeSelector["node-role.kubernetes.io/control-plane"]=""), which hard-pins workloads; change it to be opt-in: do not modify NodeSelector by default—only add the control-plane nodeSelector when an explicit opt-in signal is present (e.g., a deployment annotation like "operator.openshift.io/enforce-control-plane-node-selector" == "true" or a boolean flag in the passed OperatorSpec). Update WithControlPlaneNodeSelectorHook to check that opt-in signal (and still skip for ExternalTopologyMode) before creating or mutating deployment.Spec.Template.Spec.NodeSelector, and document/flag this behavior so callers must explicitly enable it.

This hook is opt-in — callers explicitly register it in their controller's hook chain. It is not applied by default to any deployment. The caller decides whether their workload should run on control-plane nodes. Additionally, the External topology case (where no CP nodes exist) is handled by clearing any pre-existing CP selector, as addressed in the latest push.

@sanchezl, you're right — that's a fair clarification. Since callers must explicitly register WithControlPlaneNodeSelectorHook in their hook chain, the opt-in contract is clear at the usage site. The hook name also makes the intent self-documenting. I'll withdraw this concern.

🧠 Learnings used

Learnt from: CR Repo: openshift/library-go PR: 0 File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0 Timestamp: 2026-05-27T15:26:58.336Z Learning: Applies to **/*.{yaml,yml,go} : Deployment manifests and controller code must avoid scheduling constraints that assume standard HA topology. Flag required pod anti-affinity with hostname topology keys, pod topology spread constraints with DoNotSchedule, replica counts derived from node count without topology awareness, nodeSelector or affinity targeting control-plane nodes, broad tolerations that schedule to arbiter nodes, code targeting only worker nodes, and PodDisruptionBudgets designed for 3+ nodes

Learnt from: CR Repo: openshift/library-go PR: 0 File: coderabbit-custom-pre-merge-checks-unique-id-file-non-traceable-F7F2B60C-1728-4C9A-8889-4F2235E186CA.txt:0-0 Timestamp: 2026-05-27T15:26:58.336Z Learning: Applies to **/*_test.go : When new Ginkgo e2e tests are added, check whether they make multi-node or HA assumptions. Flag tests expecting multiple control-plane/master nodes, multiple worker nodes, pod anti-affinity, node-to-node communication, leader election failover, pod rescheduling, node scaling, separate infra/worker/master roles, rolling updates requiring other nodes, or multi-endpoint ingress/load balancing

Add three DeploymentHookFunc helpers for operators that manage Deployments via the library-go DeploymentController: - WithTopologyAwareReplicasHook: sets replicas based on ControlPlaneTopology (SingleReplica, DualReplica, HighlyAvailableArbiter, External, HighlyAvailable) with a caller-specified cap, falling back to control-plane node count for unknown topologies. - WithTopologyAwareSchedulingHook: sets required pod anti-affinity and rolling update strategy (maxUnavailable, maxSurge) based on the deployment's replica count. When replicas > 1, pods are guaranteed to spread across nodes via required anti-affinity and maxUnavailable allows rolling updates to proceed without deadlock. When replicas == 1 (SNO), anti-affinity is removed. - WithControlPlaneNodeSelectorHook: adds a control-plane node selector, skipping External topology where no control-plane nodes exist in-cluster.

openshift-ci · 2026-06-06T15:30:17Z

@sanchezl: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 4, 2026

openshift-ci Bot requested review from bertinatto and hexfusion June 4, 2026 16:57

coderabbitai Bot reviewed Jun 4, 2026

View reviewed changes

sanchezl mentioned this pull request Jun 4, 2026

Fix topology scheduling rule: anti-affinity, feature gates, PDB guidance openshift/coderabbit#18

Merged

sanchezl force-pushed the topology-aware-scheduling-hooks branch from 78e4348 to c6e8028 Compare June 6, 2026 14:55

sanchezl force-pushed the topology-aware-scheduling-hooks branch from c6e8028 to 43b4ab6 Compare June 6, 2026 15:20

Conversation

sanchezl commented Jun 4, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

New APIs

Pure functions (for any operator, no DeploymentController required)

DeploymentHookFunc hooks (for DeploymentController users)

Convenience constructor

Usage examples

Cross-operator survey

Summary by CodeRabbit

Uh oh!

openshift-ci-robot commented Jun 4, 2026

Summary

New hooks

Usage example (KSVMO)

Cross-operator survey

Uh oh!

coderabbitai Bot commented Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Sequence Diagram

Estimated code review effort

❌ Failed checks (1 warning)

Uh oh!

openshift-ci Bot commented Jun 4, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sanchezl Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

sanchezl Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot Jun 4, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

openshift-ci Bot commented Jun 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

sanchezl commented Jun 4, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented Jun 4, 2026 •

edited

Loading

coderabbitai Bot Jun 4, 2026 •

edited

Loading

coderabbitai Bot Jun 4, 2026 •

edited

Loading