Switch ServiceProvider reads in controllers to cached listers#5686
Switch ServiceProvider reads in controllers to cached listers#5686deads2k wants to merge 1 commit into
Conversation
Consumer controllers used to call database.GetOrCreateServiceProviderCluster /
GetOrCreateServiceProviderNodePool on every sync, which both hit Cosmos and
scattered the create path across many sites. Replace those calls with cached
ServiceProvider{Cluster,NodePool}Lister Gets; when the document is absent the
syncer returns nil and is re-enqueued by the existing ServiceProvider*
informer event handlers once the document arrives.
A single dedicated controller per kind owns creation: CreateServiceProviderCluster
and CreateServiceProviderNodePool consult the lister cache first and call
GetOrCreate only on a miss, so the create-with-conflict-fallback pattern lives
in exactly one place per resource.
Co-Authored-By: Claude Opus 4.7 (1M context) <[email protected]>
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: deads2k The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
This PR refactors backend controllers to stop calling database.GetOrCreateServiceProvider{Cluster,NodePool} during steady-state reconciles (which hits Cosmos) and instead read ServiceProvider docs via cached listers. Creation of missing ServiceProvider documents is centralized into two dedicated controllers so the create-with-conflict-fallback pattern lives in one place per resource kind.
Changes:
- Updated validation/upgrade controllers to read ServiceProvider{Cluster,NodePool} via cached listers and bail out on NotFound.
- Added dedicated creator controllers for ServiceProviderCluster and ServiceProviderNodePool to own document creation.
- Updated and extended unit tests and lister test helpers to reflect the new lister-first behavior and resource name constants.
Reviewed changes
Copilot reviewed 19 out of 19 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
| backend/pkg/listertesting/db_listers.go | Adds DB-backed ServiceProviderNodePool lister and switches ServiceProviderCluster Get to use the constant resource name. |
| backend/pkg/controllers/validationcontrollers/nodepool_validation_controller.go | Uses ServiceProviderNodePool lister Get instead of GetOrCreate; deep-copies before Replace. |
| backend/pkg/controllers/validationcontrollers/nodepool_validation_controller_test.go | Seeds ServiceProviderNodePool and updates tests to match lister-driven reads. |
| backend/pkg/controllers/validationcontrollers/cluster_validation_controller.go | Uses ServiceProviderCluster lister Get instead of GetOrCreate; deep-copies before Replace. |
| backend/pkg/controllers/upgradecontrollers/trigger_node_pool_upgrade_controller.go | Uses ServiceProviderNodePool lister Get instead of GetOrCreate. |
| backend/pkg/controllers/upgradecontrollers/trigger_control_plane_upgrade_controller.go | Uses ServiceProviderCluster lister Get instead of GetOrCreate. |
| backend/pkg/controllers/upgradecontrollers/nodepool_version_controller.go | Uses ServiceProvider{Cluster,NodePool} lister Gets; deep-copies SPNP before mutations. |
| backend/pkg/controllers/upgradecontrollers/nodepool_version_controller_test.go | Seeds ServiceProvider docs and switches helper logic to GetOrCreate+Replace patterns. |
| backend/pkg/controllers/upgradecontrollers/control_plane_desired_version_controller.go | Uses SPC lister Get; changes admission node-pool prefetch to use SPNP lister and “ready” gating. |
| backend/pkg/controllers/upgradecontrollers/control_plane_desired_version_controller_test.go | Updates test syncer construction to include DB-backed ServiceProvider listers and seeds SP resources in fixtures. |
| backend/pkg/controllers/upgradecontrollers/control_plane_active_version_controller.go | Uses SPC lister Get and deep-copy before Replace to avoid mutating cached objects. |
| backend/pkg/controllers/upgradecontrollers/control_plane_active_version_controller_test.go | Seeds SPC and wires DB-backed SPC lister into tests. |
| backend/pkg/controllers/create_service_provider_nodepool_controller.go | New controller that creates SPNP documents on cache miss, owning GetOrCreate usage. |
| backend/pkg/controllers/create_service_provider_nodepool_controller_test.go | New tests covering SPNP creator behavior across NotFound/error/idempotency cases. |
| backend/pkg/controllers/create_service_provider_cluster_controller.go | New controller that creates SPC documents on cache miss, owning GetOrCreate usage. |
| backend/pkg/controllers/create_service_provider_cluster_controller_test.go | New tests covering SPC creator behavior across NotFound/error/idempotency cases. |
| backend/pkg/controllers/create_nodepool_scoped_read_desires_controller.go | Switches to SPC lister Get (no longer GetOrCreate) to resolve management cluster placement. |
| backend/pkg/controllers/create_cluster_scoped_read_desires_controller.go | Switches to SPC lister Get (no longer GetOrCreate) to resolve management cluster placement. |
| backend/pkg/app/backend.go | Wires new listers into controller constructors and starts the new creator controllers. |
| // Returns (_, false, nil) when any node pool is missing its ServiceProviderNodePool — | ||
| // CreateServiceProviderNodePool will populate it and the controller will be | ||
| // re-enqueued via the ServiceProviderNodePool informer. Skipping admission | ||
| // avoids using stale or missing node-pool version data for skew validation. |
| if database.IsNotFoundError(err) { | ||
| // CreateServiceProviderCluster will populate it; we'll be re-enqueued via the ServiceProviderCluster informer. | ||
| return nil | ||
| } |
| if database.IsNotFoundError(err) { | ||
| // CreateServiceProviderCluster will populate it; we'll be re-enqueued via the ServiceProviderCluster informer. | ||
| return nil | ||
| } |
|
@deads2k: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Consumer controllers used to call database.GetOrCreateServiceProviderCluster / GetOrCreateServiceProviderNodePool on every sync, which both hit Cosmos and scattered the create path across many sites. Replace those calls with cached ServiceProvider{Cluster,NodePool}Lister Gets; when the document is absent the syncer returns nil and is re-enqueued by the existing ServiceProvider* informer event handlers once the document arrives.
A single dedicated controller per kind owns creation: CreateServiceProviderCluster and CreateServiceProviderNodePool consult the lister cache first and call GetOrCreate only on a miss, so the create-with-conflict-fallback pattern lives in exactly one place per resource.