feat(controllers): skip reconciliation for terminals on hibernated shoots#512
feat(controllers): skip reconciliation for terminals on hibernated shoots#512petersutter wants to merge 3 commits into
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
📝 WalkthroughWalkthroughAdds hibernation-aware Terminal reconciliation. The controller now indexes Terminals by referenced Shoots, watches Shoot hibernation wake-up transitions, applies Shoot cache transforms and RBAC updates, and changes reconcile and delete status handling to wait for hibernated Shoots. ChangesTerminal hibernation reconciliation
Sequence Diagram(s)Wake-up routing sequenceDiagram
participant Shoot as gardencorev1beta1.Shoot
participant Predicate as shootWakeUpPredicate
participant Mapper as mapShootToTerminals
participant Index as TerminalShootRef
participant Controller as TerminalController
Shoot->>Predicate: update event with IsHibernated true→false
Predicate-->>Mapper: accept update
Mapper->>Index: list Terminal objects by Shoot reference
Index-->>Mapper: matching Terminal keys
Mapper-->>Controller: enqueue reconcile requests
Hibernated delete flow sequenceDiagram
participant DeleteTerminal as deleteTerminal
participant ClientSet as newClientSetFromClusterCredentials
participant HibernationCheck as isShootRefHibernated
DeleteTerminal->>ClientSet: create host and target clients
ClientSet->>HibernationCheck: read referenced Shoot status
HibernationCheck-->>ClientSet: errShootHibernated when IsHibernated=true
ClientSet-->>DeleteTerminal: hibernation error
DeleteTerminal->>DeleteTerminal: patchTerminalOperationProcessing
DeleteTerminal->>DeleteTerminal: requeue after shootHibernationRequeueAfter
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (1)
controllers/controller_suite_test.go (1)
112-114: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winMirror the production Shoot cache transform in envtest.
This updates the field index, but the test manager still uses the default Shoot cache while
main.gonow strips cached Shoots down to metadata plusstatus.isHibernated. Without the same transform in tests, accidental reads of stripped Shoot fields will pass here and only fail in production.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@controllers/controller_suite_test.go` around lines 112 - 114, The envtest setup in controller_suite_test.go only registers the Terminal field index and still relies on the default Shoot cache behavior, so it should mirror the production cache transform used in main.go. Update the test manager setup to apply the same Shoot cache transform that strips Shoots down to metadata plus status.isHibernated, using the relevant cache/config setup alongside GetFieldIndexer().IndexField so tests catch accidental reads of removed Shoot fields.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controllers/terminal_controller.go`:
- Around line 303-313: The hibernation wait path in the terminal reconciliation
logic is being treated as a successful reconcile, which causes the status update
to be marked succeeded even when work was skipped. Update the flow in the
Terminal controller’s reconcile path around isAnyReferencedShootHibernated so
that the hibernated case returns a non-successful/skip result or otherwise
bypasses the success patch, while still recording the event and log. Ensure the
unconditional success status update does not run for the hibernated wait branch.
- Around line 116-120: The new Shoot watch in TerminalController needs RBAC for
reading shoots, but the ClusterRole currently only covers projects and
shoots/adminkubeconfig. Update the controller-manager ClusterRole to grant list
and watch on core.gardener.cloud/shoots so the watch in Watches(...,
handler.EnqueueRequestsFromMapFunc(r.mapShootToTerminals),
shootWakeUpPredicate()) can start successfully; also verify whether the shoot
lookup used by TerminalController methods such as mapShootToTerminals or any
direct Shoot fetches needs get permission and add it if required.
In `@controllers/terminal_hibernation_test.go`:
- Around line 833-850: The hibernation test is bypassing the actual Shoot
wake-up path and can pass even if the Shoot watch wiring is broken. Update the
test around the Terminal reconciling logic to trigger wake-up by changing the
Shoot state only, then assert the Terminal is reconciled via the
Shoot-to-Terminal mapping path instead of mutating a Terminal annotation. Use
the existing `IsHibernated` patch flow and the relevant
`mapShootToTerminals`/watch-based reconciliation behavior to verify the wake-up
contract directly.
---
Nitpick comments:
In `@controllers/controller_suite_test.go`:
- Around line 112-114: The envtest setup in controller_suite_test.go only
registers the Terminal field index and still relies on the default Shoot cache
behavior, so it should mirror the production cache transform used in main.go.
Update the test manager setup to apply the same Shoot cache transform that
strips Shoots down to metadata plus status.isHibernated, using the relevant
cache/config setup alongside GetFieldIndexer().IndexField so tests catch
accidental reads of removed Shoot fields.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: df3fa6a1-acab-4193-8243-9710f5acf92c
📒 Files selected for processing (6)
api/v1alpha1/terminal_types.gocontrollers/controller_suite_test.gocontrollers/terminal_controller.gocontrollers/terminal_hibernation_test.gomain.gotest/common.go
e2f223d to
52d5907
Compare
…oots Terminals referencing a hibernated shoot previously failed with credential errors since the shoot API server is down. The controller now checks shoot hibernation status before building cluster clients and short-circuits with a dedicated error. - Watch Shoots via a wake-up predicate (IsHibernated true→false) and re-enqueue referencing Terminals through a field index - During deletion, clean up available clusters and defer hibernated cluster cleanup with a 1h requeue - Strip cached Shoots to metadata + hibernation status via TransformFunc
52d5907 to
8fc1a62
Compare
|
@coderabbitai review |
✅ Action performedReview finished.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@controllers/terminal_controller.go`:
- Around line 304-305: Treat errShootHibernated as a wait-state outcome in
handleRequest and the reconcile path instead of letting it fall through the
generic error handler. Update the TerminalController flow around
newClientSetFromClusterCredentials, handleRequest, and the Hibernated event
handling so hibernated Shoots set the Terminal status appropriately, emit the
promised Hibernated event, and return without surfacing a reconcile error or
triggering controller-runtime retries. Keep the existing errShootHibernated
signal from the host/target client set helpers, but intercept it before the
normal error path.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 221973b0-0ef4-4e12-aa6d-4c03c18ad4a1
📒 Files selected for processing (7)
api/v1alpha1/terminal_types.gocharts/terminal/charts/application/templates/controller-manager/clusterrole.yamlcontrollers/controller_suite_test.gocontrollers/terminal_controller.gocontrollers/terminal_hibernation_test.gomain.gotest/common.go
🚧 Files skipped from review as they are similar to previous changes (4)
- test/common.go
- controllers/controller_suite_test.go
- api/v1alpha1/terminal_types.go
- main.go
| hostClientSet, hostClientSetErr := r.newClientSetFromClusterCredentials(ctx, gardenClientSet, t.Spec.Host.Credentials, cfg.HonourServiceAccountRefHostCluster, cfg.Controllers.Terminal.TokenRequestExpirationSeconds) | ||
| targetClientSet, targetClientSetErr := r.newClientSetFromClusterCredentials(ctx, gardenClientSet, t.Spec.Target.Credentials, cfg.HonourServiceAccountRefTargetCluster, cfg.Controllers.Terminal.TokenRequestExpirationSeconds) |
There was a problem hiding this comment.
🩺 Stability & Availability | 🟠 Major | ⚡ Quick win
Treat hibernated Shoots as a wait state instead of a reconcile error.
These new helpers return errShootHibernated, but the normal reconcile path still bubbles that up through handleRequest's generic error handling. In practice that marks the Terminal as errored, skips the promised Hibernated event on reconcile, and keeps controller-runtime retrying while the Shoot sleeps—exactly the retry pattern this PR is trying to avoid.
Suggested direction
hostClientSet, hostClientSetErr := r.newClientSetFromClusterCredentials(...)
targetClientSet, targetClientSetErr := r.newClientSetFromClusterCredentials(...)
+hostHibernated := isShootHibernatedError(hostClientSetErr)
+targetHibernated := isShootHibernatedError(targetClientSetErr)
+
+if hostHibernated || targetHibernated {
+ r.recordEventAndLog(ctx, t, corev1.EventTypeNormal, extensionsv1alpha1.EventHibernated,
+ "Referenced shoot is hibernated, waiting for wake-up before reconciling")
+ return ctrl.Result{}, r.patchTerminalOperationProcessing(
+ ctx,
+ t,
+ extensionsv1alpha1.LastOperationTypeReconcile,
+ "Terminal reconciliation is waiting for a referenced shoot to wake from hibernation.",
+ )
+}
+
if hostClientSetErr != nil {
...
}Also applies to: 501-516
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@controllers/terminal_controller.go` around lines 304 - 305, Treat
errShootHibernated as a wait-state outcome in handleRequest and the reconcile
path instead of letting it fall through the generic error handler. Update the
TerminalController flow around newClientSetFromClusterCredentials,
handleRequest, and the Hibernated event handling so hibernated Shoots set the
Terminal status appropriately, emit the promised Hibernated event, and return
without surfacing a reconcile error or triggering controller-runtime retries.
Keep the existing errShootHibernated signal from the host/target client set
helpers, but intercept it before the normal error path.
/area robustness
/kind enhancement
What this PR does / why we need it:
When a Shoot referenced by a Terminal's host or target credentials is hibernated, the terminal controller now skips reconciliation and emits a
Hibernatedevent. A Shoot watch with a wake-up predicate (IsHibernated: true → false) automatically re-enqueues affected Terminals when the Shoot comes back.Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Release note:
Summary by CodeRabbit
New Features
Bug Fixes
Replaces #511 (renamed branch due to cc-utils 50-char ref-length limit blocking image push).