NO-JIRA: kms rotation prototype #2297

Draft
tjungblu wants to merge 3 commits into openshift:master from tjungblu:kms_rotation_annotation

Conversation

@tjungblu tjungblu commented Jun 11, 2026

Contributor

/hold

Summary by CodeRabbit

  • New Features

    • Added a KMS KEK rotation controller, automated KEK promotion and migration handling, and cluster convergence tracking via a ConfigMap-backed reporter
    • Extended encryption state and secret handling to track KEK migration annotations and convergence timing
  • Bug Fixes

    • Prevents minting a new encryption key while a KEK migration is in progress
  • Tests

    • Added unit tests covering rotation controller, migration flows, secret helpers, and convergence reporter

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 11, 2026
@openshift-ci-robot

@tjungblu: This pull request explicitly references no jira issue.

Details

In response to this:

/hold

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Jun 11, 2026
openshift-ci Bot commented Jun 11, 2026

Contributor

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

coderabbitai Bot commented Jun 11, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d46130d6-02fc-4338-9780-ae88118e20d7

📥 Commits

Reviewing files that changed from the base of the PR and between e3c679d and 374bd67.

📒 Files selected for processing (2)
  • pkg/operator/encryption/controllers/kms_rotation_controller.go
  • pkg/operator/encryption/controllers/kms_rotation_controller_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
  • pkg/operator/encryption/controllers/kms_rotation_controller_test.go
  • pkg/operator/encryption/controllers/kms_rotation_controller.go

Walkthrough

Adds KEK migration annotations/types, secret helpers, a ConfigMap-backed converged-KEK reporter, a new KMS rotation controller that reconciles KEK annotations on write-key secrets, and guards/wiring to block key operations while KEK migration is in flight.

Changes

KMS KEK Rotation Feature

| Layer / File(s) | Summary |
| --- | --- |
| Annotation constants and KEK migration types: pkg/operator/encryption/secrets/types.go, pkg/operator/encryption/state/types.go | Introduces annotation keys for target/migrated KEK IDs and convergence tracking (timestamp and candidate ID) and a KekConvergenceDelay constant; adds KekMigrationState to KeyState with NeedsKekMigration(). |
| KEK extraction and predicate utilities: pkg/operator/encryption/secrets/kek.go, pkg/operator/encryption/secrets/secrets.go, pkg/operator/encryption/secrets/kek_test.go | Parses KEK migration annotations from secrets, provides NeedsKekMigration and MigrationWriteKey, and populates KeyState.KekMigration in ToKeyState. Tests validate parsing, predicate logic, and write-key naming. |
| ConfigMap-backed convergence reporter: pkg/operator/encryption/kms/health/configmap_reporter.go, pkg/operator/encryption/kms/health/configmap_reporter_test.go | Adds MOCK_ConfigMapConvergedKEKReporter and ConvergedKekFromConfigMap parser that reads converged-kek-id and optional converged flag from a ConfigMap; tests cover nil/empty data, flags, trimming, and lister lookup. |
| KMS rotation controller and annotation reconciliation: pkg/operator/encryption/controllers/kms_rotation_controller.go, pkg/operator/encryption/controllers/kms_rotation_controller_test.go | New ConvergedKEKReporter interface and NewKMSRotationController. Controller selects latest KMS write-key secret, queries convergence, and reconciles bootstrap, convergence clock (RFC3339), promotion after delay, and clears annotations. Uses optimistic retry updates. Tests cover mutators and time-dependent reconcile scenarios. |
| Migration controller, write-key selection and migrated KEK tracking: pkg/operator/encryption/controllers/migration_controller.go | Derives a write-key secret for current state, skips resources without write keys, rewrites destination write-key when KEK migration is needed, tracks kekMigrationComplete, and sets migrated-kek-id on the write-key secret with conflict-retry semantics. |
| Integration guards and controller wiring: pkg/operator/encryption/controllers.go, pkg/operator/encryption/controllers/key_controller.go, pkg/operator/encryption/controllers/key_controller_test.go, pkg/operator/encryption/statemachine/transition.go, pkg/operator/encryption/statemachine/transition_test.go | Wires the KMS rotation controller into the operator factory using the mock ConfigMap reporter; adds guard in needsNewKey to avoid minting new keys during KEK migration; statemachine STEP 4 early-exit prevents read-key pruning while migration is in flight; tests added/updated to assert no-op behavior during migration. |
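The ConfigMap-backed reporter row describes a parser that reads `converged-kek-id` plus an optional `converged` flag and trims whitespace. A minimal sketch of that idea follows; it is not the PR's `ConvergedKekFromConfigMap`. The function name `convergedKekFromData`, its return shape, and the flag semantics (a missing flag counts as converged, an unparsable flag counts as not converged) are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Key names follow the walkthrough; everything else in this file is hypothetical.
const (
	convergedKekIDKey = "converged-kek-id"
	convergedFlagKey  = "converged"
)

// convergedKekFromData returns the trimmed KEK ID and whether it should be
// treated as converged.
// Assumptions: a missing flag means "converged", and a flag that does not
// parse as a boolean means "not converged".
func convergedKekFromData(data map[string]string) (string, bool) {
	// Reading from a nil map is safe in Go and yields the empty string.
	id := strings.TrimSpace(data[convergedKekIDKey])
	if id == "" {
		return "", false
	}
	raw, ok := data[convergedFlagKey]
	if !ok {
		return id, true
	}
	converged, err := strconv.ParseBool(strings.TrimSpace(raw))
	if err != nil {
		return id, false
	}
	return id, converged
}

func main() {
	id, converged := convergedKekFromData(map[string]string{
		convergedKekIDKey: "  kek-2  ",
		convergedFlagKey:  "false",
	})
	fmt.Println(id, converged)
}
```

Returning the ID even when the flag is false lets a caller log which candidate is pending; whether the real parser does the same is not stated in the walkthrough.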

Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 13 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'NO-JIRA: kms rotation prototype' is vague and generic, using 'prototype' without clearly conveying the specific implementation details or primary changes made in this substantial pull request. | Revise the title to be more specific about the main change, such as 'Add KMS KEK rotation controller and migration support' or similar, that better describes the primary functionality being introduced. |
✅ Passed checks (13 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names in the PR are stable and deterministic. None of the test files use Ginkgo (the check's scope). All 7 test functions and 13 subtests/table-driven test cases use static, descriptive na... |
| Test Structure And Quality | ✅ Passed | The custom check requires review of Ginkgo test code, but all test files in this PR use standard Go testing (func TestXXX(t *testing.T)) with table-driven patterns, not Ginkgo framework. Check not... |
| Microshift Test Compatibility | ✅ Passed | PR adds only Go unit tests (TestXxx functions), not Ginkgo e2e tests. The check is designed for Ginkgo e2e tests and is not applicable here. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR adds only standard Go unit tests (testing.T), not Ginkgo e2e tests. The SNO compatibility check is not applicable as no Ginkgo tests are introduced. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR introduces only KMS rotation controller logic and helper utilities for encryption state management. No deployment manifests, pod specs, affinity rules, nodeSelectors, replica logic, or any sched... |
| Ote Binary Stdout Contract | ✅ Passed | PR does not violate OTE Binary Stdout Contract: no main(), init(), TestMain(), BeforeSuite(), or AfterSuite() in modified files; all klog calls are inside runtime functions, not process-level code. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | PR adds only standard Go unit tests (func Test*(t *testing.T)), not Ginkgo e2e tests. Check applies only to Ginkgo e2e tests, so not applicable. |
| No-Weak-Crypto | ✅ Passed | No weak cryptographic algorithms (MD5, SHA1, DES, RC4, 3DES, Blowfish, ECB), custom crypto implementations, or insecure secret comparisons found. String comparisons are of KEK ID identifiers, not c... |
| Container-Privileges | ✅ Passed | PR adds only 2 encryption-related manifest files: kms-preflight-pod.yaml (a simple Pod with basic resource requests, no privileged settings) and k8s_mock_kms_plugin_configmap.yaml (a ConfigMap, not... |
| No-Sensitive-Data-In-Logs | ✅ Passed | Logging statements log operational metadata (KEK ID identifiers, Kubernetes resource names) appropriate for KMS rotation debugging; no actual encryption keys, tokens, credentials, or sensitive PII... |


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/encryption/controllers/kms_rotation_controller.go`:
- Around line 164-174: The current logic can promote a KEK immediately if
kekMigration.KekConvergedAt is zero; add a zero-time guard before the delay
check: when convergedKekID == kekMigration.KekConvergedID, first test if
kekMigration.KekConvergedAt.IsZero() and if so call updateWriteKeySecret with
setKekConvergenceClock(...) to record c.now(); otherwise perform the existing
time-difference check (c.now().Sub(kekMigration.KekConvergedAt) >=
secrets.KekConvergenceDelay) and only then call
promoteConvergedKekToTarget(...). Ensure you use the same functions
(updateWriteKeySecret, setKekConvergenceClock, promoteConvergedKekToTarget) and
fields (convergedKekID, kekMigration.KekConvergedAt) from the diff.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 802cafc8-5bbd-4860-b024-4d7c8decbaaa

📥 Commits

Reviewing files that changed from the base of the PR and between 7fd5f33 and 09b25bc.

📒 Files selected for processing (15)
  • pkg/operator/apiserver/controllerset/apiservercontrollerset.go
  • pkg/operator/encryption/controllers.go
  • pkg/operator/encryption/controllers/key_controller.go
  • pkg/operator/encryption/controllers/key_controller_test.go
  • pkg/operator/encryption/controllers/kms_rotation_controller.go
  • pkg/operator/encryption/controllers/kms_rotation_controller_test.go
  • pkg/operator/encryption/controllers/migration_controller.go
  • pkg/operator/encryption/secrets/kek.go
  • pkg/operator/encryption/secrets/kek_test.go
  • pkg/operator/encryption/secrets/secrets.go
  • pkg/operator/encryption/secrets/types.go
  • pkg/operator/encryption/state/types.go
  • pkg/operator/encryption/statemachine/transition.go
  • pkg/operator/encryption/statemachine/transition_test.go
  • test/e2e-encryption/encryption_test.go

Comment thread pkg/operator/encryption/controllers/kms_rotation_controller.go

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/apiserver/controllerset/apiservercontrollerset.go`:
- Around line 406-410: The fluent setter
WithMOCK_ConfigMapConvergedKEKReporterForEncryptionControllers currently
dereferences cs.encryptionControllers by calling ConfigMapLister() eagerly;
instead, change the setter to only store the provided configMapName (and a flag)
on the APIServerControllerSet/encryptionControllerBuilder state and remove the
immediate call to kmshealth.NewMOCK_ConfigMapConvergedKEKReporter, then move
construction of the mock reporter into encryptionControllerBuilder.build() where
encryptionControllers and its informer listers are guaranteed to be initialized;
update build() to call
kmshealth.NewMOCK_ConfigMapConvergedKEKReporter(cs.encryptionControllers.kubeInformersForNamespaces.ConfigMapLister(),
storedConfigMapName) and assign the result to
encryptionControllers.convergedKEKReporter.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 009fd4e6-24f1-4242-98ec-7a8af91b7bb1

📥 Commits

Reviewing files that changed from the base of the PR and between 09b25bc and ea13d22.

📒 Files selected for processing (5)
  • pkg/operator/apiserver/controllerset/apiservercontrollerset.go
  • pkg/operator/encryption/controllers.go
  • pkg/operator/encryption/controllers/kms_rotation_controller.go
  • pkg/operator/encryption/kms/health/configmap_reporter.go
  • pkg/operator/encryption/kms/health/configmap_reporter_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • pkg/operator/encryption/controllers.go

Comment on lines +406 to +410
func (cs *APIServerControllerSet) WithMOCK_ConfigMapConvergedKEKReporterForEncryptionControllers(configMapName string) *APIServerControllerSet {
	cs.encryptionControllers.convergedKEKReporter = kmshealth.NewMOCK_ConfigMapConvergedKEKReporter(
		cs.encryptionControllers.kubeInformersForNamespaces.ConfigMapLister(),
		configMapName,
	)

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Avoid eager informer dereference in the fluent setter.

WithMOCK_ConfigMapConvergedKEKReporterForEncryptionControllers eagerly calls ConfigMapLister(). If this setter is invoked before WithEncryptionControllers, setup can panic on a zero-value informer holder. Defer mock reporter construction to encryptionControllerBuilder.build() and store only configMapName in the builder state.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/operator/apiserver/controllerset/apiservercontrollerset.go` around lines
406 - 410, The fluent setter
WithMOCK_ConfigMapConvergedKEKReporterForEncryptionControllers currently
dereferences cs.encryptionControllers by calling ConfigMapLister() eagerly;
instead, change the setter to only store the provided configMapName (and a flag)
on the APIServerControllerSet/encryptionControllerBuilder state and remove the
immediate call to kmshealth.NewMOCK_ConfigMapConvergedKEKReporter, then move
construction of the mock reporter into encryptionControllerBuilder.build() where
encryptionControllers and its informer listers are guaranteed to be initialized;
update build() to call
kmshealth.NewMOCK_ConfigMapConvergedKEKReporter(cs.encryptionControllers.kubeInformersForNamespaces.ConfigMapLister(),
storedConfigMapName) and assign the result to
encryptionControllers.convergedKEKReporter.
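The deferred-construction pattern suggested above can be illustrated with a stripped-down builder. This is a sketch under stated assumptions, not the actual `APIServerControllerSet`: the types `controllerSet`, `configMapLister`, `mapLister`, and `reporter`, and the methods `withMockReporter`, `withLister`, and `build`, are all hypothetical. The point it shows is that the fluent setter only records the ConfigMap name, and the lister is first touched in `build()`, so setter order no longer matters.

```go
package main

import (
	"errors"
	"fmt"
)

// configMapLister is a hypothetical stand-in for the informer-backed lister.
type configMapLister interface {
	Get(name string) (map[string]string, bool)
}

// mapLister is an in-memory lister used only for this sketch.
type mapLister map[string]map[string]string

func (m mapLister) Get(name string) (map[string]string, bool) {
	data, ok := m[name]
	return data, ok
}

// reporter is a hypothetical stand-in for the mock ConfigMap reporter.
type reporter struct {
	lister        configMapLister
	configMapName string
}

type controllerSet struct {
	lister                configMapLister // nil until withLister is called
	mockReporterConfigMap string
	useMockReporter       bool
}

// withMockReporter stores only the name; it never touches the lister.
func (cs *controllerSet) withMockReporter(configMapName string) *controllerSet {
	cs.mockReporterConfigMap = configMapName
	cs.useMockReporter = true
	return cs
}

func (cs *controllerSet) withLister(l configMapLister) *controllerSet {
	cs.lister = l
	return cs
}

// build constructs the reporter once all dependencies are known and reports
// a missing lister as an error instead of panicking.
func (cs *controllerSet) build() (*reporter, error) {
	if !cs.useMockReporter {
		return nil, nil
	}
	if cs.lister == nil {
		return nil, errors.New("mock reporter requested but no ConfigMap lister configured")
	}
	return &reporter{lister: cs.lister, configMapName: cs.mockReporterConfigMap}, nil
}

func main() {
	// The setter is called before the lister exists; this is the order that
	// could panic with eager construction.
	cs := (&controllerSet{}).withMockReporter("kms-converged").withLister(mapLister{})
	r, err := cs.build()
	fmt.Println(r.configMapName, err)
}
```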

Signed-off-by: Thomas Jungblut <[email protected]>
@tjungblu tjungblu force-pushed the kms_rotation_annotation branch from ea13d22 to e3c679d Compare June 11, 2026 11:38
openshift-ci Bot commented Jun 11, 2026

Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: tjungblu
Once this PR has been reviewed and has the lgtm label, please assign dgrisonnet for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@pkg/operator/encryption/controllers.go`:
- Around line 53-56: The code currently wires the test-only MOCK reporter
(NewMOCK_ConfigMapConvergedKEKReporter) into the production path used by
NewKMSRotationController and exposes ConvergedKekID(), so replace or gate the
mock: detect production runtime (env/config flag or build tag) and only
instantiate NewMOCK_ConfigMapConvergedKEKReporter for non-production/local;
otherwise construct and pass the real ConvergedKEKReporter (or return an
error/fallback that prevents using the mock) to NewKMSRotationController. Update
the convergedKEKReporter creation site to branch on that condition and ensure
ConvergedKekID() calls resolve against the real reporter in production.
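One way to gate the mock, sketched under assumptions: the interface name `convergedKEKReporter` and the method `ConvergedKekID` echo the review, but the method signature, the `selectReporter` helper, the `staticReporter` type, and the `allowMock` switch are hypothetical and not taken from the PR. The sketch prefers a real reporter, uses the mock only when explicitly allowed, and otherwise fails instead of silently falling back.

```go
package main

import (
	"errors"
	"fmt"
)

// convergedKEKReporter mirrors the interface name from the review; the
// method signature is an assumption made for this sketch.
type convergedKEKReporter interface {
	ConvergedKekID() (string, error)
}

// staticReporter is a hypothetical reporter that returns a fixed ID.
type staticReporter struct{ id string }

func (r staticReporter) ConvergedKekID() (string, error) { return r.id, nil }

// selectReporter returns the real reporter when one is available. The mock is
// used only when allowMock is set explicitly (for example from a hypothetical
// test-only flag), so production wiring cannot pick it up by accident.
func selectReporter(realReporter, mock convergedKEKReporter, allowMock bool) (convergedKEKReporter, error) {
	if realReporter != nil {
		return realReporter, nil
	}
	if allowMock && mock != nil {
		return mock, nil
	}
	return nil, errors.New("no converged KEK reporter configured")
}

func main() {
	mock := staticReporter{id: "mock-kek"}

	// Production-like wiring: no real reporter yet and the mock is not allowed.
	if _, err := selectReporter(nil, mock, false); err != nil {
		fmt.Println("error:", err)
	}

	// Local or test wiring: the mock is allowed explicitly.
	r, _ := selectReporter(nil, mock, true)
	id, _ := r.ConvergedKekID()
	fmt.Println(id)
}
```

Whether the switch should come from an environment variable, a config field, or a build tag is left open here, as it is in the review comment.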

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: 91c6cd51-666f-42da-8816-a2eb8cd28f65

📥 Commits

Reviewing files that changed from the base of the PR and between ea13d22 and e3c679d.

📒 Files selected for processing (4)
  • pkg/operator/encryption/controllers.go
  • pkg/operator/encryption/controllers/kms_rotation_controller.go
  • pkg/operator/encryption/kms/health/configmap_reporter.go
  • pkg/operator/encryption/kms/health/configmap_reporter_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
  • pkg/operator/encryption/kms/health/configmap_reporter_test.go
  • pkg/operator/encryption/kms/health/configmap_reporter.go
  • pkg/operator/encryption/controllers/kms_rotation_controller.go

Comment thread pkg/operator/encryption/controllers.go

Labels

  • do-not-merge/hold: Indicates that a PR should not merge because someone has issued a /hold command.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.

2 participants