feat(storage): add csi driver health tests#523
Conversation
Signed-off-by: Alexandra Bueno <[email protected]>
📝 WalkthroughWalkthroughAdds a new ChangesCSI Driver Health Check
Estimated code review effort: 3 (Moderate) | ~25 minutes Sequence Diagram(s)sequenceDiagram
participant Suite as k8s_storage Suite
participant Check as K8sCsiDriverHealthCheck
participant K8s as Kubernetes API
participant CSIDriver
participant Workloads as Deployment/DaemonSet
Suite->>Check: run() with configured drivers
Check->>K8s: get storageclass -o json
K8s-->>Check: StorageClass list with provisioners
Check->>Check: group StorageClasses by provisioner
loop for each provisioner
Check->>CSIDriver: verify csidriver registered
CSIDriver-->>Check: registration result
alt workloads configured
Check->>Workloads: get deployment (controller)
Workloads-->>Check: replicas/readyReplicas
Check->>Workloads: get daemonset (node)
Workloads-->>Check: desired/available/ready
else no workloads
Check->>Check: mark controller/node subtests skipped
end
end
Check-->>Suite: subtest results
Related PRs: None identified. Suggested labels: validation, k8s-storage, tests Suggested reviewers: None identified. 🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
isvtest/tests/test_k8s_storage.py (1)
2214-2215: 📐 Maintainability & Code Quality | 🔵 Trivial | ⚡ Quick winAdd docstrings to the non-test helpers.
_make()and_workloads()are helper methods, not test entrypoints, so they still need PEP 257 docstrings under the repo's Python rules. As per coding guidelines, "Every function and class must have docstrings following PEP 257".Also applies to: 2254-2264
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@isvtest/tests/test_k8s_storage.py` around lines 2214 - 2215, The non-test helper methods _make() and _workloads() in the K8s CSI driver test helpers are missing required PEP 257 docstrings. Add concise docstrings directly on those helper definitions (and any related helper in the referenced _workloads block) so they comply with the repo rule that every function and class must have docstrings.Source: Coding guidelines
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@isvtest/src/isvtest/validations/k8s_storage.py`:
- Around line 2144-2149: The readiness check in the deployment validation still
enforces full rollout by comparing ready replicas to spec.replicas, which
conflicts with the intended min_controller_replicas threshold. Update the logic
in the Deployment validation path to use min_controller_replicas as the
readiness comparison after the existing desired < min_controller_replicas guard,
and adjust the failure message in that same block to reflect the threshold-based
check using the deployment validation symbols like desired, ready, and
min_replicas.
- Around line 2050-2053: The storage class aggregation in k8s_storage validation
is collapsing duplicate entries too early, causing later workloads for the same
class to be ignored before conflict detection. Update the logic around the
storage_classes loop in the validation path so duplicate StorageClass specs are
preserved until after provisioner grouping, and let the per-provisioner conflict
checks see all entries. Use the sc_to_workloads aggregation and the downstream
conflict-detection code to keep behavior consistent, and add a regression test
covering the same-StorageClass but different workloads/controller-node names
case.
---
Nitpick comments:
In `@isvtest/tests/test_k8s_storage.py`:
- Around line 2214-2215: The non-test helper methods _make() and _workloads() in
the K8s CSI driver test helpers are missing required PEP 257 docstrings. Add
concise docstrings directly on those helper definitions (and any related helper
in the referenced _workloads block) so they comply with the repo rule that every
function and class must have docstrings.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 8630636e-4f84-4cee-aa80-1ead975010e6
📒 Files selected for processing (3)
isvctl/configs/suites/k8s.yamlisvtest/src/isvtest/validations/k8s_storage.pyisvtest/tests/test_k8s_storage.py
| for raw_sc in spec.get("storage_classes") or []: | ||
| sc = str(raw_sc).strip() | ||
| if sc and sc not in sc_to_workloads: | ||
| sc_to_workloads[sc] = workloads |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Don't collapse duplicate StorageClass entries before conflict detection.
Deduping storage_classes here drops later workloads blocks for the same StorageClass, so two specs that point at the same class but different controller/node names silently validate whichever entry appeared first instead of tripping the per-provisioner conflict path. Please preserve duplicate specs until after provisioner grouping, and add a regression test for the same-StorageClass/different-workloads case.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@isvtest/src/isvtest/validations/k8s_storage.py` around lines 2050 - 2053, The
storage class aggregation in k8s_storage validation is collapsing duplicate
entries too early, causing later workloads for the same class to be ignored
before conflict detection. Update the logic around the storage_classes loop in
the validation path so duplicate StorageClass specs are preserved until after
provisioner grouping, and let the per-provisioner conflict checks see all
entries. Use the sc_to_workloads aggregation and the downstream
conflict-detection code to keep behavior consistent, and add a regression test
covering the same-StorageClass but different workloads/controller-node names
case.
| if desired < min_replicas: | ||
| return False, ( | ||
| f"Deployment {namespace}/{name} has spec.replicas={desired} < min_controller_replicas={min_replicas}" | ||
| ) | ||
| if ready != desired: | ||
| return False, (f"Deployment {namespace}/{name}: readyReplicas={ready} != spec.replicas={desired}") |
There was a problem hiding this comment.
🎯 Functional Correctness | 🟠 Major | ⚡ Quick win
Use min_controller_replicas as the readiness threshold.
After the desired < min_controller_replicas guard, the ready != desired check still fails any controller that is above the configured minimum but not fully rolled out. With min_controller_replicas: 1, a 3-replica Deployment with 2 Ready replicas fails even though it satisfies the threshold this option advertises. Compare ready against min_controller_replicas here, or rename the setting/docs if full readiness is the real requirement.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@isvtest/src/isvtest/validations/k8s_storage.py` around lines 2144 - 2149, The
readiness check in the deployment validation still enforces full rollout by
comparing ready replicas to spec.replicas, which conflicts with the intended
min_controller_replicas threshold. Update the logic in the Deployment validation
path to use min_controller_replicas as the readiness comparison after the
existing desired < min_controller_replicas guard, and adjust the failure message
in that same block to reflect the threshold-based check using the deployment
validation symbols like desired, ready, and min_replicas.
K8sCsiDriverHealthCheck
What it tests: Verifies that the CSI drivers backing the configured StorageClasses are installed and running in the cluster.
For each unique provisioner derived from the configured StorageClasses, it runs:
storageclass-found[<sc>]spec.provisioner)csidriver-registered[<provisioner>]CSIDriver/<provisioner>object exists in the clustercontroller-deployment-healthy[<provisioner>]spec.replicas >= min_controller_replicas(default 1) and all replicas Readynode-daemonset-healthy[<provisioner>]Success criteria: Every configured StorageClass resolves to a registered
CSIDriver; the controller and node workload subtests pass when aworkloadsblock is configured (otherwise they are reported as skipped, not failed).Configuration notes:
driversis a list of specs, each anchored on one or morestorage_classes. The whole check skips (passes) when no StorageClasses resolve.workloadsblock (namespace,controller.deployment,node.daemonset) enables the controller/node health subtests. If omitted, it checks sc registration only.min_controller_replicas(default1) sets the minimum Ready controller replicas.Summary by CodeRabbit
New Features
Bug Fixes