Add Repo Radius deploy workflow#12243
Dependency Review: ✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found. Scanned files: None.
Codecov Report: ✅ All modified and coverable lines are covered by tests.

Coverage diff, main vs. #12243:

            main     #12243   +/-
Coverage    52.89%   52.88%   -0.01%
Files       751      751
Lines       48353    48353
Hits        25574    25572    -2
Misses      20383    20384    +1
Partials    2396     2397     +1
Port the Repo Radius deploy workflow from the github-extension prototype into .github/extension/ and adapt it to the merged building blocks: multi-cluster v1 (global.targetCluster seam) and externalized state (rad startup / rad shutdown).

- radius-deploy.yml: ephemeral k3d control plane, restore state with rad startup, run the dispatched radius_commands, persist state with rad shutdown, tear down. Honors the workflow_dispatch contract (environment + radius_commands as a single string or JSON array; rad prefix omitted; stop-on-first-failure) and uploads per-command output as the radius-output artifact.
- Credentials are provider-native, not registered: AWS OIDC session creds are injected as pod env vars; Azure uses Workload Identity with the GitHub Actions OIDC JWT projected as the federated token file. rad env update sets scope only.
- radius-verify-credentials.yml: port the verify workflow so the contract is testable without a deploy.
- README.md: document both workflows and the RADIUS_TARGET_KUBECONFIG / credential / state-persistence contract.
- eng/design-notes: add the deploy-workflow technical design (Investment 3).

Related: #12118
Signed-off-by: Sylvain Niles <[email protected]>
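The phase ordering this commit describes for radius-deploy.yml can be sketched as follows. This is a minimal illustration, not the workflow itself: `step` is a hypothetical callback standing in for a workflow step, and the two behaviours marked as assumptions (persisting state after a failed command, always tearing down) are not stated in the commit message.

```python
from typing import Callable


def run_deploy_lifecycle(step: Callable[[str], bool]) -> bool:
    """Run the phases in the order the commit message lists them.

    `step` is a hypothetical callback that executes one named phase and
    returns True on success.
    """
    if not step("create k3d control plane"):
        return False
    try:
        # Restore durable state before any command runs.
        if not step("rad startup"):
            return False
        commands_ok = step("run radius_commands")
        # Assumption: state is persisted even if a command failed.
        persisted = step("rad shutdown")
        return commands_ok and persisted
    finally:
        # Assumption: the ephemeral control plane is always torn down.
        step("tear down control plane")
```

Passing a callback that records phase names makes the ordering easy to check without a cluster.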
Remove references to the external prototype project and speak in generic terms (an earlier proof of concept, a separate project) so the design does not rely on naming an external repository.

Related: #12118
Signed-off-by: Sylvain Niles <[email protected]>
…ated

Re-point the `rad startup` / `rad shutdown` state-storage lifecycle test at this PR's deploy workflow model and remove the `RADIUS_STATE_E2E` gate so it runs in CI on its own dedicated, isolated cluster.

- Add a `statestore-noncloud` leg to the non-cloud functional matrix. Each matrix leg runs on its own runner with its own KinD cluster, so the test's destructive install/uninstall/reinstall cycle never affects other legs. The shared "Install Radius" step is skipped for this leg because the test drives its own install.
- Drive install with the build-under-test images (chart + per-RP image flags from testutil.SetDefault, DE_IMAGE/DE_TAG, and the secure local registry CA), mirroring the shared Install Radius step, plus `database.enabled=true` for the state backend.
- Harden the lifecycle against the flakes seen in the upgrade test (#12245): replace fixed sleeps with polling. Wait for the control plane, treating 503 from the UCP aggregated APIService as retryable, and poll discovery until `api.ucp.dev/v1alpha3` deregisters before reinstalling so the next install doesn't race the teardown.
- Add the `test-functional-statestore-noncloud` make target and a 40m timeout for the leg.

Related: #12118
Signed-off-by: Sylvain Niles <[email protected]>
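The "polling instead of fixed sleeps" hardening can be sketched like this. It is an illustration in Python rather than the test's Go code; `probe`, the default timeout, and the interval are hypothetical, and the only behaviour taken from the commit message is that a 503 is retried rather than treated as a failure.

```python
import time
from typing import Callable

RETRYABLE_STATUSES = {503}  # the aggregated APIService may not be serving yet


def wait_for_ready(
    probe: Callable[[], int],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll `probe` (returns an HTTP status code) until it reports success.

    Returns True on a 2xx status, False if the deadline passes while the
    status is still retryable, and raises on any other status.
    """
    deadline = clock() + timeout_s
    while True:
        status = probe()
        if 200 <= status < 300:
            return True
        if status not in RETRYABLE_STATUSES:
            raise RuntimeError(f"unexpected status {status}")
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

Injecting `sleep` and `clock` keeps the loop testable without real waiting; the same loop shape would fit the deregistration poll, with the probe inverted.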
1590df2 to 122d84e
Functional Tests - statestore-noncloud: 1 test, 1 ✅, 3m 1s ⏱️. Results for commit 15d7e4d. ♻️ This comment has been updated with latest results.
…kage

The statestore test lives one directory deeper than the upgrade test it was modeled on (statestore/noncloud vs upgrade), so the chart path needs four ../ segments to reach the repo root, not three. CI failed with 'stat ../../../deploy/Chart: no such file or directory'.

Signed-off-by: Sylvain Niles <[email protected]>
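The off-by-one in the relative path can be reproduced with a few lines. The directory layout below is an assumption pieced together from this PR (test/functional-portable/upgrade and test/functional-portable/statestore/noncloud, with the chart at deploy/Chart), and `/repo` is a hypothetical checkout location.

```python
import posixpath

REPO_ROOT = "/repo"  # hypothetical checkout location
CHART = posixpath.join(REPO_ROOT, "deploy/Chart")


def chart_path_from(test_dir: str) -> str:
    """Relative path from a test package directory to the chart."""
    return posixpath.relpath(CHART, posixpath.join(REPO_ROOT, test_dir))


# Assumed package directories, per the commit message.
upgrade = chart_path_from("test/functional-portable/upgrade")
statestore = chart_path_from("test/functional-portable/statestore/noncloud")

print(upgrade)     # three ../ segments
print(statestore)  # four ../ segments
```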
…dent

waitForControlPlane called rp.NewRPTestOptions, which requires a configured rad workspace. It ran inside installRadius (before the test creates the workspace), so it panicked with 'default workspace is not set' every iteration and looped until the timeout. Poll the radius-system Deployments for the Available condition via the Kubernetes client instead. This is the real, workspace-independent readiness signal, matching the workflow's 'kubectl wait --for=condition=Available'.

Signed-off-by: Sylvain Niles <[email protected]>
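The readiness signal described here reduces to a check over each Deployment's status conditions. The sketch below works on plain dictionaries shaped like the items of `kubectl get deployments -o json`; it is not the test's Go code, and the function names are hypothetical.

```python
from typing import Any, Dict, List


def is_available(deployment: Dict[str, Any]) -> bool:
    """True if the Deployment reports condition Available=True."""
    conditions = deployment.get("status", {}).get("conditions") or []
    return any(
        c.get("type") == "Available" and c.get("status") == "True"
        for c in conditions
    )


def all_available(deployments: List[Dict[str, Any]]) -> bool:
    """True only if there is at least one Deployment and all are Available.

    An empty list counts as not ready, so a poll that runs before the
    chart has created anything does not pass by accident.
    """
    return bool(deployments) and all(is_available(d) for d in deployments)
```

A polling loop would call `all_available` on the Deployments in the radius-system namespace until it returns True or a deadline passes.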
Two CI failures after install/deploy succeeded:

- 'rad uninstall' prompted for confirmation and failed opening /dev/tty in CI. Pass --yes for non-interactive teardown.
- 'rad shutdown' tried to push the radius-state branch to the checkout's GitHub origin, which has no push credentials in CI. Run shutdown/startup from a dedicated throwaway git repo with no remote (via a second CLI with WorkingDirectory set) so gitstate commits state locally only, which is the design's supported local/test case. Both commands share the same repo so the state committed by shutdown survives into startup.

Signed-off-by: Sylvain Niles <[email protected]>
Radius functional test overview
Test Status: ⌛ Building Radius and pushing container images for functional tests...
Closing as superseded. This draft's content has been carried forward (and further evolved) into sk593's #12250 (
Description
Lands the Repo Radius deploy workflow in the repository and adapts it to the
building blocks that have since merged, so the multi-cluster + state-storage work
has a stable, in-repo consumer to validate against and frontends can drive Repo
Radius without relying on any external project.
The deploy workflow previously existed only as a generated string produced
outside Radius, where the contract it depends on had no reviewed home. This PR
lands it in-tree under `.github/extension/` and rewires it onto the merged seams
instead of patching the RP/DE deployments by hand.
What's included
- `.github/extension/radius-deploy.yml`: the deploy workflow. It creates an ephemeral k3d control plane, restores durable state with `rad startup`, runs the dispatched `rad` commands against the user's external AKS/EKS cluster, persists state again with `rad shutdown`, and tears the control plane down.
  - Honors the `workflow_dispatch` contract: `environment` + `radius_commands` (a single command string or a JSON array, `rad` prefix omitted, run in order, stop on first failure). Each command's output is uploaded as the `radius-output` artifact for incremental polling.
  - Uses the multi-cluster v1 seam (`--set global.targetCluster.enabled=true`) rather than patching the RP/DE deployments by hand. `RADIUS_TARGET_KUBECONFIG` now drives both Bicep and Terraform, so the separate `KUBE_CONFIG_PATH` variable is no longer needed.
  - Credentials are provider-native: AWS OIDC session creds are injected as pod env vars; Azure uses Workload Identity, with the client/tenant IDs registered and the GitHub Actions OIDC JWT projected into the pods as the federated token file. `rad env update` sets scope only.
- `.github/extension/radius-verify-credentials.yml`: ports the companion verify workflow so the contract is testable without a full deploy.
- `.github/extension/README.md`: documents both workflows and the `RADIUS_TARGET_KUBECONFIG` / credential / state-persistence contract.
- `eng/design-notes/environments/2026-06-repo-radius-deploy-workflow.md`: the deploy-workflow technical design (Investment 3 of the Repo Radius feature spec).
- `test/functional-portable/statestore/.../statestore_lifecycle_test.go`: the `rad startup` / `rad shutdown` lifecycle test, re-pointed at this workflow's install model and un-gated to run in CI on its own dedicated, isolated KinD cluster (new `statestore-noncloud` matrix leg + make target). Hardened against the upgrade-test flakes (Fix flaky upgrade test: replace fixed sleeps with polling and increase timeouts #12245): polls for control-plane readiness (503-tolerant) and for aggregated APIService deregistration instead of sleeping.
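The `radius_commands` half of the `workflow_dispatch` contract (single string or JSON array, `rad` prefix omitted, run in order, stop on first failure) could be handled along these lines. This is a sketch under assumptions, not the workflow's actual script: the function names are hypothetical, `runner` stands in for whatever executes a command and returns its exit code, and treating any input that starts with `[` as a JSON array is a guess at how the two forms are told apart.

```python
import json
import shlex
from typing import Callable, List, Sequence, Tuple


def parse_radius_commands(raw: str) -> List[str]:
    """Accept a single command string or a JSON array of command strings."""
    text = raw.strip()
    if not text:
        return []
    if text.startswith("["):  # assumption: a leading '[' means a JSON array
        parsed = json.loads(text)
        if not isinstance(parsed, list) or not all(isinstance(c, str) for c in parsed):
            raise ValueError("radius_commands JSON must be an array of strings")
        return [c.strip() for c in parsed if c.strip()]
    return [text]


def run_commands(
    commands: Sequence[str],
    runner: Callable[[List[str]], int],
) -> List[Tuple[str, int]]:
    """Run commands in order, prepending `rad`, stopping on the first failure."""
    results: List[Tuple[str, int]] = []
    for command in commands:
        exit_code = runner(["rad", *shlex.split(command)])
        results.append((command, exit_code))
        if exit_code != 0:
            break
    return results
```

Returning the per-command results, including the failing one, would give the caller what it needs to write one output file per command for the `radius-output` artifact.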
In scope / out of scope
- In scope: … `rad startup` / `rad shutdown` lifecycle while preserving the `workflow_dispatch` contract; porting the verify workflow so it can be exercised without a deploy.
- Out of scope: … provisioning; mid-run cloud-token refresh (an accepted limitation: a long Azure run may outlive the one-time token exchange).
Status
- … (`rad startup` / `rad shutdown` + `database.enabled=true` chart wiring) has merged; this branch is rebased onto it and the statestore lifecycle test is wired into CI and un-gated.
- … token refresh remain out of scope (tracked separately).
Type of change
Related: #12118
Contributor checklist
Please verify that the PR meets the following requirements, where applicable:
… `eng/design-notes/` in this repository, if new APIs are being introduced.