
Make functional tests workspace-aware and stabilize local debug stack#11975

Closed
sylvainsf wants to merge 3 commits into main from sylvainsf/local-functional


@sylvainsf sylvainsf commented May 21, 2026

Contributor

Summary

Lets the Radius functional test suites — both corerp-noncloud and
corerp-cloud (Azure) — run against an OS-process Radius debug stack
(make debug-start) on an arbitrary k3d/kind cluster, instead of being
hard-wired to a kind-radius cluster and workspace. Also stabilizes a few
make debug-* targets that didn't reliably bring up their dependencies
on k3d.

Changes

Test infrastructure — workspace-awareness

  • test/functional-portable/corerp/util.go
    • GetSecretSuffix derives the resource group from the active workspace
      (cli.LoadConfig → GetWorkspace → ParseScope →
      FindScope("resourcegroups")) instead of hardcoding kind-radius.
      Falls back to default when nothing is configured.
    • backends.NewKubernetesBackend is constructed from the active workspace
      rather than an assumed context/namespace.
  • test/functional-portable/corerp/noncloud/resources/application_test.go
    • Test_ApplicationGraph PostStepVerify substitutes the fixture's
      kind-radius resource group with the active workspace's resource group
      before unmarshalling.
  • test/functional-portable/corerp/cloud/resources/recipe_terraform_test.go
    • Derives the resource ID from the active workspace scope so it works
      against any RG (CI's kind-radius and local debug's default).
  • test/functional-portable/corerp/cloud/resources/extender_test.go,
    test/rp/rptest.go, test/ucp/ucptest.go, test/validation/shared.go,
    test/functional-portable/cli/noncloud/cli_test.go,
    test/functional-portable/corerp/noncloud/resources/testdata/corerp-resources-simulatedenv.bicep
    — incidental cleanups required to run the suites against a
    non-kind-radius workspace; skip AWS-only tests cleanly when AWS env
    vars are unset; skip private-git redis test when GH_TOKEN is unset.
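
The workspace-aware lookup described in this list can be pictured with a small standalone sketch. It is not the code in util.go: the PR goes through the Radius CLI config and resource-ID parser, whereas `resourceGroupFromScope` below is a hypothetical helper that only assumes a scope of the form `/planes/radius/local/resourceGroups/<name>` and a caller-supplied fallback.

```go
package main

import (
	"fmt"
	"strings"
)

// resourceGroupFromScope returns the segment that follows "resourcegroups"
// (matched case-insensitively) in a UCP-style scope string. It returns
// fallback when the scope has no such segment.
// Hypothetical helper for illustration; not part of the Radius code base.
func resourceGroupFromScope(scope, fallback string) string {
	segments := strings.Split(strings.Trim(scope, "/"), "/")
	for i := 0; i+1 < len(segments); i++ {
		if strings.EqualFold(segments[i], "resourcegroups") && segments[i+1] != "" {
			return segments[i+1]
		}
	}
	return fallback
}

func main() {
	// Scope strings below are assumed examples, not taken from the PR.
	fmt.Println(resourceGroupFromScope("/planes/radius/local/resourceGroups/kind-radius", "default"))
	fmt.Println(resourceGroupFromScope("/planes/radius/local", "default"))
}
```

Passing the fallback in explicitly keeps the "nothing configured" behaviour visible at the call site instead of burying a cluster name in the helper.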

Azure-cloud functional tests against a local OS-process stack

  • build/scripts/azure-local-testenv.sh — new orchestrator with
    setup/run/teardown/all sub-commands. run and all accept
    passthrough go test flags (e.g. -run, -v).
    • Auto-recovery: run rebuilds state from the newest
      radlocal-${USER}-* resource group when the state file is missing
      (e.g. after make debug-stop), and re-applies the Azure scope on the
      default rad environment that debug-start wipes.
    • Orphan GC: teardown --all-orphans deletes every
      radlocal-${USER}-* RG and stops the tf-module-server
      port-forward.
  • pkg/recipes/terraform/config/providers/azure.go — Terraform Azure
    provider falls back to use_cli = true when no Azure credential is
    registered with UCP (404), so the host RP's az login session
    authenticates. CI workload-identity path is unchanged.
  • build/scripts/start-radius.sh — exports
    TERRAFORM_TEST_GLOBAL_DIR so the RP no longer tries to write to a
    read-only /terraform.
  • build/scripts/ensure-encryption-key.sh (new) — generates a stable
    encryption key for the local stack.

make debug-* reliability

  • build/debug.mk
    • debug-install-contour: drop Helm --wait (it doesn't behave for
      LoadBalancer Services on k3d) and instead do explicit
      kubectl wait --for=condition=Available + kubectl rollout status,
      so the target only returns once Contour is actually serving.
    • debug-install-tf-module-server: deploy the in-cluster nginx test
      module server and port-forward it to localhost:8999; add a curl
      readiness probe so subsequent recipe pulls don't race the pod
      becoming Ready.
  • build/test.mk, build/recipes.mk,
    .github/scripts/publish-recipes.sh,
    build/scripts/mirror-test-images.sh (new) — companion glue for
    running the suite locally with mirrored images and locally published
    recipes (the publish script learns PLAIN_HTTP for localhost:5000
    pushes).

Misc

  • pkg/azure/clientv2/unfold.go,
    pkg/corerp/frontend/controller/applications/updatefilter.go,
    pkg/recipes/engine/engine.go — small adjustments surfaced while
    running the suites end-to-end.
  • pkg/corerp/frontend/controller/applications/testbicep_scan_test.go
    (new) — small scan test added during investigation.
  • .gitignore — ignore local debug artifacts and the local-only
    bicepconfig.json override that make debug-publish-bicep-types
    writes.

Documentation

  • docs/contributing/contributing-code/contributing-code-debugging/radius-os-processes-debugging.md
    documents running the Azure cloud suite against the local OS-process
    stack via make debug-start + azure-local-testenv.sh.

How to use locally

make debug-start
make debug-install-contour
make debug-install-tf-module-server

# corerp-noncloud
go test -count=1 -timeout 30m \
  ./test/functional-portable/corerp/noncloud/resources/...

# corerp-cloud (Azure) against the same local stack, using host az login
build/scripts/azure-local-testenv.sh all

@github-actions

github-actions Bot commented May 21, 2026


Dependency Review

✅ No vulnerabilities or license issues or OpenSSF Scorecard issues found.

Scanned Files

None

@codecov

codecov Bot commented May 21, 2026


Codecov Report

❌ Patch coverage is 71.42857% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 51.81%. Comparing base (72050b2) to head (8be9a40).
⚠️ Report is 31 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pkg/recipes/engine/engine.go | 57.14% | 3 Missing ⚠️ |
| pkg/recipes/terraform/config/providers/azure.go | 40.00% | 2 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #11975      +/-   ##
==========================================
- Coverage   51.83%   51.81%   -0.02%     
==========================================
  Files         728      728              
  Lines       45960    45971      +11     
==========================================
- Hits        23824    23822       -2     
- Misses      19868    19876       +8     
- Partials     2268     2273       +5     


@github-actions

github-actions Bot commented May 21, 2026


Unit Tests

    2 files  ±0    429 suites  ±0   7m 28s ⏱️ -1s
5 200 tests +1  5 198 ✅ +1  2 💤 ±0  0 ❌ ±0 
6 278 runs  +1  6 276 ✅ +1  2 💤 ±0  0 ❌ ±0 

Results for commit 8be9a40. ± Comparison against base commit 72050b2.

♻️ This comment has been updated with latest results.

sylvainsf added 3 commits May 20, 2026 21:14
Adds a workflow for running the corerp/cloud Azure functional tests
against a local OS-process Radius stack (`make debug-start`) using the
host's `az login` credentials, with no service-principal/workload-identity
registration required.

Highlights

- New `build/scripts/azure-local-testenv.sh` orchestrator with
  `setup`, `run`, `teardown`, `all` sub-commands. `run` and `all` accept
  passthrough `go test` flags (e.g. `-run`, `-v`).
- Auto-recovery: `run` rebuilds state from the newest
  `radlocal-${USER}-*` resource group when the state file is missing
  (e.g. after `make debug-stop`), and re-applies the Azure scope on the
  default rad environment that `debug-start` wipes.
- Orphan GC: `teardown --all-orphans` deletes every
  `radlocal-${USER}-*` RG and stops the `tf-module-server` port-forward.
- `tf-module-server` bootstrap: deploys the in-cluster nginx test module
  server and port-forwards it to `localhost:8999` automatically when not
  already reachable.
- Terraform Azure provider falls back to `use_cli = true` when no Azure
  credential is registered with UCP (404), letting the host RP's
  `az login` session authenticate. CI workload-identity path is
  unchanged.
- `start-radius.sh` exports `TERRAFORM_TEST_GLOBAL_DIR` so the RP no
  longer tries to write to read-only `/terraform`.
- AWS-required tests skip cleanly via `t.Skip` when AWS env vars are
  unset; private-git redis test skips when `GH_TOKEN` is unset.
- `recipe_terraform_test.go` now derives the resource ID from the
  active workspace scope so it works against any RG (CI's `kind-radius`
  and local debug's `default`).

Tested

Full `corerp/cloud/...` suite green locally:
- PASS: `Test_AzureConnections`, `Test_ACI`, `Test_TerraformRecipe_AzureResourceGroup`
- SKIP: AWS-only tests, `Test_TerraformPrivateGitModule_KubernetesRedis`,
  `Test_Storage`/`Test_PersistentVolume` (issue #7853, pre-existing)

Documentation in
`docs/contributing/contributing-code/contributing-code-debugging/radius-os-processes-debugging.md`.

Signed-off-by: Sylvain Niles <[email protected]>
- GetSecretSuffix derives the resource group from the active workspace
  instead of hardcoding kind-radius, so tests pass on local debug stack.
- Test_ApplicationGraph rewrites fixture resource group at runtime to
  match the active workspace.
- debug.mk: install Contour and tf-module-server with explicit rollout
  checks (Helm --wait does not work on k3d for LoadBalancer services).
- Misc test/CLI cleanups for running corerp-noncloud against an
  OS-process Radius stack.
This broke pkg/recipes/driver/bicep Test_Bicep_GetRecipeMetadata_*,
which runs a fake HTTPS registry on 127.0.0.1. With the loopback
heuristic the driver issued http:// requests to an HTTPS server and
got '400 Bad Request' instead of the expected 'not found'.
@sylvainsf sylvainsf force-pushed the sylvainsf/local-functional branch from be17e1e to 8be9a40 Compare May 21, 2026 04:16
@radius-functional-tests

radius-functional-tests Bot commented May 21, 2026


Radius functional test overview

🔍 Go to test action run

Click here to see the test run details
| Name | Value |
|---|---|
| Repository | radius-project/radius |
| Commit ref | 8be9a40 |
| Unique ID | funcead67aebb7 |
| Image tag | pr-funcead67aebb7 |
  • gotestsum 1.13.0
  • KinD: v0.29.0
  • Dapr: 1.14.4
  • Azure KeyVault CSI driver: 1.4.2
  • Azure Workload identity webhook: 1.3.0
  • Bicep recipe location ghcr.io/radius-project/dev/test/testrecipes/test-bicep-recipes/<name>:pr-funcead67aebb7
  • Terraform recipe location http://tf-module-server.radius-test-tf-module-server.svc.cluster.local/<name>.zip (in cluster)
  • applications-rp test image location: ghcr.io/radius-project/dev/applications-rp:pr-funcead67aebb7
  • dynamic-rp test image location: ghcr.io/radius-project/dev/dynamic-rp:pr-funcead67aebb7
  • controller test image location: ghcr.io/radius-project/dev/controller:pr-funcead67aebb7
  • ucp test image location: ghcr.io/radius-project/dev/ucpd:pr-funcead67aebb7
  • deployment-engine test image location: ghcr.io/radius-project/deployment-engine:latest

Test Status

⌛ Building Radius and pushing container images for functional tests...
✅ Container images build succeeded
⌛ Publishing Bicep Recipes for functional tests...
✅ Recipe publishing succeeded
⌛ Starting ucp-cloud functional tests...
⌛ Starting corerp-cloud functional tests...
✅ ucp-cloud functional tests succeeded
✅ corerp-cloud functional tests succeeded

@sylvainsf sylvainsf marked this pull request as ready for review June 9, 2026 17:29
@sylvainsf sylvainsf requested review from a team as code owners June 9, 2026 17:29
Copilot AI review requested due to automatic review settings June 9, 2026 17:29
@sylvainsf

Contributor Author

Closing in favor of #11904, which contains every commit from this branch plus the postgres TIMESTAMPTZ pagination fix, the noncloud-test-learnings doc, additional debug-stack stabilization, and the CI workflow unblock. The PR description over there has been updated to a synthesis covering all of it.

@sylvainsf sylvainsf closed this Jun 9, 2026

Copilot AI left a comment

Contributor


Pull request overview

This PR makes Radius functional tests and the local OS-process debug stack more portable by removing assumptions about a hard-coded kind-radius setup and improving the reliability of the make debug-* workflow. It also adds a local Azure test orchestrator and adjusts Terraform/Azure credential handling to support running against a locally started stack.

Changes:

  • Make functional test helpers workspace-aware (derive scope/resource group from the active rad workspace; adjust fixtures/tests accordingly).
  • Add/extend local-dev tooling for Azure cloud tests and OS-process debug runs (new orchestrator scripts, Make targets, debug-stack stabilization).
  • Improve recipe engine/provider behavior and add guardrails/tests/docs discovered during end-to-end runs.

Reviewed changes

Copilot reviewed 24 out of 25 changed files in this pull request and generated 7 comments.

Show a summary per file
| File | Description |
|---|---|
| test/validation/shared.go | Avoid list-based read-after-write races; add local-dev bypass for cloud credential checks in tests. |
| test/ucp/ucptest.go | Prefer active workspace connection (supports UCP override for local OS-process stacks). |
| test/rp/rptest.go | Delete RP resources in reverse order to avoid environment deletion racing child cleanup. |
| test/functional-portable/corerp/util.go | Derive RG from workspace scope; rewrite resource IDs to match workspace RG for secret-suffix calculation. |
| test/functional-portable/corerp/noncloud/resources/testdata/corerp-resources-simulatedenv.bicep | Stop mutating shared default environment by using a uniquely named env/namespace. |
| test/functional-portable/corerp/noncloud/resources/application_test.go | Rewrite fixture RG on read based on active RootScope before asserting ApplicationGraph. |
| test/functional-portable/corerp/cloud/resources/recipe_terraform_test.go | Use workspace scope for resource IDs; skip private-Git terraform-module test locally when GH_TOKEN is unset. |
| test/functional-portable/corerp/cloud/resources/extender_test.go | Skip AWS log-group test when required AWS env vars aren’t present (instead of failing). |
| test/functional-portable/cli/noncloud/cli_test.go | Add a watchdog to prevent rad run streaming-log test from hanging indefinitely. |
| test/createAzureTestResources.bicep | Parameterize Cosmos account name (keep default) to support unique naming in local orchestration. |
| pkg/recipes/terraform/config/providers/azure.go | Fall back to Azure CLI auth (use_cli = true) when Radius-managed credentials are missing. |
| pkg/recipes/engine/engine.go | Treat “environment not found” (404) during recipe delete config-load as a successful no-op. |
| pkg/corerp/frontend/controller/applications/updatefilter.go | Improve bad-request error messaging for invalid app-scoped namespaces (include lengths). |
| pkg/corerp/frontend/controller/applications/testbicep_scan_test.go | Add scan test to prevent test Bicep from mutating shared default environment. |
| pkg/azure/clientv2/unfold.go | Preserve response body for repeated unfolding by resetting resp.Body after reading. |
| docs/contributing/contributing-code/contributing-code-debugging/radius-os-processes-debugging.md | Document local DE usage and local Azure functional test flow against OS-process stack. |
| build/test.mk | Auto-detect local debug registry/CLI and git-http backend; add local Azure functional test make targets; improve gotestsum jsonfile support. |
| build/scripts/start-radius.sh | Export a writable TERRAFORM_TEST_GLOBAL_DIR for OS-process runs on host filesystems. |
| build/scripts/mirror-test-images.sh | New helper to mirror multi-arch images to ghcr.io/radius-project/mirror/*. |
| build/scripts/ensure-encryption-key.sh | New helper to create a stable encryption-key secret for the OS-process debug stack. |
| build/scripts/azure-local-testenv.sh | New Azure local test orchestrator (setup/run/teardown/all) with state recovery and orphan GC. |
| build/recipes.mk | Support plain-http publishing for localhost recipe registry via PLAIN_HTTP. |
| build/debug.mk | Add debug registry/git backend/flux/contour/tf-module server/bicep types automation; improve debug-start and DE handling. |
| .gitignore | Ignore local debug artifacts (including local-only bicepconfig.json override). |
| .github/scripts/publish-recipes.sh | Add optional --plain-http when publishing recipes (for localhost registry). |

Comment on lines +105 to +112
	current := parsed.FindScope(resources_radius.ScopeResourceGroups)
	if current == "" || current == rg {
		return resourceID
	}
	// Case-insensitive replacement of the resourcegroups segment value while preserving the
	// rest of the ID exactly.
	return strings.ReplaceAll(resourceID, "/"+current+"/", "/"+rg+"/")
}
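
The code comment in this excerpt describes a case-insensitive replacement of the resource group segment, while `strings.ReplaceAll` matches case-sensitively and rewrites every `/<current>/` occurrence in the ID. One possible alternative, sketched here with the hypothetical helper `replaceResourceGroup`, walks the path segments and rewrites only the value that follows `resourcegroups`. The resource ID in the demo is an assumed example.

```go
package main

import (
	"fmt"
	"strings"
)

// replaceResourceGroup rewrites only the segment that follows the first
// "resourcegroups" segment (matched case-insensitively) and leaves every
// other segment of the ID untouched.
// Hypothetical helper; not the code under review.
func replaceResourceGroup(resourceID, rg string) string {
	segments := strings.Split(resourceID, "/")
	for i := 0; i+1 < len(segments); i++ {
		if strings.EqualFold(segments[i], "resourcegroups") {
			segments[i+1] = rg
			break
		}
	}
	return strings.Join(segments, "/")
}

func main() {
	id := "/planes/radius/local/resourceGroups/kind-radius/providers/Applications.Core/applications/kind-radius/revisions/1"
	fmt.Println(replaceResourceGroup(id, "default"))
}
```

In the demo ID the application is also named kind-radius; the segment walk leaves that name alone, whereas a blanket substring replacement would rewrite it as well.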
Comment on lines +53 to +56
	config, err := cli.LoadConfig("")
	if err != nil {
		return "kind-radius"
	}
Comment thread build/test.mk
GOTESTSUM_JSONFILE_DIR ?=
# Recursive '=' so $@ resolves in each recipe's context.
# We need the double dash here to separate the 'gotestsum' options from the 'go test' options.
GOTEST_TOOL = gotestsum $(GOTESTSUM_OPTS)$(if $(GOTESTSUM_JSONFILE_DIR), --jsonfile=$(GOTESTSUM_JSONFILE_DIR)/[email protected]) --
Comment thread build/debug.mk
Comment on lines +616 to +620
	@listener_cmd=""; \
	if command -v lsof >/dev/null 2>&1; then \
		listener_cmd=$$(lsof -nP -iTCP:5017 -sTCP:LISTEN 2>/dev/null | awk 'NR==2 {print $$1}'); \
	fi; \
	if [ -n "$$listener_cmd" ] && [ "$$listener_cmd" != "kubectl" ] && curl -s "http://localhost:5017/metrics" > /dev/null 2>&1; then \
Comment on lines +61 to +66
if ! command -v docker >/dev/null 2>&1; then
  echo "docker is required" >&2; exit 1
fi
if ! docker buildx version >/dev/null 2>&1; then
  echo "docker buildx is required (included with Docker Desktop)" >&2; exit 1
fi
Comment on lines +85 to +109
  require_cmd kubectl make
  if curl -sf -o /dev/null -m 2 http://localhost:8999/azure-rg.zip; then
    log "tf-module-server already reachable at http://localhost:8999"
    return 0
  fi
  if ! kubectl get ns "${TF_MODULE_SERVER_NS}" >/dev/null 2>&1 \
    || ! kubectl -n "${TF_MODULE_SERVER_NS}" get deploy tf-module-server >/dev/null 2>&1; then
    log "Deploying tf-module-server into the debug cluster (publish-test-terraform-recipes)..."
    (cd "${REPO_ROOT}" && make publish-test-terraform-recipes >/dev/null) \
      || { err "make publish-test-terraform-recipes failed"; exit 1; }
  fi
  log "Waiting for tf-module-server rollout..."
  kubectl -n "${TF_MODULE_SERVER_NS}" rollout status deploy/tf-module-server --timeout=120s >/dev/null \
    || { err "tf-module-server rollout did not become ready"; exit 1; }
  # Stop any stale port-forward before starting a new one.
  if [[ -f "${TF_MODULE_SERVER_PORT_FORWARD_PID_FILE}" ]]; then
    local old_pid
    old_pid="$(cat "${TF_MODULE_SERVER_PORT_FORWARD_PID_FILE}" 2>/dev/null || true)"
    if [[ -n "${old_pid}" ]] && kill -0 "${old_pid}" 2>/dev/null; then
      kill "${old_pid}" 2>/dev/null || true
    fi
    rm -f "${TF_MODULE_SERVER_PORT_FORWARD_PID_FILE}"
  fi
  log "Starting kubectl port-forward svc/tf-module-server 8999:80 -n ${TF_MODULE_SERVER_NS}"
  ( kubectl -n "${TF_MODULE_SERVER_NS}" port-forward svc/tf-module-server 8999:80 \
Comment on lines +146 to +149
	if clientv2.Is404Error(err) {
		logger.Info("environment not found while loading recipe configuration for delete; treating as no-op")
		return nil, nil
	}