USHIFT-6796: C2CC: DNS forwarding between clusters #6638

Open
pmtk wants to merge 7 commits into openshift:main from pmtk:c2cc/coredns

Conversation

@pmtk pmtk commented May 8, 2026

Summary by CodeRabbit

  • New Features

    • Added configurable DNS cache TTL and negative TTL settings for cluster-to-cluster remote cluster DNS resolution, enabling fine-tuned control over DNS caching behavior for improved cross-cluster connectivity performance.
  • Documentation

    • Updated configuration documentation examples to include new DNS cache settings for cluster-to-cluster connectivity configuration.

openshift-ci Bot commented May 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 8, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 8, 2026
openshift-ci-robot commented May 8, 2026

@pmtk: This pull request references USHIFT-6796 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

openshift-ci Bot commented May 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pmtk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

coderabbitai Bot commented May 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 9a1ea3a2-c002-43fe-8a07-44cfa89e2c66

📥 Commits

Reviewing files that changed from the base of the PR and between e2ed4a9 and 713be41.

📒 Files selected for processing (16)
  • cmd/generate-config/config/config-openapi-spec.json
  • docs/user/howto_config.md
  • packaging/microshift/config.yaml
  • pkg/components/controllers.go
  • pkg/config/c2cc.go
  • pkg/config/c2cc_test.go
  • pkg/config/config.go
  • test/resources/c2cc.resource
  • test/scenarios-bootc/el10/presubmits/[email protected]
  • test/scenarios-bootc/el9/presubmits/[email protected]
  • test/suites/c2cc/01-sanity.robot
  • test/suites/c2cc/02-infrastructure.robot
  • test/suites/c2cc/03-connectivity.robot
  • test/suites/c2cc/04-dns.robot
  • test/suites/c2cc/05-reconciliation.robot
  • test/suites/c2cc/06-cleanup.robot
✅ Files skipped from review due to trivial changes (2)
🚧 Files skipped from review as they are similar to previous changes (5)
  • cmd/generate-config/config/config-openapi-spec.json
  • pkg/config/config.go
  • pkg/components/controllers.go
  • pkg/config/c2cc_test.go
  • pkg/config/c2cc.go

Walkthrough

This PR adds C2CC cross-cluster DNS cache configuration. It introduces configurable TTLs for CoreDNS positive/negative caching, computes upstream DNS IPs from remote cluster service networks, generates CoreDNS server blocks with cache directives, and validates the complete flow with unit tests and end-to-end Robot Framework tests.

Changes

Cluster-to-Cluster DNS Configuration

| Layer | File(s) | Summary |
|---|---|---|
| DNS Configuration Types & Validation | pkg/config/c2cc.go, pkg/config/c2cc_test.go | New C2CCDNS struct with CacheTTL and CacheNegativeTTL fields; ResolvedRemoteCluster gains DNSIP field computed from service network. Validation rejects negative TTL values and errors on DNS IP derivation failures. |
| DNS Block Rendering | pkg/config/c2cc.go, pkg/config/c2cc_test.go | RenderC2CCDNSBlocks updated to accept cache parameters, emit CoreDNS server blocks only for entries with domains, and embed cache/denial directives using provided TTL integers. |
| Configuration Defaults & Schema | pkg/config/config.go, cmd/generate-config/config/config-openapi-spec.json, packaging/microshift/config.yaml, docs/user/howto_config.md | Config initialization sets DNS cache defaults to 10; user settings backfill missing TTL pointers. OpenAPI schema requires dns object with non-negative TTL integers. Default config.yaml and documentation updated with examples. |
| Controller & CoreDNS Template Integration | pkg/components/controllers.go | startDNSController wires C2CCDNSBlocks render parameter, populated via RenderC2CCDNSBlocks when C2CC is enabled, for inclusion in CoreDNS Corefile. |
| Robot Framework Test Infrastructure & E2E Tests | test/resources/c2cc.resource, test/scenarios-bootc/el9/presubmits/[email protected], test/scenarios-bootc/el10/presubmits/[email protected], test/suites/c2cc/03-connectivity.robot, test/suites/c2cc/04-dns.robot | New keywords for per-cluster namespace creation and Corefile validation; updated connectivity suite to use namespace mappings and assert source IP preservation; new DNS suite validates Corefile server blocks, DNS resolution, and HTTP reachability with proper response content. Test scenario scripts updated to include DNS test suite. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • openshift/microshift#6545: Extends the C2CC implementation by adding DNSIP computation and rendering of C2CC CoreDNS blocks with cache TTL parameters.

Suggested labels

lgtm, verified

Suggested reviewers

  • kasturinarra
  • eslutsky
  • vanhalenar
🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 4.76% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (11 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely describes the main feature being implemented: DNS forwarding between clusters in C2CC, which aligns with the extensive changes across configuration, testing, and documentation files. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names are stable and deterministic. Go tests use static table names; Robot Framework tests use descriptive static strings without dynamic values like pod names, IPs, timestamps, or UUIDs. |
| Test Structure And Quality | ✅ Passed | No Ginkgo tests in PR. Repo uses Go testing with testify. New c2cc_test.go tests properly follow patterns: table-driven tests, helpers, and meaningful assertions. |
| Microshift Test Compatibility | ✅ Passed | No Ginkgo e2e tests are added in this PR. All new/modified tests are Robot Framework tests or Go unit tests, making the Ginkgo-specific check inapplicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No Ginkgo e2e tests added. Check targets Ginkgo tests (It(), Describe()), not Robot Framework tests or Go unit tests. Not applicable to this PR. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | PR adds C2CC DNS forwarding via config changes and test coverage. No deployment manifests with affinity rules, nodeSelector constraints, or topology-dependent logic were added or modified. |
| Ote Binary Stdout Contract | ✅ Passed | No stdout writes detected in modified Go code. Modified functions use only fmt.Sprintf/Errorf, no direct stdout. No process-level entry points compromised. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | Custom check applies only to Ginkgo e2e tests (It/Describe/Context). This PR adds Robot Framework tests and Go unit tests (testify), not Ginkgo e2e tests. Check is not applicable. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.2)

level=warning msg="The linter 'gomodguard' is deprecated (since v2.12.0) due to: new major version. Replaced by gomodguard_v2."
level=warning msg="Suggested new configuration:\nlinters:\n enable:\n - gomodguard_v2\n"
level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/apparentlymart/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/coreos/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/google/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/miekg/[email protected]: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/openshift/[email protected]: is

... [truncated 31032 characters] ...

elet: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/metrics: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/mount-utils: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/pod-security-admission: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-apiserver: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-cli-plugin: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-controller: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n"


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/suites/c2cc/dns.robot`:
- Around line 71-77: Make namespace creation idempotent: change the "Oc On
Cluster    ${alias}    oc create namespace ${NAMESPACE}" call in the "Deploy DNS
Test Workloads" block (and the similar calls at lines 88-93) so it doesn't fail
if the namespace already exists or is terminating — e.g., check for existence
before creating (using the same "Oc On Cluster" helper to run an "oc get
namespace ${NAMESPACE}" and only run create if absent) or replace create with an
idempotent operation; also ensure teardown ignores or handles delete errors
consistently.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 273f8e99-0d08-4cb7-bb27-08129a56d98f

📥 Commits

Reviewing files that changed from the base of the PR and between 987ae4d and 1a5258b.

📒 Files selected for processing (10)
  • assets/components/openshift-dns/dns/configmap.yaml
  • pkg/components/controllers.go
  • pkg/config/c2cc.go
  • pkg/config/c2cc_test.go
  • pkg/controllers/c2cc/helpers_test.go
  • test/assets/c2cc/hello-microshift.yaml
  • test/resources/c2cc.resource
  • test/scenarios-bootc/el9/presubmits/[email protected]
  • test/suites/c2cc/connectivity.robot
  • test/suites/c2cc/dns.robot

Comment thread test/suites/c2cc/dns.robot Outdated

pmtk commented May 8, 2026

/test verify

Avoid namespace collisions on reruns by generating a random namespace
per cluster instead of using a hardcoded name. Also flatten nested
validation logic in c2cc.go to satisfy the nestif linter.

Co-Authored-By: Claude Opus 4.6 <[email protected]>

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
pkg/config/c2cc.go (1)

360-367: ⚡ Quick win

Skip domain blocks when DNSIP is empty.

If a ResolvedRemoteCluster has Domain but an empty DNSIP, the generated forward directive is invalid and can break Corefile rendering. Add a defensive guard here.

Suggested patch
 func RenderC2CCDNSBlocks(resolved []ResolvedRemoteCluster) string {
 	var blocks []string
 	for _, rc := range resolved {
-		if rc.Domain == "" {
+		if rc.Domain == "" || rc.DNSIP == "" {
 			continue
 		}
 		blocks = append(blocks, formatDNSBlock(rc.Domain, rc.DNSIP))
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/config/c2cc.go` around lines 360 - 367, RenderC2CCDNSBlocks currently
appends DNS blocks for every ResolvedRemoteCluster with a Domain, but if
rc.DNSIP is empty the resulting forward directive is invalid; update
RenderC2CCDNSBlocks to skip entries where rc.DNSIP == "" (i.e., treat both
rc.Domain and rc.DNSIP as required) before calling formatDNSBlock, so only
clusters with non-empty DNSIP produce blocks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 36664956-24a7-47cc-ae45-f4a7e5e341dc

📥 Commits

Reviewing files that changed from the base of the PR and between 1a5258b and e4d674b.

📒 Files selected for processing (4)
  • pkg/config/c2cc.go
  • test/resources/c2cc.resource
  • test/suites/c2cc/connectivity.robot
  • test/suites/c2cc/dns.robot
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/resources/c2cc.resource
  • test/suites/c2cc/connectivity.robot

pmtk commented May 8, 2026

/test verify

@pmtk pmtk marked this pull request as ready for review May 8, 2026 14:44
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 8, 2026
@openshift-ci openshift-ci Bot requested review from eslutsky and kasturinarra May 8, 2026 14:45

pmtk commented May 18, 2026

/retest

pmtk commented May 18, 2026

/verified by @pmtk

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label May 18, 2026

@pmtk: This PR has been marked as verified by @pmtk.

Details

In response to this:

/verified by @pmtk

Comment thread test/suites/c2cc/04-dns.robot
Comment thread test/suites/c2cc/dns.robot Outdated
Comment on lines +71 to +95
Deploy DNS Test Workloads
    [Documentation]    Create namespace and deploy hello-microshift + curl-pod on both clusters.
    VAR    ${assets}=    ${EXECDIR}/assets/c2cc
    FOR    ${alias}    IN    cluster-a    cluster-b
        ${ns}=    Create Unique Namespace On Cluster    ${alias}
        Set To Dictionary    ${NAMESPACES}    ${alias}    ${ns}
        Oc On Cluster    ${alias}    oc apply -n ${ns} -f ${assets}/hello-microshift.yaml
        Oc On Cluster    ${alias}    oc apply -n ${ns} -f ${assets}/curl-pod.yaml
    END
    Wait For DNS Test Pods

Wait For DNS Test Pods
    [Documentation]    Wait for all test pods to be Ready on both clusters.
    FOR    ${alias}    IN    cluster-a    cluster-b
        Oc On Cluster
        ...    ${alias}
        ...    oc wait pod/hello-microshift pod/curl-pod -n ${NAMESPACES}[${alias}] --for=condition=Ready --timeout=120s
    END

Cleanup DNS Test Workloads
    [Documentation]    Delete test namespace on both clusters. Ignores errors.
    FOR    ${alias}    IN    cluster-a    cluster-b
        Run Keyword And Ignore Error
        ...    Oc On Cluster    ${alias}    oc delete namespace ${NAMESPACES}[${alias}] --timeout=60s
    END

These three keywords are almost the same as the ones in connectivity.robot. I'd suggest moving all of them into c2cc.resource.

The order is not really important, because it's mainly for humans. The tests do not depend on each other, so it's OK.

@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label May 18, 2026

openshift-ci-robot commented May 18, 2026

@pmtk: This pull request references USHIFT-6796 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary by CodeRabbit

  • New Features

    • Cross-cluster DNS server blocks added to CoreDNS for remote service discovery; remote cluster DNS addresses are now derived when available.
    • DNS cache tuning options introduced with sensible defaults.
    • Source IP preservation implemented for cross-cluster traffic.
  • Tests

    • New end-to-end DNS test suite validating CoreDNS server blocks, DNS resolution, and HTTP access across clusters.
    • Connectivity tests enhanced; test workloads updated (CGI hello service, per-cluster namespaces).
  • Documentation

    • Configuration docs and examples updated to include cluster-to-cluster DNS cache settings.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/generate-config/config/config-openapi-spec.json`:
- Around line 192-199: The schema fields cacheNegativeTTL and cacheTTL declare
they must be >= 0 but lack enforcement; update each property's JSON Schema to
include "minimum": 0 (and keep "type": "integer") so the OpenAPI/JSON schema
validates non-negative TTLs (refer to the cacheNegativeTTL and cacheTTL
properties in the diff and add minimum: 0 to each).
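The schema constraint has a Go-side counterpart in the summary: defaults of 10, backfilling of missing TTL pointers, and rejection of negative values. A minimal sketch of that logic follows, assuming both fields are `*int`. The method names `applyDefaults` and `validate` and the error wording are hypothetical and are not taken from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// defaultCacheTTL follows the walkthrough, which says the DNS cache
// defaults are set to 10.
const defaultCacheTTL = 10

// C2CCDNS uses the field names from the walkthrough; the pointer types
// are an ASSUMPTION based on "backfill missing TTL pointers".
type C2CCDNS struct {
	CacheTTL         *int
	CacheNegativeTTL *int
}

// applyDefaults (hypothetical name) fills in any TTL the user left unset.
func (d *C2CCDNS) applyDefaults() {
	if d.CacheTTL == nil {
		v := defaultCacheTTL
		d.CacheTTL = &v
	}
	if d.CacheNegativeTTL == nil {
		v := defaultCacheTTL
		d.CacheNegativeTTL = &v
	}
}

// validate (hypothetical name) rejects negative TTLs, matching the
// "minimum": 0 constraint requested for the schema.
func (d C2CCDNS) validate() error {
	var errs []error
	if d.CacheTTL != nil && *d.CacheTTL < 0 {
		errs = append(errs, fmt.Errorf("cacheTTL must be >= 0, got %d", *d.CacheTTL))
	}
	if d.CacheNegativeTTL != nil && *d.CacheNegativeTTL < 0 {
		errs = append(errs, fmt.Errorf("cacheNegativeTTL must be >= 0, got %d", *d.CacheNegativeTTL))
	}
	// errors.Join returns nil when there are no errors to report.
	return errors.Join(errs...)
}

func main() {
	negative := -5
	dns := C2CCDNS{CacheNegativeTTL: &negative}
	dns.applyDefaults()
	fmt.Println("cacheTTL after defaults:", *dns.CacheTTL)
	fmt.Println("validation error:", dns.validate())
}
```

Pointers let the code tell "unset" apart from an explicit 0, so a user-supplied 0 survives defaulting.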

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 6712f991-d1f5-4d85-8c7b-4769a8f70065

📥 Commits

Reviewing files that changed from the base of the PR and between e4d674b and 9fcb422.

📒 Files selected for processing (7)
  • cmd/generate-config/config/config-openapi-spec.json
  • docs/user/howto_config.md
  • packaging/microshift/config.yaml
  • pkg/components/controllers.go
  • pkg/config/c2cc.go
  • pkg/config/c2cc_test.go
  • pkg/config/config.go

Comment thread cmd/generate-config/config/config-openapi-spec.json

openshift-ci Bot commented May 18, 2026

@pmtk: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-aws-tests-bootc-el9 713be41 link true /test e2e-aws-tests-bootc-el9

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

pmtk commented May 19, 2026

/retest

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.

3 participants