fix: use system DNS resolver in litep2p backend #586

Merged: n13 merged 1 commit into main from fix/litep2p-system-dns on Jun 10, 2026

n13 (Collaborator) commented Jun 10, 2026

hickory-resolver 0.26 (bumped in #581) changed ResolverConfig::default() to an empty config with no nameservers, so all /dns/... bootnode and telemetry dials fail silently with NoConnections (0 peers). Use the system DNS config instead; startup now fails loudly if it can't be read.

n13 (Collaborator, Author) commented Jun 10, 2026

Review

Verified the root cause against both resolver versions in Cargo.lock:

  • hickory-resolver 0.24.4: impl Default for ResolverConfig explicitly returned Google's nameservers (8.8.8.8, 8.8.4.4, …).
  • hickory-resolver 0.26.1: ResolverConfig is #[derive(Default)], so Default::default() has an empty name_servers list.
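
The difference between the two versions can be reproduced in isolation. The sketch below uses two hypothetical stand-in structs (`ExplicitDefaultConfig` and `DerivedDefaultConfig` are illustrative names, not hickory-resolver types) and only two of the nameserver addresses, to show why a hand-written `Default` and a derived one diverge:

```rust
use std::net::{IpAddr, Ipv4Addr};

// Hypothetical stand-in for the 0.24-style config: Default is written by
// hand and ships public nameservers (only two shown here for illustration).
#[derive(Debug)]
struct ExplicitDefaultConfig {
    name_servers: Vec<IpAddr>,
}

impl Default for ExplicitDefaultConfig {
    fn default() -> Self {
        Self {
            name_servers: vec![
                IpAddr::V4(Ipv4Addr::new(8, 8, 8, 8)),
                IpAddr::V4(Ipv4Addr::new(8, 8, 4, 4)),
            ],
        }
    }
}

// Hypothetical stand-in for the 0.26-style config: Default is derived, so
// each field takes its own Default, and Vec::default() is an empty list.
#[derive(Debug, Default)]
struct DerivedDefaultConfig {
    name_servers: Vec<IpAddr>,
}

fn main() {
    let old = ExplicitDefaultConfig::default();
    let new = DerivedDefaultConfig::default();
    println!("explicit Default: {} nameservers", old.name_servers.len());
    println!("derived Default:  {} nameservers", new.name_servers.len());
}
```

Both types compile against the same call site (`Default::default()`), which is why a change like this does not show up as a build error after a version bump.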

The vendored litep2p builds its resolver from Default::default() unless use_system_dns_config is set, so after #581 every /dns/... lookup had zero nameservers to query. Enabling with_system_resolver() here is the right fix.

Correctness checks:

  • Litep2p::new calls hickory_resolver::system_conf::read_system_conf() and propagates failure as Error::CannotReadSystemDnsConfig — startup fails loudly as described.
  • The resolver is built once and shared by both the TCP and WebSocket transports, so both dial paths are covered.
  • hickory's system-config feature is in its default feature set (resolv.conf on Unix, ipconfig on Windows).
  • This is the only production Litep2pConfigBuilder call site; the rest are tests using IP listen addresses.
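
The fail-loud behaviour checked above can be sketched without the real crates. Everything in this block is a hypothetical stand-in: `StartupError`, `read_system_nameservers`, and `build_resolver` are illustrative names, the resolver is reduced to a list of nameserver addresses, and only the `use_system_dns_config` flag and the `CannotReadSystemDnsConfig` error name come from the review. It assumes, for illustration, that an unreadable config is the only failure case:

```rust
use std::net::IpAddr;

// Hypothetical error type; only the variant name mirrors the review.
#[derive(Debug, PartialEq)]
enum StartupError {
    CannotReadSystemDnsConfig,
}

// Hypothetical stand-in for reading the system DNS config. `resolv_conf`
// is the file contents, or None if the file could not be read.
fn read_system_nameservers(resolv_conf: Option<&str>) -> Result<Vec<IpAddr>, StartupError> {
    let contents = resolv_conf.ok_or(StartupError::CannotReadSystemDnsConfig)?;
    let servers = contents
        .lines()
        .filter_map(|line| {
            let mut parts = line.split_whitespace();
            match (parts.next(), parts.next()) {
                (Some("nameserver"), Some(addr)) => addr.parse().ok(),
                _ => None,
            }
        })
        .collect();
    Ok(servers)
}

// Sketch of the two paths: the system path propagates read failures,
// the default path silently yields zero nameservers.
fn build_resolver(
    use_system_dns_config: bool,
    resolv_conf: Option<&str>,
) -> Result<Vec<IpAddr>, StartupError> {
    if use_system_dns_config {
        read_system_nameservers(resolv_conf)
    } else {
        Ok(Vec::new())
    }
}

fn main() {
    // Missing config with the system path enabled: startup error.
    println!("{:?}", build_resolver(true, None));
    // Missing config on the default path: "succeeds" with no nameservers.
    println!("{:?}", build_resolver(false, None));
    // Readable config: nameservers are picked up.
    println!("{:?}", build_resolver(true, Some("nameserver 1.1.1.1\n")));
}
```

The second call is the silent failure mode: it returns `Ok` but leaves nothing to query, so the problem only surfaces later as failed dials.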

Notes:

  1. Residual footgun: the default path in the vendored litep2p still silently builds a resolver with zero nameservers; any future caller that forgets .with_system_resolver() re-inherits this failure. Filed #587 ("litep2p: default DNS resolver config silently has zero nameservers (hickory-resolver 0.26)") to make system DNS config unconditional in the vendored crate.
  2. Telemetry claim in the description is likely inaccurate: telemetry goes through sc-telemetry → libp2p-dns, which pins hickory-resolver 0.24.4 and never touches litep2p's resolver, so telemetry dials are unaffected by this change.
  3. Ops/release-notes note: nodes in minimal containers without /etc/resolv.conf previously worked via the implicit Google DNS fallback (before #581, "Update rusty crystals") but will now fail at startup with CannotReadSystemDnsConfig even when no DNS multiaddrs are configured. This is the intended fail-loud trade-off and is worth a mention in the v0.7.0 release notes.
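
For operators assessing the container case in note 3, a small check of whether a configured address depends on DNS at all may help. This is a minimal sketch with a hypothetical helper (`needs_dns` is not part of the codebase, and `boot.example.org` is a placeholder); it only looks for the standard multiaddr DNS protocol names in the string form of the address:

```rust
// Hypothetical helper: returns true if the multiaddr string contains a
// DNS-based protocol component (dns, dns4, dns6 or dnsaddr).
fn needs_dns(multiaddr: &str) -> bool {
    multiaddr
        .split('/')
        .any(|part| matches!(part, "dns" | "dns4" | "dns6" | "dnsaddr"))
}

fn main() {
    let addrs = [
        "/dns/boot.example.org/tcp/30333",
        "/ip4/127.0.0.1/tcp/30333",
    ];
    for addr in addrs {
        println!("{addr}: needs DNS = {}", needs_dns(addr));
    }
}
```

Per note 3, a node whose addresses all come back `false` would still fail at startup if the system DNS config is unreadable, so this check identifies affected deployments rather than exempting them.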

LGTM otherwise.

n13 (Collaborator, Author) commented Jun 10, 2026

GPT review:
No issues found in PR #586.

The one-line change in client/network/src/litep2p/mod.rs correctly enables with_system_resolver(). In client/litep2p/src/lib.rs, that flag makes startup read the system DNS config and propagate CannotReadSystemDnsConfig on read/build failure, so the PR matches its “fail loudly” intent.

Verified with:

cargo check -p sc-network

Residual risk: I did not run an end-to-end node boot/dial test with /dns/... bootnodes, so runtime DNS behavior is still worth smoke-testing in the target deployment environment. I did not post anything to GitHub.

@n13 n13 merged commit f0275b0 into main Jun 10, 2026
6 checks passed