RFC: dmsg over UDP — async store-and-forward semantic + peer overlay

## Summary

Today dmsg runs over **long-lived yamux-multiplexed TCP** between visors and dmsg servers. Every dmsg-addressed message requires both endpoints to be live on the same TCP/yamux session at delivery time — there is no queueing, no store-and-forward, no path to bypass an unreachable dmsg server. This issue proposes exploring a **UDP-based dmsg transport** alongside the existing TCP one.

The interesting unlock isn't UDP-vs-TCP itself — it's the **asynchronous messaging semantic** that UDP makes natural.

## Motivation

### 1. Store-and-forward (the big one)

A UDP-based dmsg session can queue datagrams server-side when the recipient visor is offline, then deliver on reconnect — the way email / push notifications / Matrix already work. Today's TCP/yamux model is more like IRC: both ends must be live on the same session or the message simply doesn't get delivered.

Use cases this unlocks:
- Mobile / intermittent-connectivity visors that aren't online 24/7
- Asynchronous APIs (skychat group fanout, deferred TPD publish, reward / metrics submission with retry-on-reconnect)
- Push-style notifications without a separate push subsystem

### 2. Peer dmsg overlay (NAT traversal)

UDP enables hole-punching, so visors could relay dmsg for each other in addition to the dedicated dmsg servers. That removes the dmsg-server-as-SPOF without adding TCP-NAT complexity. dmsg becomes a true peer overlay where any reachable visor can serve as a stepping stone.

### 3. Avoiding yamux head-of-line blocking

All yamux streams share one TCP connection, so a lost segment or a slow consumer on one stream can stall the rest of the session today. A datagram-style dmsg has natural per-message independence (or QUIC-style independent substreams), so one slow sink doesn't drag down the rest.

### 4. RST-injection / firewall-TCP-RST resistance

Some censoring middleboxes inject TCP RST. A UDP-based dmsg sidesteps that class of attack, similar to how QUIC was motivated for the open web.

## Sketch

Two layering options:

**(A) UDP with KCP/app-layer retransmit, mirroring SUDPH**
- Reuse the existing KCP-on-UDP code that SUDPH transports already use
- Maps well onto current dmsg framing
- Watch for the failure modes [PR #3003](https://github.com/skycoin/skywire/pull/3003) just fixed (no liveness detection over KCP-on-UDP → silent dead conns post-AR-restart)

**(B) QUIC-based dmsg**
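- Built-in encryption, multiplexed substreams, congestion control done well, RST-immune. We'd want noise rather than TLS, but standard QUIC mandates a TLS 1.3 handshake, so noise would likely have to run on top of it rather than replace it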
- Heavier dep (`quic-go`), but the protocol is mature and the API is reasonable
- Stream-oriented surface so the existing yamux-based dmsg client doesn't need rewriting — just the transport swaps

In both cases the visor↔dmsg-server protocol gains:
- Per-message acks (so store-and-forward can know what was delivered)
- A retrieval API for the recipient to pull queued messages on reconnect
- Optional TTL on queued messages

## Tradeoffs

- **Complexity**: The KCP route reinvents pieces of TCP imperfectly; the QUIC route pulls in a non-trivial dep. The bug fixed in PR #3003 is fresh evidence that UDP-on-skywire is real engineering, not a free transport swap.
- **dmsg's current design is server-relay-centric** — UDP gains less for relay-only than it would for true p2p. If we don't pursue the peer-overlay angle (motivation #2), TCP's reliability is hard to beat for server-relay.
- **Existing SUDPH already provides UDP-based skywire connectivity** for direct p2p. The question is whether dmsg's *control-plane / messaging* layer should also have a UDP option, or whether SUDPH's data-plane role is enough.
- **Backwards compatibility**: dmsg-servers and dmsg-clients would need to negotiate transport. The cleanest path is probably a per-server capability flag, with clients preferring the UDP transport (KCP or QUIC, whichever is chosen) when both ends support it and falling back to TCP otherwise.
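
The capability-flag negotiation could look roughly like the sketch below. It assumes the server advertises a list of supported transports and the client holds an ordered preference list; the `Transport` type, the constant names and `Choose` are all hypothetical, since dmsg has no such negotiation today.

```go
package main

// Transport identifies a dmsg transport. Hypothetical names for illustration.
type Transport string

const (
	TransportTCP  Transport = "tcp-yamux" // the existing transport
	TransportKCP  Transport = "udp-kcp"   // option (A)
	TransportQUIC Transport = "udp-quic"  // option (B)
)

// Choose returns the first transport in the client's preference order that
// the server also advertises. With no overlap it falls back to TCP, which
// every existing server is assumed to speak.
func Choose(clientPrefs, serverCaps []Transport) Transport {
	supported := make(map[Transport]bool, len(serverCaps))
	for _, t := range serverCaps {
		supported[t] = true
	}
	for _, t := range clientPrefs {
		if supported[t] {
			return t
		}
	}
	return TransportTCP
}
```

An old server that advertises nothing yields an empty capability list, so an upgraded client would still pick TCP against it.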

## Open questions

1. Is store-and-forward (motivation #1) interesting enough on its own to justify the work, even without the peer-overlay angle?
2. Should this be a new dmsg version (dmsg/v2) or a parallel transport that existing dmsg can fall back through?
3. KCP-on-UDP (reusing SUDPH machinery) vs QUIC — anyone strongly prefer one?
4. Who else has thought about this — pointers to prior discussion / related work appreciated.

cc folks running visors in intermittent-connectivity environments — your use cases would shape this most.
