Skip to content

dataplane: split out derp and stun traffic#237

Open
npry wants to merge 3 commits into
mainfrom
npry/dataplane.split_traffic
Open

dataplane: split out derp and stun traffic#237
npry wants to merge 3 commits into
mainfrom
npry/dataplane.split_traffic

Conversation

@npry

@npry npry commented Jun 11, 2026

Copy link
Copy Markdown
Collaborator

stacked on #214

Port packet identification logic from Go (wireguard vs disco vs stun), begin partitioning incoming packet stream by type. This isn't used in runtime yet — that integration depends on a bunch of other changes I made here so I'm holding off on PRing that until this round of changes merges.

This also eliminates the requirement for underlay transports to supply a peer id, as this is in general unnecessary since packets must cryptographically authenticate themselves anyway, and it will simplify implementation for the UDP transport. Peer ids are still required to route to a peer.

Base automatically changed from npry/netmon to main June 13, 2026 10:30
@npry npry force-pushed the npry/dataplane.split_traffic branch 7 times, most recently from 09f9b3f to 68737f4 Compare June 18, 2026 18:06
@npry npry force-pushed the npry/dataplane.split_traffic branch from 68737f4 to 595ca80 Compare June 22, 2026 17:27

@danderson danderson left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple minor thoughts, but looks reasonable to me. The diff weirdness is worrying me a little, but it might just be github having a moment.

pub type DataplaneFromUnderlay = mpsc::UnboundedReceiver<(PeerId, Vec<PacketMut>)>;

// TODO: wire in overlay/underlay transport traits
// NOTE(npry): this used to have unique types for each queue, but the names got confusing due to

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I'll have to see how this feels in code. The naming problem I'm trying to avoid coming from the Go codebase is that we mixed up our directions a ton based on what made local sense, but that made it hard to globally understand the flow of traffic.

I agree with your problem statement that it's very confusing to have to wrangle both general dataflow and also which end of a channel you're holding. I was going to write here that I find something like Rx<FromOverlay> confusing to understand the direction of flow, but as I was writing it, I think the interpretation is unambiguous and conveys both pieces of information we want. Nice!

overlay_up: DataplaneToOverlay,
// These are the senders handed out in new_*_transport, just held here so we can clone them, we
// never send to them.
underlay_down: Tx<FromUnderlay>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be Tx<ToUnderlay>? Previously underlay_down was described as dataplane -> underlay, so I would expect it to be a sending channel, with packet direction being ToUnderlay.

@npry npry Jun 30, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See the comment, these are inverted from what you'd want inside the dataplane because they're held for cloning for callers that need the opposite polarity

// These are the senders handed out in new_*_transport, just held here so we can clone them, we
// never send to them.
underlay_down: Tx<FromUnderlay>,
overlay_up: Tx<FromOverlay>,

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, if we're sending from dataplane to overlay, shouldn't this be Tx<ToOverlay> ?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto

DataplaneFromUnderlay,
DataplaneToUnderlay,
) {
) -> (UnderlayTransportId, Rx<ToUnderlay>, Tx<FromUnderlay>) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think ToUnderlay and FromUnderlay are backwards here?

@npry npry Jun 30, 2026

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so, an underlay is asking the dataplane to hand it a channel to receive messages from the dataplane (to the underlay) and to send messages to the dataplane (from the underlay)

pub async fn new_overlay_transport(
&self,
) -> (OverlayTransportId, DataplaneToOverlay, DataplaneFromOverlay) {
) -> (OverlayTransportId, Tx<FromOverlay>, Rx<ToOverlay>) {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From/To backwards?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't think so, this is returning channels to hand to an overlay transport 1. to send messages to the dataplane from the overlay, and 2. to receive messages from the dataplane to the overlay

Comment thread ts_dataplane/src/packet_ident.rs
Comment thread ts_dataplane/src/packet_ident.rs Outdated
}

let ethertype = pkt.get_u16();
let _geneve_rest = pkt.split_off(..4);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this assumes we never transmit any geneve options, which would slot in at 4.. in front of the inner packet.

I don't think we currently use any geneve options, but to make this split_off correct in that case we probably want to enforce a zero-length option size field, as well as the critical options bit being clear. That would make GENEVE_MASK 0xff7f I believe, only allowing the control bit to vary.

This also makes geneve packets strictly distinct from plain wireguard packets, due to enforcing pkt[0]==0 which is an invalid wireguard packet type.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we accept the control bit being set / does that get used? I assume so if you're suggesting leaving it to vary, but where this would be immediately used would expect this to be a data packet, no?

Comment thread ts_dataplane/src/async_tokio.rs

@dylan-tailscale dylan-tailscale left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall lgtm, two very minor fixes needed.

The new names for the transport "queues" are way better imo, but it still took me a minute to validate that, e.g., a Tx<ToUnderlay> was what I thought it was. I think we can bridge the gap in the ARCHITECTURE.md with just a description/diagram of what's going on between the dataplane/overlay transports/underlay transports and the various combinations of Tx<T>/Rx<T> and {To, From}{Overlay, Underlay} (obv future work)...wdyt?

Comment thread ts_dataplane/src/async_tokio.rs Outdated
Comment thread ts_dataplane/src/packet_ident.rs Outdated
@npry npry force-pushed the npry/dataplane.split_traffic branch 5 times, most recently from 9215de7 to d83b705 Compare June 30, 2026 23:19
@npry npry requested a review from danderson June 30, 2026 23:19
@npry npry force-pushed the npry/dataplane.split_traffic branch 3 times, most recently from 06fa13b to 2ece092 Compare July 1, 2026 00:50
npry added 3 commits June 30, 2026 21:09
Currently, receipt of a disco or stun packet causes the dataplane to
log errors because it doesn't understand the traffic and tries to treat
it as WireGuard. This provides comparable logic to
`wgengine/magicsock.go`'s packet identification. It is plumbed into the
dataplane in a future commit.

Signed-off-by: Nathan Perry <[email protected]>
Change-Id: I87e84518ed0adbe4ba35358cab0a3dc66a6a6964
The peer id isn't actually known a priori from the underlay, even if
some transports can give an indication of what they think it is (such as
derp reporting the nodekey). Ultimately the only way the peer identity
can be trusted is if it's cryptographically authenticated by data within
the packet, which WireGuard derives from the ongoing session (identified
by the session id field), and disco gets from the disco pubkey field in
its header. STUN packets are unverified but don't require any association
to the peer's id.

Removing this requirement will simplify the architecture for the UDP
direct transport in future commits.

Signed-off-by: Nathan Perry <[email protected]>
Change-Id: Ib3fe81d7f41fbd6e3b637f282b51f4f96a6a6964
Signed-off-by: Nathan Perry <[email protected]>
Change-Id: Ibf7d20ff1a3ab10e1277e78179285eb46a6a6964
@npry npry force-pushed the npry/dataplane.split_traffic branch from 2ece092 to ae9b0d7 Compare July 1, 2026 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants