Unified app framework refactor (post-release): collapse skynet srv/client, resolver, hypervisor, dmsgpty into one service contract #2775

@0pcom

Status

Deferred until after the next release. This is a significant refactor; capturing the design discussion + decisions here so the work can resume cleanly post-release. Filed as an outcome of an in-session design exchange between operator + Alpha (claude agent) on 2026-05-23.

Problem

Today the visor has multiple parallel namespaces for things that are conceptually the same shape — a server / client / handler with identity, lifecycle, config, ports, and run-modes:

  • Launcher apps (cli visor app …): skysocks, vpn-client, skychat, skynet-web, skysocks-client-XXXX
  • Skynet (cli skynet …): per-CLI-invocation port forwarders + named srvs (skynet-3435, etc.)
  • Visor proxies (cli visor proxies …): .dmsg / .skynet resolving proxies
  • Visor built-ins: hypervisor UI, log server, dmsgctrl, dmsgpty

Each has its own config schema, its own CLI namespace, its own RPC surface, its own start/stop semantics. The fragmentation has cost:

  • No unified visibility: cli visor app ls shows nothing for skynet srvs, resolvers, or built-ins
  • No per-app route view — RouteGroups have an app_name tag set only by launcher apps; everything else dials without identity, leaving the operator with no way to ask "what routes is this thing using" or "who's connected to my skynet srv"
  • Routing controls are flag-stew per dial: --routes N --min-hops K --via PK is repeated on every invocation; no persistent per-app policy
  • Standalone-mode unevenness — some apps support detached operation (skychat tcp-direct, dmsgpty ssh-equivalent), others don't, and the contract isn't uniform

Target architecture

Modes taxonomy

Each app declares which of three run-modes it supports:

| Mode | Identity | Overlay | Lifecycle owner | When |
|---|---|---|---|---|
| visor-app | shares visor | visor's DMSG + router | visor (in-process) | default |
| dmsg-standalone | own keys, own DMSG client | own DMSG client, no router | user / external supervisor | detached operation |
| TCP-standalone | own keys or visor keys, no DMSG | direct pk-encrypted TCP, no overlay | user (visor never starts this; one exception: hypervisor) | LAN-style, port-forwarding deployments |

Rules:

  • Anything that can run as visor-app or over skynet can also run dmsg-standalone (skynet rides DMSG, so the capability flows down).
  • Visor never starts anything in TCP-standalone mode except the hypervisor UI, which is HTTP-only and has no overlay variant.
  • A server can dual-serve dmsg + skynet, but cannot add TCP-standalone to that mix; the client picks the connect path.
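
The taxonomy and the visor-start rule can be sketched as a small bit set plus one predicate. This is only an illustration under stated assumptions: Mode, AppModes, Supports, and VisorMayStart are hypothetical names that do not exist in the codebase, and the example entries are taken from the per-app matrix below.

```go
package main

import "fmt"

// Mode is a hypothetical bit set of the run-modes an app supports.
type Mode uint8

const (
	VisorApp Mode = 1 << iota
	DmsgStandalone
	TCPStandalone
)

// AppModes is a hypothetical per-app declaration of supported modes.
type AppModes struct {
	Name      string
	Supported Mode
}

// Supports reports whether the app declares the given mode.
func (a AppModes) Supports(m Mode) bool { return a.Supported&m != 0 }

// VisorMayStart encodes the rule that the visor starts apps in visor-app
// mode only, with the hypervisor UI as the single TCP-standalone exception.
func VisorMayStart(a AppModes, m Mode) bool {
	if !a.Supports(m) {
		return false
	}
	switch m {
	case VisorApp:
		return true
	case TCPStandalone:
		return a.Name == "hypervisor"
	default:
		// dmsg-standalone is owned by the user or an external supervisor.
		return false
	}
}

func main() {
	skychat := AppModes{Name: "skychat", Supported: VisorApp | DmsgStandalone | TCPStandalone}
	hypervisor := AppModes{Name: "hypervisor", Supported: VisorApp | TCPStandalone}
	resolver := AppModes{Name: "resolver", Supported: VisorApp | DmsgStandalone}

	fmt.Println("visor starts skychat over TCP:", VisorMayStart(skychat, TCPStandalone))
	fmt.Println("visor starts hypervisor over TCP:", VisorMayStart(hypervisor, TCPStandalone))
	fmt.Println("resolver supports TCP-standalone:", resolver.Supports(TCPStandalone))
}
```

Keeping the exception inside one predicate means the hypervisor special case lives in a single place rather than being scattered across start paths.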

Per-app mode matrix

| App | visor-app | dmsg-standalone | TCP-standalone |
|---|---|---|---|
| skychat | yes | yes | yes |
| skysocks / -client | yes | yes | yes |
| vpn-server / -client | yes | yes | yes |
| skynet-srv | yes | yes | yes |
| skynet-client (persistent) | yes | yes | yes |
| pty | yes | yes | yes |
| hypervisor | yes (= TCP) | no | yes (visor-starts exception) |
| resolver | yes | yes | no (only exception) |

dmsgctrl is not an app — it's a visor-internal control protocol (DMSG-layer ping/pong on port 7).

Naming

Keep "app" as the universal user-facing term. "Services" is taken by deployment services (SD/TPD/AR/...). Internally, code may distinguish "app definition" from "app instance" where precision is needed.

CLI surface

The visor-side RPC + CLI is untouched — skywire cli proxy start/stop/status, skywire cli visor app <name>, skywire cli visor proxies all keep working. The visor-app side already has the right shape.

What changes:

  • skywire app <name> becomes canonical for all apps regardless of which run-mode they're in.
  • The dmsg-standalone implementations under cmd/dmsg/ stay invocable but are hidden from the main binary's skywire dmsg subcommand. They remain callable via go run cmd/dmsg/dmsg.go pty … (the dedicated standalone binary).
  • New cmd/standalone/ directory holds TCP-standalone-only implementations of the apps. Not imported into the main command structure; only callable via its own binary.

Net visible structure:

skywire app <name> {start|stop|status|...}          ← canonical for all apps
skywire cli proxy/visor/visor app/…                 ← visor RPC clients (unchanged)
skywire dmsg cat/curl/iperf/probe/…                 ← tools, not apps; app-flavored ones hidden

cmd/dmsg/dmsg.go {pty|socks|skychat|…}              ← dmsg-standalone (own binary)
cmd/standalone/standalone.go {pty|skychat|…}        ← TCP-standalone (own binary, new)

Startup ordering (the keystone open question)

Today's init_*.go callbacks are hand-ordered with implicit dependencies. Adding more app types with inter-dependencies (resolver needs router + discovery; hypervisor needs RPC + DMSG; log server needs DMSG + logging; pty needs DMSG; …) without an explicit dependency model makes this a tangle.

Two viable shapes:

(A) Two-phase with declared dependencies:

  1. Phase 1 (visor core, always-on, cannot be apps): logging, config, identity, DMSG client, transport manager, router, RPC, dmsgctrl. Hand-ordered as today.
  2. Phase 2 (apps via launcher): everything that's an app. Each app declares Depends() []string; the launcher topologically sorts the apps and starts them in waves.

(C) Phase-banded with explicit named phases (core / overlay / app / management): stratified version of (A) with more bands. Handles "log server before resolver" without per-pair edges.

Prefer (A) for the migration with the option to evolve toward (C) if Depends() graphs get gnarly.
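
A minimal sketch of the Phase 2 wave computation in shape (A), assuming each app's Depends() result has already been collected into a map. The function name Waves and the dependency edges in main are hypothetical; only the "log server before resolver" ordering is mentioned in this issue.

```go
package main

import (
	"fmt"
	"sort"
)

// Waves groups apps into start waves using Kahn's algorithm: every app in
// wave i depends only on apps from earlier waves. deps maps an app name to
// the names returned by its (hypothetical) Depends() method.
func Waves(deps map[string][]string) ([][]string, error) {
	unmet := make(map[string]int, len(deps))  // unmet dependency count per app
	dependents := make(map[string][]string)    // reverse edges
	for app, ds := range deps {
		unmet[app] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("app %q depends on unknown app %q", app, d)
			}
			dependents[d] = append(dependents[d], app)
		}
	}

	var ready []string
	for app, n := range unmet {
		if n == 0 {
			ready = append(ready, app)
		}
	}

	var waves [][]string
	started := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order within a wave
		waves = append(waves, ready)
		started += len(ready)

		var next []string
		for _, app := range ready {
			for _, dep := range dependents[app] {
				unmet[dep]--
				if unmet[dep] == 0 {
					next = append(next, dep)
				}
			}
		}
		ready = next
	}
	if started != len(deps) {
		return nil, fmt.Errorf("dependency cycle among apps")
	}
	return waves, nil
}

func main() {
	// Hypothetical app-to-app edges, for illustration only.
	waves, err := Waves(map[string][]string{
		"log-server": nil,
		"pty":        nil,
		"resolver":   {"log-server"},
		"skynet-srv": {"resolver"},
	})
	if err != nil {
		panic(err)
	}
	for i, w := range waves {
		fmt.Printf("wave %d: %v\n", i, w)
	}
}
```

Apps within one wave have no edges between them, so a launcher could start them concurrently. If the graphs become hard to maintain, named phases as in (C) could be layered on by treating each band as an implicit dependency.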

Per-app failure tolerance must be preserved: today, if vpn-client fails to start, the visor still comes up. Phase 2 needs the same semantics, probably modeled as Wants: vs Requires: (systemd-style).
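
One way to read the Wants / Requires split, as a sketch only: a failed Requires dependency skips the dependent app, a failed Wants dependency is tolerated, and no failure ever aborts the loop, so the caller still comes up. Deps, StartAll, failing, and the app name demo-tunnel are hypothetical, as are the edges used in main.

```go
package main

import (
	"errors"
	"fmt"
)

// Deps is a hypothetical per-app dependency declaration.
type Deps struct {
	Requires []string // hard: the app is skipped if any of these is not running
	Wants    []string // soft: ordering hint only; a failed want is tolerated
}

var errStart = errors.New("start failed") // demo error

// failing returns a start function that fails for the named apps (demo helper).
func failing(names ...string) func(string) error {
	bad := make(map[string]bool, len(names))
	for _, n := range names {
		bad[n] = true
	}
	return func(name string) error {
		if bad[name] {
			return errStart
		}
		return nil
	}
}

// StartAll starts apps in the given (already sorted) order and returns the
// outcome per app. A nil entry means the app is running.
func StartAll(order []string, deps map[string]Deps, start func(string) error) map[string]error {
	results := make(map[string]error, len(order))
	for _, name := range order {
		var blocked error
		for _, req := range deps[name].Requires {
			if err, seen := results[req]; !seen || err != nil {
				blocked = fmt.Errorf("required app %q is not running", req)
				break
			}
		}
		if blocked != nil {
			results[name] = blocked
			continue // skip this app, keep starting the rest
		}
		results[name] = start(name)
	}
	return results
}

func main() {
	// Hypothetical edges, for illustration only.
	deps := map[string]Deps{
		"skysocks-client": {Wants: []string{"vpn-client"}},
		"demo-tunnel":     {Requires: []string{"vpn-client"}},
	}
	order := []string{"vpn-client", "skysocks-client", "demo-tunnel"}
	results := StartAll(order, deps, failing("vpn-client"))
	for _, name := range order {
		fmt.Printf("%s: %v\n", name, results[name])
	}
}
```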

Disable-safety guardrail (deferred design)

PTY, hypervisor, and RPC share the property that disabling them can lock out remote management. They need a uniform critical: true flag on the appdef that gates stop behind --i-know-what-im-doing or similar. The exact shape of the pattern is deferred; locking it in is a small follow-up after the main framework lands.
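
Since the shape of the guardrail is still deferred, the following is only one possible sketch: stop is refused for a critical app unless the caller passes an explicit override, which a CLI could map to a flag such as --i-know-what-im-doing. AppDef, Stop, and ErrCriticalStop are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCriticalStop is returned when a critical app is stopped without override.
var ErrCriticalStop = errors.New("refusing to stop critical app without explicit override")

// AppDef is a hypothetical, trimmed-down app definition.
type AppDef struct {
	Name     string
	Critical bool // stopping this app can lock out remote management
}

// Stop runs the stop function unless the app is critical and the caller
// did not pass the explicit override.
func Stop(app AppDef, override bool, stop func() error) error {
	if app.Critical && !override {
		return fmt.Errorf("%s: %w", app.Name, ErrCriticalStop)
	}
	return stop()
}

func main() {
	pty := AppDef{Name: "pty", Critical: true}
	stopPty := func() error {
		fmt.Println("pty stopped")
		return nil
	}

	// Without the override the stop function is never called.
	if err := Stop(pty, false, stopPty); errors.Is(err, ErrCriticalStop) {
		fmt.Println("blocked:", err)
	}

	// With the override the stop goes through.
	if err := Stop(pty, true, stopPty); err != nil {
		fmt.Println("unexpected:", err)
	}
}
```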

Handler registry status

pkg/visor/service_registry.go already exists as scaffold (map[uint16]ConnHandler with Register / RegisterHidden / Get / List). Per the original architecture (in project_resolver_proxy_architecture.md-era discussions), each DMSG-port service should register its inbound-connection handler there, and the sky-forwarding server should dispatch via registry.Get(port)(conn) instead of net.Dial("localhost:PORT").

Status: the struct exists and is constructed at visor startup, but the per-service migration has not been done yet. dmsgctrl / dmsgpty / log server still use the old dmsgC.Listen(port) direct pattern. Finishing the handler-registry wiring is the natural zeroth step of this refactor.
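
For orientation, a registry of the shape described above might look roughly like this. It is a standalone sketch, not the contents of pkg/visor/service_registry.go: the method signatures, the duplicate-port check, and pongHandler are assumptions. Only the method names, the map[uint16]ConnHandler idea, and port 7 for dmsgctrl come from this issue.

```go
package main

import (
	"fmt"
	"net"
	"sort"
	"sync"
)

// ConnHandler handles one inbound connection for a registered port.
type ConnHandler func(conn net.Conn)

type entry struct {
	label   string
	handler ConnHandler
	hidden  bool
}

// Registry maps ports to inbound-connection handlers (assumed shape).
type Registry struct {
	mu      sync.RWMutex
	entries map[uint16]entry
}

func NewRegistry() *Registry { return &Registry{entries: make(map[uint16]entry)} }

func (r *Registry) register(port uint16, label string, h ConnHandler, hidden bool) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if prev, ok := r.entries[port]; ok {
		return fmt.Errorf("port %d already registered by %q", port, prev.label)
	}
	r.entries[port] = entry{label: label, handler: h, hidden: hidden}
	return nil
}

// Register adds a handler that shows up in List.
func (r *Registry) Register(port uint16, label string, h ConnHandler) error {
	return r.register(port, label, h, false)
}

// RegisterHidden adds a handler that is dispatchable but omitted from List.
func (r *Registry) RegisterHidden(port uint16, label string, h ConnHandler) error {
	return r.register(port, label, h, true)
}

// Get returns the handler for a port, or nil if none is registered.
func (r *Registry) Get(port uint16) ConnHandler {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return r.entries[port].handler
}

// List returns the visible registered ports in ascending order.
func (r *Registry) List() []uint16 {
	r.mu.RLock()
	defer r.mu.RUnlock()
	ports := make([]uint16, 0, len(r.entries))
	for p, e := range r.entries {
		if !e.hidden {
			ports = append(ports, p)
		}
	}
	sort.Slice(ports, func(i, j int) bool { return ports[i] < ports[j] })
	return ports
}

// pongHandler is a stand-in handler used only for this demo.
func pongHandler(c net.Conn) {
	defer c.Close()
	_, _ = c.Write([]byte("pong"))
}

func main() {
	reg := NewRegistry()
	if err := reg.Register(7, "dmsgctrl", pongHandler); err != nil {
		panic(err)
	}
	// Dispatch in-process through the registry instead of dialing localhost.
	client, server := net.Pipe()
	go reg.Get(7)(server)
	buf := make([]byte, 4)
	n, _ := client.Read(buf)
	fmt.Println(string(buf[:n]), reg.List())
}
```

In real dispatch code the result of Get would need a nil check before the call, since an unregistered port yields a nil handler in this sketch.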

Recommended sequencing

When this resumes post-release:

  1. Finish the handler-registry wiring. Migrate dmsgctrl, dmsgpty, log server, route setup, debug endpoint into v.services.Register(port, label, handler). No new functionality; just consolidates the visor's port→handler dispatch into one place. Pre-req for everything below.
  2. Decide A vs C for startup ordering, codify Phase 1. Map current init_*.go modules to Phase 1 (always-on visor core). Anything not Phase 1 becomes an app.
  3. Define the app contract (appdef interface): identity, supported modes, Depends(), Critical(), Start/Stop/Status. Migrate launcher's internal app handling to use it.
  4. Migrate one subsystem at a time (smallest-surface-first): log server → skynet srv → persistent skynet client → resolver → pty → hypervisor.
  5. Add disable-safety guardrail (the critical: true gate) before migrating PTY / hypervisor.
  6. Once the routing surface is uniform across apps, revisit per-app routing policy / visibility — the unified cli visor app routes <name> view (showing initiator + responder RGs with mux legs per app) becomes natural.
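
The app contract from step 3 could start out as an interface along these lines. This is a sketch under assumptions: the method set follows the list in step 3, but every signature, the Mode and Status types, and demoApp are hypothetical, and identity is reduced to a name (key material is omitted).

```go
package main

import "fmt"

// Mode is a hypothetical bit set of supported run-modes.
type Mode uint8

const (
	VisorApp Mode = 1 << iota
	DmsgStandalone
	TCPStandalone
)

// Status is a hypothetical lifecycle state.
type Status string

const (
	StatusStopped Status = "stopped"
	StatusRunning Status = "running"
)

// AppDef is one possible shape of the app contract.
type AppDef interface {
	Name() string      // identity (keys omitted in this sketch)
	Modes() Mode       // supported run-modes
	Depends() []string // apps that must be started first
	Critical() bool    // true if stopping can lock out remote management
	Start() error
	Stop() error
	Status() Status
}

// demoApp is a toy implementation used only to exercise the interface.
type demoApp struct {
	name     string
	modes    Mode
	depends  []string
	critical bool
	running  bool
}

var _ AppDef = (*demoApp)(nil) // compile-time interface check

func (a *demoApp) Name() string      { return a.name }
func (a *demoApp) Modes() Mode       { return a.modes }
func (a *demoApp) Depends() []string { return a.depends }
func (a *demoApp) Critical() bool    { return a.critical }

func (a *demoApp) Start() error {
	if a.running {
		return fmt.Errorf("%s: already running", a.name)
	}
	a.running = true
	return nil
}

func (a *demoApp) Stop() error {
	if !a.running {
		return fmt.Errorf("%s: not running", a.name)
	}
	a.running = false
	return nil
}

func (a *demoApp) Status() Status {
	if a.running {
		return StatusRunning
	}
	return StatusStopped
}

func main() {
	var app AppDef = &demoApp{
		name:     "pty",
		modes:    VisorApp | DmsgStandalone | TCPStandalone,
		critical: true,
	}
	fmt.Println(app.Name(), app.Status())
	if err := app.Start(); err != nil {
		panic(err)
	}
	fmt.Println(app.Name(), app.Status(), "critical:", app.Critical())
}
```

A single interface like this is what would let one listing command cover launcher apps, skynet srvs, resolvers, and built-ins alike, which is the visibility gap described in the Problem section.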

Decisions made in the design discussion

  • Term: "app" stays as the universal name. "Service" reserved for deployment services.
  • dmsgctrl is NOT an app — visor-internal control protocol only.
  • Hypervisor UI IS an app (lean) — it's the one visor-starts-TCP exception.
  • Resolver IS an app for management/visibility uniformity, even though standalone-TCP isn't applicable to it.
  • Log server moves to Phase 2 as an autostart-by-default app (initial inclination was Phase 1; operator flipped to "make it configurable as an app").
  • PTY moves to Phase 2 as an app, with the disable-safety guardrail before the migration lands.
  • TCP-standalone is exclusively user-launched (visor doesn't start TCP) — except hypervisor UI.
  • dmsg-standalone implementations stay in code but are hidden from main skywire dmsg subcommand; remain callable via dedicated cmd/dmsg/dmsg.go binary.
  • cmd/standalone/ is the new home for TCP-standalone-only implementations.

Out of scope / deferred for this issue

  • Per-app routing policy + mux/hops controls (route-mode: auto|direct|private|mux:N:hops:K|via:PKs). Distinct downstream work that benefits from the unified app framework but isn't blocked by it.
  • Source-tag on route-groups (auto/app/operator) for autoconnect-vs-manual coexistence. Same — downstream.
  • cli visor app routes <name> unified view. Builds on this refactor.

Provenance

Filed 2026-05-23 by operator decision after a multi-turn design discussion. Original spark: "we are really lacking … a good interface for multihop / multiplexed routing control … interface to show what routes are actually in use and by what client app … well-defined automatic multihop / multiplexed routing modes … manual routing controls that don't fight with automatic modes." The discussion converged on "most of these gaps share a common root — the app framework is incomplete, so half the visor's work happens outside it." This issue captures the framework-side fix.
