Unified app framework refactor (post-release): collapse skynet srv/client, resolver, hypervisor, dmsgpty into one service contract #2775

## Status

**Deferred until after the next release.** This is a significant refactor; the design discussion and decisions are captured here so the work can resume cleanly post-release. Filed as the outcome of an in-session design exchange between the operator and Alpha (a Claude agent) on 2026-05-23.

## Problem

Today the visor has multiple parallel namespaces for things that are conceptually the same shape — a server / client / handler with identity, lifecycle, config, ports, and run-modes:

- Launcher apps (`cli visor app …`): skysocks, vpn-client, skychat, skynet-web, skysocks-client-XXXX
- Skynet (`cli skynet …`): per-CLI-invocation port forwarders + named srvs (skynet-3435, etc.)
- Visor proxies (`cli visor proxies …`): .dmsg / .skynet resolving proxies
- Visor built-ins: hypervisor UI, log server, dmsgctrl, dmsgpty

Each has its own config schema, CLI namespace, RPC surface, and start/stop semantics. The fragmentation has costs:

- **No unified visibility** — `cli visor app ls` shows nothing for skynet srvs, resolvers, or built-ins
- **No per-app route view** — RouteGroups have an `app_name` tag set only by launcher apps; everything else dials without identity, leaving the operator with no way to ask "what routes is this thing using" or "who's connected to my skynet srv"
- **Routing controls are flag-stew per dial** — `--routes N --min-hops K --via PK` repeated on every invocation; no persistent per-app policy
- **Standalone-mode unevenness** — some apps support detached operation (skychat tcp-direct, dmsgpty ssh-equivalent), others don't, and the contract isn't uniform

## Target architecture

### Modes taxonomy

Each app declares which of three run-modes it supports:

| Mode | Identity | Overlay | Lifecycle owner | When |
|---|---|---|---|---|
| visor-app | shares visor | visor's DMSG + router | visor (in-process) | default |
| dmsg-standalone | own keys, own DMSG client | own DMSG client, no router | user / external supervisor | detached operation |
| TCP-standalone | own keys or visor keys, no DMSG | direct pk-encrypted TCP, no overlay | user (visor never starts this — one exception: hypervisor) | LAN-style, port-forwarding deployments |

**Rules:**

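- Anything that runs over the overlay, whether as a visor-app or over skynet, can also run dmsg-standalone (skynet rides DMSG, so the capability flows down). The hypervisor UI is outside this rule because it has no overlay variant.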
- Visor never starts anything in TCP-standalone mode **except the hypervisor UI**, which is HTTP-only and has no overlay variant.
- A server can serve dmsg and skynet at the same time, but cannot add TCP-standalone to that mix; the client picks the connect path.

### Per-app mode matrix

| App | visor-app | dmsg-standalone | TCP-standalone |
|---|---|---|---|
| skychat | yes | yes | yes |
| skysocks / -client | yes | yes | yes |
| vpn-server / -client | yes | yes | yes |
| skynet-srv | yes | yes | yes |
| skynet-client (persistent) | yes | yes | yes |
| pty | yes | yes | yes |
| hypervisor | yes (= TCP) | no | yes (visor-starts exception) |
| **resolver** | yes | yes | **no** (the only app without a TCP-standalone mode) |

dmsgctrl is **not** an app — it's a visor-internal control protocol (DMSG-layer ping/pong on port 7).
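As a rough illustration, the matrix and the "visor never starts TCP-standalone except the hypervisor" rule could be encoded as data plus two small checks. This is a sketch only: the `Mode` bit set, the `matrix` map, the map keys, and the functions `Supports` and `VisorMayStart` are all hypothetical names, not existing skywire code.

```go
package main

import "fmt"

// Mode is a bit set of the run-modes an app supports (hypothetical type).
type Mode uint8

const (
	VisorApp Mode = 1 << iota
	DmsgStandalone
	TCPStandalone
)

const allModes = VisorApp | DmsgStandalone | TCPStandalone

// matrix mirrors the per-app mode table. Keys are illustrative labels only.
var matrix = map[string]Mode{
	"skychat":       allModes,
	"skysocks":      allModes,
	"vpn-server":    allModes,
	"skynet-srv":    allModes,
	"skynet-client": allModes,
	"pty":           allModes,
	"hypervisor":    VisorApp | TCPStandalone,
	"resolver":      VisorApp | DmsgStandalone,
}

// Supports reports whether the app declares mode m. Unknown apps support nothing.
func Supports(app string, m Mode) bool { return matrix[app]&m != 0 }

// VisorMayStart encodes the lifecycle rule: the visor starts apps in
// visor-app mode, and TCP-standalone only for the hypervisor.
func VisorMayStart(app string, m Mode) bool {
	if !Supports(app, m) {
		return false
	}
	switch m {
	case VisorApp:
		return true
	case TCPStandalone:
		return app == "hypervisor"
	default:
		return false // dmsg-standalone is owned by the user or a supervisor
	}
}

func main() {
	fmt.Println("resolver supports TCP-standalone:", Supports("resolver", TCPStandalone))
	fmt.Println("visor may start hypervisor over TCP:", VisorMayStart("hypervisor", TCPStandalone))
	fmt.Println("visor may start pty over TCP:", VisorMayStart("pty", TCPStandalone))
}
```

Keeping the matrix as data would let the launcher reject an unsupported mode before attempting a start, rather than failing inside the app.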

### Naming

Keep "app" as the universal user-facing term. "Services" is taken by deployment services (SD/TPD/AR/...). Internally, code may distinguish "app definition" from "app instance" where precision is needed.

### CLI surface

The visor-side RPC + CLI is untouched — `skywire cli proxy start/stop/status`, `skywire cli visor app <name>`, `skywire cli visor proxies` all keep working. The visor-app side already has the right shape.

What changes:

- `skywire app <name>` becomes canonical for all apps regardless of which run-mode they're in.
- The dmsg-standalone implementations under `cmd/dmsg/` stay invocable but are **hidden** from the main binary's `skywire dmsg` subcommand. They remain callable via `go run cmd/dmsg/dmsg.go pty …` (the dedicated standalone binary).
- New `cmd/standalone/` directory holds TCP-standalone-only implementations of the apps. Not imported into the main command structure; only callable via its own binary.

Net visible structure:

```
skywire app <name> {start|stop|status|...}          ← canonical for all apps
skywire cli proxy/visor/visor app/…                 ← visor RPC clients (unchanged)
skywire dmsg cat/curl/iperf/probe/…                 ← tools, not apps; app-flavored ones hidden

cmd/dmsg/dmsg.go {pty|socks|skychat|…}              ← dmsg-standalone (own binary)
cmd/standalone/standalone.go {pty|skychat|…}        ← TCP-standalone (own binary, new)
```

### Startup ordering (the keystone open question)

Today's `init_*.go` callbacks are hand-ordered with implicit dependencies. Adding more app types with inter-dependencies (resolver needs router + discovery; hypervisor needs RPC + DMSG; log server needs DMSG + logging; pty needs DMSG; …) without an explicit dependency model makes this a tangle.

Two viable shapes:

**(A) Two-phase with declared dependencies:**

1. **Phase 1** (visor core, always-on, cannot be apps): logging, config, identity, DMSG client, transport manager, router, RPC, dmsgctrl. Hand-ordered as today.
2. **Phase 2** (apps via launcher): everything that's an app. Each app declares `Depends() []string`; launcher topologically-sorts and starts in waves.

**(C) Phase-banded with explicit named phases** (core / overlay / app / management): stratified version of (A) with more bands. Handles "log server before resolver" without per-pair edges.

Prefer (A) for the migration with the option to evolve toward (C) if `Depends()` graphs get gnarly.
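To make the "topologically-sorts and starts in waves" idea in (A) concrete, here is a minimal sketch using Kahn's algorithm. The function `Waves` and the dependency edges in `main` are invented for illustration; the input map stands in for whatever each app's `Depends()` would return, and only app-to-app edges are modeled (Phase 1 components are assumed to be up already).

```go
package main

import (
	"fmt"
	"sort"
)

// Waves groups apps into start waves: every app in wave i depends only on
// apps in earlier waves. deps maps an app name to the names it depends on.
func Waves(deps map[string][]string) ([][]string, error) {
	indeg := make(map[string]int, len(deps))
	dependents := make(map[string][]string)
	for app, ds := range deps {
		indeg[app] = len(ds)
		for _, d := range ds {
			if _, ok := deps[d]; !ok {
				return nil, fmt.Errorf("%q depends on unknown app %q", app, d)
			}
			dependents[d] = append(dependents[d], app)
		}
	}

	var ready []string
	for app, n := range indeg {
		if n == 0 {
			ready = append(ready, app)
		}
	}

	var waves [][]string
	started := 0
	for len(ready) > 0 {
		sort.Strings(ready) // deterministic order within a wave
		waves = append(waves, ready)
		started += len(ready)
		var next []string
		for _, app := range ready {
			for _, dep := range dependents[app] {
				indeg[dep]--
				if indeg[dep] == 0 {
					next = append(next, dep)
				}
			}
		}
		ready = next
	}
	if started != len(deps) {
		return nil, fmt.Errorf("dependency cycle among %d apps", len(deps)-started)
	}
	return waves, nil
}

func main() {
	// Edges below are made up for the example.
	waves, err := Waves(map[string][]string{
		"log-server": nil,
		"pty":        nil,
		"resolver":   {"log-server"},
		"skynet-srv": {"resolver"},
	})
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for i, w := range waves {
		fmt.Printf("wave %d: %v\n", i, w)
	}
}
```

Apps inside one wave have no edges between them, so a launcher could start them concurrently. A cycle or a dependency on an unknown name is reported up front instead of surfacing as a hang during startup.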

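Per-app failure tolerance must be preserved: today, if vpn-client fails to start, the visor still comes up, and Phase 2 needs the same semantics. This is probably best modeled as `Wants:` vs `Requires:` (systemd-style): a failed `Wants:` dependency does not block the dependent app, while a failed `Requires:` dependency does.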
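A minimal sketch of how the `Wants:` / `Requires:` split might behave at start time, assuming the apps are already in dependency order. The `App` struct, `StartAll`, the helpers, and the app names other than vpn-client are hypothetical; the point is only that one app's failure is recorded and never aborts the loop.

```go
package main

import (
	"errors"
	"fmt"
)

// App is a hypothetical start-time view of an app definition.
type App struct {
	Name     string
	Requires []string // hard dependencies: skip this app if any is not running
	Wants    []string // soft dependencies: affect ordering only
	Start    func() error
}

// StartAll starts apps in the given, already dependency-sorted order.
// It never aborts: a failure is recorded per app and the loop continues,
// which is what keeps the visor up when a single app cannot start.
func StartAll(apps []App) map[string]error {
	result := make(map[string]error, len(apps))
	for _, a := range apps {
		var blocked error
		for _, r := range a.Requires {
			if err, seen := result[r]; !seen || err != nil {
				blocked = fmt.Errorf("required app %q is not running", r)
				break
			}
		}
		if blocked != nil {
			result[a.Name] = blocked
			continue
		}
		result[a.Name] = a.Start()
	}
	return result
}

// ok and failWith are toy start functions for the example.
func ok() error { return nil }

func failWith(msg string) func() error {
	return func() error { return errors.New(msg) }
}

func main() {
	res := StartAll([]App{
		{Name: "vpn-client", Start: failWith("no server configured")},
		{Name: "needs-vpn", Requires: []string{"vpn-client"}, Start: ok},
		{Name: "prefers-vpn", Wants: []string{"vpn-client"}, Start: ok},
	})
	for _, name := range []string{"vpn-client", "needs-vpn", "prefers-vpn"} {
		fmt.Printf("%s: %v\n", name, res[name])
	}
}
```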

### Disable-safety guardrail (deferred design)

PTY, hypervisor, and RPC share the property that disabling them can lock out remote management. The appdef needs a uniform `critical: true` flag that gates stop behind `--i-know-what-im-doing` or similar. The exact shape is deferred; locking it in is a small follow-up after the main framework lands.
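One possible shape for the gate, as a sketch under stated assumptions: `AppDef`, `Stop`, `ErrCritical`, and the `force` parameter (standing in for whatever override flag is eventually chosen) are hypothetical names, not a settled design.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrCritical is returned when a critical app is stopped without the override.
var ErrCritical = errors.New("refusing to stop critical app without explicit override")

// AppDef carries only the fields this example needs (hypothetical type).
type AppDef struct {
	Name     string
	Critical bool
}

// Stop runs the stop function unless the app is critical and the caller
// did not pass the override.
func Stop(def AppDef, force bool, stop func() error) error {
	if def.Critical && !force {
		return fmt.Errorf("%s: %w", def.Name, ErrCritical)
	}
	return stop()
}

func main() {
	pty := AppDef{Name: "pty", Critical: true}
	stop := func() error { fmt.Println("pty stopped"); return nil }

	fmt.Println("without override:", Stop(pty, false, stop))
	fmt.Println("with override:", Stop(pty, true, stop))
}
```

Putting the check in one `Stop` path, rather than in each app, is what would make the guardrail uniform across PTY, hypervisor, and any later critical app.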

### Handler registry status

`pkg/visor/service_registry.go` already exists as scaffold (`map[uint16]ConnHandler` with `Register / RegisterHidden / Get / List`). Per the original architecture (in `project_resolver_proxy_architecture.md`-era discussions), each DMSG-port service should register its inbound-connection handler there, and the sky-forwarding server should dispatch via `registry.Get(port)(conn)` instead of `net.Dial("localhost:PORT")`.

**Status:** the struct exists and is constructed at visor startup, but the per-service migration has not been done. dmsgctrl / dmsgpty / log server still use the old `dmsgC.Listen(port)` direct pattern. Finishing the handler-registry wiring is the natural zeroth step of this refactor.
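For readers who have not seen the scaffold, the following is a standalone sketch of what a port-to-handler registry with `Register / RegisterHidden / Get / List` could look like. It is not the code in `pkg/visor/service_registry.go`: the signatures, the `entry` struct, the `(handler, ok)` return from `Get`, the `Dispatch` helper, and `countingHandler` are assumptions made for the example. Port 7 for dmsgctrl comes from the text above.

```go
package main

import (
	"fmt"
	"io"
	"net"
	"sort"
	"sync"
)

// ConnHandler serves one inbound connection.
type ConnHandler func(net.Conn)

type entry struct {
	label   string
	hidden  bool
	handler ConnHandler
}

// Registry maps a port to the handler that serves it.
type Registry struct {
	mu sync.RWMutex
	m  map[uint16]entry
}

func NewRegistry() *Registry { return &Registry{m: make(map[uint16]entry)} }

func (r *Registry) register(port uint16, label string, h ConnHandler, hidden bool) error {
	r.mu.Lock()
	defer r.mu.Unlock()
	if old, ok := r.m[port]; ok {
		return fmt.Errorf("port %d already registered by %q", port, old.label)
	}
	r.m[port] = entry{label: label, hidden: hidden, handler: h}
	return nil
}

// Register adds a handler that shows up in List.
func (r *Registry) Register(port uint16, label string, h ConnHandler) error {
	return r.register(port, label, h, false)
}

// RegisterHidden adds a handler that is dispatchable but omitted from List.
func (r *Registry) RegisterHidden(port uint16, label string, h ConnHandler) error {
	return r.register(port, label, h, true)
}

// Get returns the handler for port, if any.
func (r *Registry) Get(port uint16) (ConnHandler, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	e, ok := r.m[port]
	return e.handler, ok
}

// List returns the visible ports in ascending order.
func (r *Registry) List() []uint16 {
	r.mu.RLock()
	defer r.mu.RUnlock()
	var ports []uint16
	for p, e := range r.m {
		if !e.hidden {
			ports = append(ports, p)
		}
	}
	sort.Slice(ports, func(i, j int) bool { return ports[i] < ports[j] })
	return ports
}

// Dispatch hands conn to the registered handler in-process, replacing a
// loopback dial to the same port.
func Dispatch(r *Registry, port uint16, conn net.Conn) error {
	h, ok := r.Get(port)
	if !ok {
		return fmt.Errorf("no handler registered for port %d", port)
	}
	h(conn)
	return nil
}

// countingHandler is a toy handler that only counts invocations.
func countingHandler(n *int) ConnHandler { return func(net.Conn) { *n++ } }

func main() {
	reg := NewRegistry()
	_ = reg.Register(7, "dmsgctrl", func(c net.Conn) {
		defer c.Close()
		fmt.Fprint(c, "pong")
	})
	client, server := net.Pipe()
	go Dispatch(reg, 7, server)
	reply, _ := io.ReadAll(client)
	fmt.Println(string(reply), reg.List())
}
```

In-process dispatch keeps the connection's identity intact, whereas a loopback dial presents every inbound peer to the service as a local TCP client.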

## Recommended sequencing

When this resumes post-release:

1. **Finish the handler-registry wiring.** Migrate dmsgctrl, dmsgpty, log server, route setup, debug endpoint into `v.services.Register(port, label, handler)`. No new functionality; just consolidates the visor's port→handler dispatch into one place. Pre-req for everything below.
2. **Decide A vs C for startup ordering, codify Phase 1.** Map current `init_*.go` modules to Phase 1 (always-on visor core). Anything not Phase 1 becomes an app.
3. **Define the app contract (appdef interface):** identity, supported modes, `Depends()`, `Critical()`, `Start/Stop/Status`. Migrate launcher's internal app handling to use it.
4. **Migrate one subsystem at a time** (smallest-surface-first): log server → skynet srv → persistent skynet client → resolver → pty → hypervisor. The pty and hypervisor migrations wait for step 5.
5. **Add disable-safety guardrail** (the `critical: true` gate) before migrating PTY / hypervisor.
6. **Once the routing surface is uniform across apps,** revisit per-app routing policy / visibility — the unified `cli visor app routes <name>` view (showing initiator + responder RGs with mux legs per app) becomes natural.
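The app contract named in step 3 could be sketched as a Go interface along the following lines. Every name here is an assumption: the method set follows the list in step 3 (identity, supported modes, `Depends()`, `Critical()`, `Start/Stop/Status`), while the `Mode` bit set, the `Status` values, and the toy `ptyApp` implementation are invented for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is a bit set of supported run-modes (hypothetical type).
type Mode uint8

const (
	VisorApp Mode = 1 << iota
	DmsgStandalone
	TCPStandalone
)

// Status is a coarse lifecycle state.
type Status string

const (
	StatusStopped Status = "stopped"
	StatusRunning Status = "running"
)

// AppDef is one possible shape for the app contract.
type AppDef interface {
	Name() string      // identity
	Modes() Mode       // supported run-modes
	Depends() []string // names of apps that must start first
	Critical() bool    // true if stopping can lock out remote management
	Start() error
	Stop() error
	Status() Status
}

// ptyApp is a toy implementation used only to exercise the interface.
type ptyApp struct{ running bool }

func (p *ptyApp) Name() string      { return "pty" }
func (p *ptyApp) Modes() Mode       { return VisorApp | DmsgStandalone | TCPStandalone }
func (p *ptyApp) Depends() []string { return nil } // DMSG is Phase 1, not an app
func (p *ptyApp) Critical() bool    { return true }

func (p *ptyApp) Start() error {
	if p.running {
		return errors.New("pty: already running")
	}
	p.running = true
	return nil
}

func (p *ptyApp) Stop() error {
	p.running = false
	return nil
}

func (p *ptyApp) Status() Status {
	if p.running {
		return StatusRunning
	}
	return StatusStopped
}

func main() {
	var app AppDef = &ptyApp{}
	fmt.Println(app.Name(), app.Status())
	if err := app.Start(); err != nil {
		fmt.Println("start failed:", err)
	}
	fmt.Println(app.Name(), app.Status(), "critical:", app.Critical())
}
```

With a contract like this, the launcher could drive every app through the same interface, which is the property the per-app routes view in step 6 relies on.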

## Decisions made in the design discussion

- Term: "app" stays as the universal name. "Service" reserved for deployment services.
- dmsgctrl is NOT an app — visor-internal control protocol only.
- Hypervisor UI IS an app (lean) — it's the one visor-starts-TCP exception.
- Resolver IS an app for management/visibility uniformity, even though standalone-TCP isn't applicable to it.
- Log server moves to Phase 2 as an autostart-by-default app (initial inclination was Phase 1; operator flipped to "make it configurable as an app").
- PTY moves to Phase 2 as an app, with the disable-safety guardrail before the migration lands.
- TCP-standalone is exclusively user-launched (visor doesn't start TCP) — except hypervisor UI.
- dmsg-standalone implementations stay in code but are hidden from main `skywire dmsg` subcommand; remain callable via dedicated `cmd/dmsg/dmsg.go` binary.
- `cmd/standalone/` is the new home for TCP-standalone-only implementations.

## Out of scope / deferred for this issue

- Per-app routing policy + mux/hops controls (`route-mode: auto|direct|private|mux:N:hops:K|via:PKs`). Distinct downstream work that benefits from the unified app framework but isn't blocked by it.
- Source-tag on route-groups (`auto`/`app`/`operator`) for autoconnect-vs-manual coexistence. Same — downstream.
- `cli visor app routes <name>` unified view. Builds on this refactor.

## Provenance

Filed 2026-05-23 by operator decision after a multi-turn design discussion. Original spark: "we are really lacking … a good interface for multihop / multiplexed routing control … interface to show what routes are actually in use and by what client app … well-defined automatic multihop / multiplexed routing modes … manual routing controls that don't fight with automatic modes." The discussion converged on "most of these gaps share a common root — the app framework is incomplete, so half the visor's work happens outside it." This issue captures the framework-side fix.
