sboxai

Front-door CLI for orchestrating a Docker sandbox and driving autonomous claude/codex coding tasks inside it — defaults to the interactive Claude/Codex TUI on your subscription, with an opt-in headless mode.

What & why

By default sboxai drives the interactive Claude TUI (claude --dangerously-skip-permissions) over tmux inside the sandbox, drawing your normal Claude subscription allowance — cost-free beyond the subscription. An opt-in headless mode (claude -p) is also available for a clean structured result, but it draws the separate monthly Agent-SDK credit pool ($20 Pro / $100 Max5x / $200 Max20x as of June 2026), not your interactive subscription — that billing difference is why it's opt-in. See Execution modes for details.

A lightweight daemon (Phase 2) manages the task queue over a unix socket. Status is derived from the daemon plus live state (Docker, tmux, the filesystem).

Prerequisites

Docker Desktop with the docker sandbox plugin.
Bun (for dev and building from source).
A manual OAuth login for claude/codex performed inside the sandbox via sboxai login — it cannot be scripted.

Install

Build a single self-contained binary for your host:

```bash
make build      # → dist/sboxai
```

Or download a release binary from GitHub Releases. Binaries are published per platform:

| Binary | Platform |
| --- | --- |
| `sboxai-darwin-arm64` | macOS Apple Silicon |
| `sboxai-darwin-x64` | macOS Intel |
| `sboxai-linux-x64` | Linux x86_64 |
| `sboxai-linux-arm64` | Linux ARM64 |

Verify a download against the published checksums:

```bash
sha256sum -c SHA256SUMS                      # verify all binaries present
sha256sum -c --ignore-missing SHA256SUMS     # verify just the one you downloaded
```
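Where `sha256sum` is not available (for example in a Bun-based install script), the same check can be approximated in TypeScript. This is a minimal sketch, assuming `SHA256SUMS` uses the standard `sha256sum` line format (`<hex digest>  <filename>`); the helper names `parseSumLine`, `sha256Hex`, and `verifyFile` are hypothetical and not part of sboxai.

```typescript
import { createHash } from "node:crypto";
import { readFileSync } from "node:fs";
import { basename } from "node:path";

// Parse one sha256sum-style line: "<64 hex chars>  <filename>".
// A leading "*" on the filename (binary-mode marker) is dropped.
function parseSumLine(line: string): { hash: string; file: string } | null {
  const m = /^([0-9a-fA-F]{64})\s+\*?(.+)$/.exec(line.trim());
  return m ? { hash: m[1].toLowerCase(), file: m[2] } : null;
}

// Hex-encoded SHA-256 digest of a buffer or string.
function sha256Hex(data: Uint8Array | string): string {
  return createHash("sha256").update(data).digest("hex");
}

// True only if the binary's name is listed in the sums file
// and its digest matches the listed one.
function verifyFile(sumsPath: string, binaryPath: string): boolean {
  const name = basename(binaryPath);
  const entry = readFileSync(sumsPath, "utf8")
    .split("\n")
    .map(parseSumLine)
    .find((e) => e !== null && e.file === name);
  if (!entry) return false;
  return sha256Hex(readFileSync(binaryPath)) === entry.hash;
}
```

For example, `verifyFile("SHA256SUMS", "sboxai-linux-x64")` returns `false` both when the digest differs and when the file is not listed at all, which is stricter than `--ignore-missing`.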

Quickstart

```bash
# 1. Build/refresh the sandbox (clones the project to work on)
sboxai setup --root ~/sandbox --repo-url https://github.com/you/project.git

# 2. Authenticate claude/codex inside the sandbox (manual OAuth)
sboxai login --root ~/sandbox

# 3. Confirm the plugin, a running sandbox, and valid Claude auth
sboxai doctor --root ~/sandbox

# 4. Drive a task to completion
sboxai run "fix the failing parser test" --root ~/sandbox
```

Set SBOXAI_ROOT once and subsequent commands can omit --root:

```bash
export SBOXAI_ROOT=~/sandbox
sboxai doctor
sboxai run "fix the failing parser test"
```

Command reference

--root may be supplied via the SBOXAI_ROOT env var; --name via SBOXAI_SANDBOX (default dev).

Lifecycle

| Command | Description | Flags |
| --- | --- | --- |
| `setup` | Build/refresh the Docker sandbox via the vendored setup script | `--root <dir>`, `--name <name>`, `--repo-url <url>`, `--repo-branch <branch>` (default `main`) |
| `up` | Ensure the sandbox exists and is running (wakes a stopped one) | `--root <dir>`, `--name <name>` |
| `shell` | Open an interactive shell inside the sandbox | `--root <dir>`, `--name <name>` |
| `stop` | Stop the sandbox | `--root <dir>`, `--name <name>` |
| `rm` | Remove the sandbox (host `.auth/` survives) | `--root <dir>`, `--name <name>`, `--yes` |

Auth

| Command | Description | Flags |
| --- | --- | --- |
| `login` | Drop into the sandbox shell to run `claude login` / `codex login` | `--root <dir>`, `--name <name>` |
| `doctor` | Read-only health checklist (exits nonzero if a critical check fails) | `--root <dir>`, `--name <name>` |

Execution

| Command | Description | Flags |
| --- | --- | --- |
| `run <task>` | Run one autonomous task in the sandbox (sequential, synchronous) | `--root`, `--name`, `--workdir`, `--worktree`, `--repo`, `--model`, `--timeout`, `--mode tui\|headless` |
| `submit <task>` | Enqueue a task for daemon execution (async) | `--root`, `--name`, `--workdir`, `--worktree`, `--repo`, `--model`, `--timeout`, `--mode tui\|headless` |
| `mode [value]` | Show the effective execution mode, or set it (tui \| headless); the setting is persisted to `.sboxai/config.json` | `--root`, `--name` |
| `cancel <id>` | Cancel a queued or running task | `--root`, `--name` |
| `status` | Live status: sandbox, daemon, tasks, worktrees | `--root`, `--name` |
| `logs <slug>` | Print (or follow) the last-run transcript log for a slug | `--root`, `--name`, `-f, --follow`, `--workdir` |

Execution modes (tui | headless)

Two execution modes; the default is tui.

tui (default) — drives the interactive Claude Code TUI inside the sandbox (via the vendored tui-exec.sh over tmux). Runs under your Claude subscription, so it's cost-free beyond the subscription. Completion is inferred from the TUI going idle.
headless — runs claude -p --output-format text --dangerously-skip-permissions with the prompt piped on STDIN. Gives a clean exit code / result with no TUI screen-scraping, but draws the separate monthly Agent-SDK credit pool ($20 Pro / $100 Max5x / $200 Max20x), not your interactive subscription — hence opt-in.
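The headless invocation described above can be sketched in TypeScript as follows. This is an illustration under assumptions, not sboxai's actual driver: the function names are hypothetical, passing `--model` in headless mode is assumed, and the real tool runs the command inside the sandbox rather than directly on the host. Returning 3 for an empty prompt mirrors the "bad input" code in the exit-code contract.

```typescript
import { spawnSync } from "node:child_process";

// Build the argv for a headless run; the flags are the ones named above.
function headlessArgs(model?: string): string[] {
  const args = ["-p", "--output-format", "text", "--dangerously-skip-permissions"];
  if (model) args.push("--model", model); // assumption: --model is forwarded
  return args;
}

// Run claude headlessly with the prompt piped on STDIN.
function runHeadless(prompt: string, model?: string): { rc: number; output: string } {
  // Reject a missing/empty prompt before spawning anything.
  if (prompt.trim() === "") return { rc: 3, output: "" };
  const res = spawnSync("claude", headlessArgs(model), {
    input: prompt, // delivered to the child's STDIN
    encoding: "utf8",
  });
  // status is null when the process could not be started or was killed.
  return { rc: res.status ?? 1, output: res.stdout ?? "" };
}
```

Piping the prompt on STDIN rather than passing it as an argument avoids shell-quoting problems with multi-line task descriptions.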

Switch modes:

```bash
sboxai mode                       # show the effective mode and where it came from
sboxai mode headless              # set + persist to .sboxai/config.json
sboxai mode tui                   # switch back
sboxai run --mode headless "…"    # per-run override (also on `submit`)
```

Precedence: --mode flag → SBOXAI_MODE env → .sboxai/config.json → default tui.
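The precedence chain can be pictured as a first-match lookup. The sketch below is illustrative only: `resolveMode` is a hypothetical name, the `mode` key in `.sboxai/config.json` is an assumed schema, and rejecting unknown values with an error is an assumption about behavior.

```typescript
type Mode = "tui" | "headless";
type Source = "flag" | "env" | "config" | "default";

function isMode(v: unknown): v is Mode {
  return v === "tui" || v === "headless";
}

// Walk the sources in precedence order and return the first one that is set.
function resolveMode(
  flag: string | undefined,
  env: Record<string, string | undefined>,
  config: { mode?: unknown }, // assumed shape of .sboxai/config.json
): { mode: Mode; source: Source } {
  const candidates: Array<[Source, unknown]> = [
    ["flag", flag],
    ["env", env.SBOXAI_MODE],
    ["config", config.mode],
  ];
  for (const [source, value] of candidates) {
    if (value === undefined) continue;
    if (!isMode(value)) {
      throw new Error(`invalid mode from ${source}: ${String(value)}`);
    }
    return { mode: value, source };
  }
  return { mode: "tui", source: "default" };
}
```

Returning the source alongside the mode is what would let a command like `sboxai mode` report where the effective value came from.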

Autonomous execution

Every run's prompt is prefixed with a contract instructing the agent to run fully unattended — never ask questions, never block on a decision, make its own judgment calls and proceed. There is no human-reply path by design; ambiguity is resolved by the agent, not surfaced to a human.

Daemon

| Command | Description | Flags |
| --- | --- | --- |
| `daemon start` | Start the background task daemon | `--root`, `--name` |
| `daemon stop` | Stop the running daemon | `--root`, `--name` |
| `daemon status` | Check if daemon is running | `--root`, `--name` |
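The daemon's queue is backed by one JSON file per task under `.sboxai/tasks/` and is drained sequentially. A minimal sketch of such a store is shown here; the `Task` fields, the file naming, and ordering by `createdAt` are assumptions made for illustration and do not reflect the actual schema in `src/lib/task.ts`.

```typescript
import { mkdirSync, mkdtempSync, readdirSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Hypothetical task shape; the real schema may differ.
interface Task {
  id: string;
  prompt: string;
  status: "queued" | "running" | "done" | "failed" | "cancelled";
  createdAt: number;
}

// Persist one task as <dir>/<id>.json.
function saveTask(dir: string, task: Task): void {
  mkdirSync(dir, { recursive: true });
  writeFileSync(join(dir, `${task.id}.json`), JSON.stringify(task, null, 2));
}

// Load every task file, oldest first.
function loadTasks(dir: string): Task[] {
  return readdirSync(dir)
    .filter((f) => f.endsWith(".json"))
    .map((f) => JSON.parse(readFileSync(join(dir, f), "utf8")) as Task)
    .sort((a, b) => a.createdAt - b.createdAt);
}

// A sequential drain loop would repeatedly pick the oldest queued task.
function nextQueued(dir: string): Task | undefined {
  return loadTasks(dir).find((t) => t.status === "queued");
}

// Usage against a throwaway directory.
const dir = mkdtempSync(join(tmpdir(), "sboxai-tasks-"));
saveTask(dir, { id: "t2", prompt: "second", status: "queued", createdAt: 2 });
saveTask(dir, { id: "t1", prompt: "first", status: "queued", createdAt: 1 });
const next = nextQueued(dir);
console.log(next?.id); // prints "t1"
```

Keeping each task in its own file means a cancel or status update rewrites only one small file, and the queue survives a daemon restart without a separate database.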

How it works

Interactive TUI via tmux. A task launches claude --dangerously-skip-permissions [--model X] inside a detached tmux session (harness-<pid>-<epoch>), waits for the TUI to be ready, and injects the prompt via bracketed paste so multi-line text isn't submitted line-by-line.
Best-effort completion heuristic. The executor polls the pane on an interval, hashes its content, and counts consecutive "stable" polls (3 by default), meaning polls where the content is unchanged and no "esc to interrupt" activity indicator is present. Because the executor also detects the idle ❯ prompt (and the absence of a "working" indicator), a run finishes promptly once the agent goes idle instead of waiting out the full timeout. The interactive TUI emits no structured result or cost signal, so ok/fail is inferred from the executor exit code.
Decision + transcript logs. The agent records consequential decisions (with reasoning) plus a final ## Summary in a decision log at <workdir>/.sboxai-decisions.md, alongside the per-run transcript log at <workdir>/.sboxai-last-run.log. Both paths are printed in the run's verdict line:
```
✓ task ok (rc=0) — transcript: <workdir>/.sboxai-last-run.log — decisions: <workdir>/.sboxai-decisions.md
```
Worktree per task. With --worktree <slug> --repo <repoDir>, the run gets an isolated git worktree at worktree/<slug> on branch sboxai/<slug>.
Task store. Tasks are persisted as JSON files under .sboxai/tasks/. The daemon drains the queue sequentially; status is derived from the daemon plus live Docker/tmux/filesystem state.
Exit-code contract. 0 ok, 2 error marker (e.g. rate/usage limit), 3 bad input (missing/empty prompt), 124 timeout.
Credential boundary. Org-integration credentials (Linear/GitHub/Slack) must never enter the sandbox.
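A caller that wraps `sboxai run` might translate the exit-code contract and the two log paths as in the sketch below. The function and type names are hypothetical; only the numeric codes and the file names come from the description above.

```typescript
import { posix } from "node:path";

type Outcome = "ok" | "error-marker" | "bad-input" | "timeout" | "unknown";

// Map an executor exit code onto the documented contract.
function classifyExit(rc: number): Outcome {
  switch (rc) {
    case 0:
      return "ok";
    case 2:
      return "error-marker"; // e.g. a rate/usage limit was hit
    case 3:
      return "bad-input"; // missing or empty prompt
    case 124:
      return "timeout";
    default:
      return "unknown"; // anything outside the contract
  }
}

// Locations of the per-run logs inside the working directory.
function logPaths(workdir: string): { transcript: string; decisions: string } {
  return {
    transcript: posix.join(workdir, ".sboxai-last-run.log"),
    decisions: posix.join(workdir, ".sboxai-decisions.md"),
  };
}
```

Treating unlisted codes as `unknown` rather than as failures of a specific kind keeps the wrapper honest about what the contract actually promises.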

Architecture

```
host: sboxai CLI
        │  docker sandbox exec  (env: PROMPT_FILE, TASK_TIMEOUT, CLAUDE_MODEL, TUI_WORKDIR)
        ▼
in-container: tui-exec.sh  (vendored, refreshed when stale)
        │  tmux new-session  harness-<pid>-<epoch>
        ▼
        claude --dangerously-skip-permissions   ← interactive TUI, your subscription
```

Roadmap

Phase 1 (done): lifecycle (setup/up/shell/stop/rm), auth (login/doctor), and sequential execution (run/status/logs) over live-derived state.
Phase 2 (done): a long-lived daemon, unix-socket IPC, a task queue, the submit/cancel/daemon commands, and run routed through the daemon.
Phase 3 (planned): a bounded-concurrency parallel pool with rate-limit back-pressure.

Development

```bash
make help     # list all targets
make check    # typecheck + lint + test (full gate)
make build    # compile a single host binary → dist/sboxai
```

Repo layout:

```
src/index.ts               commander entry — wires all commands
src/commands/lifecycle.ts  setup/up/shell/stop/rm
src/commands/auth.ts       login/doctor
src/commands/exec.ts       run/submit/cancel/status/logs
src/commands/daemon.ts     daemon start/stop/status
src/commands/shared.ts     shared command helpers
src/lib/config.ts          derivePaths/resolveConfig (pure)
src/lib/docker.ts          docker sandbox ls parsing + invocation
src/lib/auth.ts            OAuth credential evaluation (pure)
src/lib/worktree.ts        slugify + git worktree management
src/lib/exec.ts            host-side driver that calls tui-exec.sh
src/lib/task.ts            task model + JSON-per-task file store
src/lib/ipc.ts             IPC client helpers (HTTP-over-unix-socket)
src/lib/daemon.ts          daemon server (HTTP + queue drain loop)
src/lib/__tests__/         bun tests (incl. task.test.ts)
scripts/tui-exec.sh        vendored in-container tmux/TUI executor
scripts/sandbox-setup.sh   vendored sandbox build/setup script
```

Caveats

The completion heuristic (pane stability + absence of "esc to interrupt") is brittle: it scrapes the TUI rather than reading a structured signal. This is a property of the default tui mode; headless sidesteps it with a structured result, at the cost of separate billing.
Future parallelism is capped (~2–3) by the subscription rate tier, not the tooling.
Container clock skew can offset transcript timestamps.
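To make the first caveat concrete, the pane-stability heuristic can be modeled in a few lines of TypeScript. The real executor is the vendored shell script `tui-exec.sh`, so this is only a sketch of the logic as described: `makeIdleDetector` is a hypothetical name, SHA-256 is an assumed hash, and the default of 3 stable polls and the "esc to interrupt" marker are taken from the description of the heuristic.

```typescript
import { createHash } from "node:crypto";

interface IdleDetector {
  // Feed one captured pane snapshot; returns true once the pane looks idle.
  observe(pane: string): boolean;
}

function makeIdleDetector(requiredStable = 3, busyMarker = "esc to interrupt"): IdleDetector {
  let lastHash: string | null = null;
  let stable = 0;
  return {
    observe(pane: string): boolean {
      const hash = createHash("sha256").update(pane).digest("hex");
      const busy = pane.includes(busyMarker);
      // A poll is "stable" only if nothing changed and no activity indicator shows;
      // any change or activity resets the streak.
      stable = !busy && hash === lastHash ? stable + 1 : 0;
      lastHash = hash;
      return stable >= requiredStable;
    },
  };
}
```

The brittleness is visible here: a pane that happens to stay visually unchanged while the agent is still working, or an indicator string that changes in a future TUI release, would both defeat the check.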

License

MIT — see LICENSE.
