A tiny kernel where the processor is a language model: ten primitives, a split-screen TUI to see the states, and one idea — you extend the machine from within, in its own medium, instead of assembling an agent from modules outside. The LLM is the processor, the context window is C (working memory), everything else is deterministic harness. A paradigm to look at, not a product: build-from-within vs the framework-assembly everyone already knows.
the machine = { C, M } # its own state — this is what you'd serialize/snapshot
C = held, append-only context # working memory; one per running frame
M = durable name→value store # survives power-off (mirrored to m.json); searched, not held
the world = E # OUTSIDE the machine, not part of its state — reached only
# via perceive/act. here E has one channel: the terminal
llm = one LLM call = one step # C → (text, operations); the ONLY stochastic part
step = SEE → PARSE → RUN → APPEND → GATE
primitives (10, five concepts):
memory remember · recall · search · forget
environment perceive · act
gate wait
abstraction invoke · return # the call stack — routines as subroutines
computation exec # the deterministic coprocessor
| primitive | does |
|---|---|
| `remember(addr,value)` | C→M : write durable memory |
| `recall(addr)` | M→C : read by exact key |
| `search(query)` | M→C : find data/routines relevant to a query |
| `forget(addr)` | remove an entry from M; the only deleting op (how you change your mind) |
| `perceive(channel)` | E→C : read input that arrived on a channel |
| `act(channel,value)` | C→E : emit (e.g. speak to the user) |
| `wait()` | halt until the user sends input |
| `invoke(name,args)` | run a routine from M as a subroutine; returns a value |
| `return(value)` | finish a subroutine, hand a value to its caller |
| `exec(code,args)` | run deterministic JS on the coprocessor: exact, no LLM |
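The step cycle and a few of the primitives above can be sketched as a small loop. This is a minimal illustration, not the repo's code: the names `Op`, `Machine`, `Processor`, and `step` are hypothetical, only four of the ten primitives are dispatched, and the processor is a scripted stub so the run is deterministic.

```typescript
// Hypothetical sketch of one machine step: SEE -> PARSE -> RUN -> APPEND -> GATE.
// All names here are illustrative assumptions, not the repo's API.

type Op =
  | { op: "remember"; addr: string; value: string }
  | { op: "recall"; addr: string }
  | { op: "act"; channel: string; value: string }
  | { op: "wait" };

interface Machine {
  C: string[];            // held context (working memory)
  M: Map<string, string>; // durable name -> value store
  out: string[];          // stands in for E's terminal channel
}

// The processor: C -> (text, operations). A real one would call a model.
type Processor = (C: readonly string[]) => { text: string; ops: Op[] };

function step(m: Machine, llm: Processor): "continue" | "yield" {
  const { text, ops } = llm(m.C); // SEE + PARSE (the stub returns parsed ops)
  const results: string[] = [];
  let gate: "continue" | "yield" = "continue";
  for (const o of ops) {          // RUN: deterministic dispatch
    switch (o.op) {
      case "remember": m.M.set(o.addr, o.value); results.push(`remember ${o.addr}: ok`); break;
      case "recall": results.push(`recall ${o.addr}: ${m.M.get(o.addr) ?? "(none)"}`); break;
      case "act": m.out.push(o.value); results.push(`act ${o.channel}: sent`); break;
      case "wait": gate = "yield"; break;
    }
  }
  m.C.push(text, ...results);     // APPEND: the step's text and op results join C
  return gate;                    // GATE: keep stepping, or yield to the user
}

// Scripted stand-in for the LLM, so nothing here is stochastic.
const script: Array<{ text: string; ops: Op[] }> = [
  { text: "store the name", ops: [{ op: "remember", addr: "user/name", value: "Anna" }] },
  { text: "read it back", ops: [{ op: "recall", addr: "user/name" }] },
  { text: "greet", ops: [{ op: "act", channel: "terminal", value: "Hello, Anna" }, { op: "wait" }] },
];
let cursor = 0;
const stubLlm: Processor = () => {
  const next = script[cursor++];
  if (!next) throw new Error("script exhausted");
  return next;
};

const machine: Machine = { C: ["user: my name is Anna"], M: new Map(), out: [] };
let steps = 0;
while (steps < 16) {              // a burst cap, in the spirit of MAX_BURST
  steps++;
  if (step(machine, stubLlm) === "yield") break;
}
```

Only the `Processor` would be non-deterministic in a real run; everything inside `step` after the call is plain dispatch.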
- Non-determinism is quarantined to the single `llm()` call. Hold / parse / run / append / the whole datapath are exact, deterministic harness.
- No program counter, no arithmetic opcodes. What the LLM recalls/perceives next is simply where it goes; all "compute" is semantic (inside the step) or offloaded to `exec`.
- Frames = the call stack. `invoke` runs a routine in its own fresh C (so a subroutine never clutters the caller) while sharing M and the ports; `return` hands a value back. The JS call stack mirrors the frame stack. Composition (a routine invoking routines) holds several levels deep.
- Build from within. A routine is just an entry in M: remember a prompt under a name (that is the routine), then `invoke` it. A prompt-routine runs as a sub-agent (flexible); an entry stored as `exec:(args) => ...` runs as code instantly on the coprocessor (exact, free). `invoke` is polymorphic: callers don't change when a routine is hardened from prompt to code. Soft routines are portable: they live in `m.json`, so copying it carries your capabilities.
- Selective projection. The frame shows only the size of M, never its keys; the directory is queried with `search`, not held resident. This scales to any library size (O(1) per step), keeps the context clean, and makes invoking a routine deliberate instead of reflexive. (The search backend is lexical today; it swaps to embeddings/hybrid behind `Memory.search`.)
- Context paging (consolidation). When C alone (the uncompressed working set) passes a budget, the harness folds its oldest part into one compact episode note (`episode/N` in M) and keeps the recent tail. The last few notes ride back in as a recap: a leading message before C (recalled history, not a system instruction), so `tools` + `system` stay a stable prefix that a fold doesn't churn. Older notes stay searchable. Automatic: the agent doesn't manage it, like an OS paging memory (the harness both detects the pressure and services the fold; the agent is the oblivious "userspace" program). This is the third memory tier between live C and deliberate M: involuntary, compressed, persisted. It is episodic memory, distinct from what the agent chose to `remember`. (Notes are compressed, so M never bloats and episodes can't nest.)
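The context-paging behaviour in the last bullet can be sketched as follows. This is a rough illustration under stated assumptions: `foldIfNeeded`, `recap`, `estimateTokens`, and the `Summarizer` type are hypothetical names, the chars/4 token estimate is a stand-in, and how the real harness compresses the old messages is not specified in this document, so the summarizer is passed in and stubbed. The `budget`, `keep`, and `count` parameters play the roles of `FOLD_BUDGET`, `FOLD_KEEP`, and `RECAP_EPISODES`.

```typescript
// Hypothetical sketch of context paging. Names and the chars/4 token
// estimate are assumptions, not the repo's code.

type Summarizer = (messages: readonly string[]) => string;

const estimateTokens = (messages: readonly string[]): number =>
  Math.ceil(messages.reduce((n, msg) => n + msg.length, 0) / 4);

interface FoldState {
  C: string[];            // live working set
  M: Map<string, string>; // durable store; episode notes land here
  episodes: number;       // how many notes have been written so far
}

// Fold the oldest part of C into one note when C alone exceeds the budget.
function foldIfNeeded(s: FoldState, budget: number, keep: number, summarize: Summarizer): boolean {
  if (estimateTokens(s.C) <= budget || s.C.length <= keep) return false;
  const old = s.C.slice(0, s.C.length - keep); // oldest part, to be compressed
  const tail = s.C.slice(s.C.length - keep);   // recent messages stay verbatim
  s.episodes += 1;
  s.M.set(`episode/${s.episodes}`, summarize(old));
  s.C = tail;
  return true;
}

// The recap: the last `count` episode notes, oldest first, shown ahead of C.
function recap(s: FoldState, count: number): string[] {
  const notes: string[] = [];
  for (let n = Math.max(1, s.episodes - count + 1); n <= s.episodes; n++) {
    const note = s.M.get(`episode/${n}`);
    if (note !== undefined) notes.push(note);
  }
  return notes;
}

// Usage with a stub summarizer (a real one would produce an actual summary).
const state: FoldState = { C: [], M: new Map(), episodes: 0 };
for (let i = 1; i <= 12; i++) state.C.push(`message ${i}: ` + "x".repeat(40));
const stubSummarize: Summarizer = (msgs) => `[${msgs.length} messages folded]`;
const folded = foldIfNeeded(state, 100, 4, stubSummarize);
```

Because the note is written to M under an ordinary key, older episodes remain reachable through `search` even after they drop out of the recap window.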
A split-screen TUI built on Ink (React for terminals) — Yoga flexbox + width-aware borders, so the frame never drifts when emoji/wide text appears. Needs a real terminal (TTY).
╭ conversation · agents ──────╮╭ machine · model ─────────────╮
│ you ▸ make a card for Anna ││ ▸ 13 greet·d2 ● run 2.5s │ ← action stream
│ ◆ machine ▸ ... ││ │ invoke upper → "HELLO.." │
│ ▶ invoke greet(Anna) ││ ▸ 14 main ● run 2.4s ↑3k │
│ ◀ greet → "HELLO ANNA" │╰──────────────────────────────╯
│ ↳ card ▸ HELLO ANNA --- … │╭ state ───────────────────────╮
│ ││ M·6 upper greet signoff … │ ← live, in place
╰─────────────────────────────╯│ C·main 23 msgs ~1.2k steps19 │
╰──────────────────────────────╯
╭ ❯ ────────────────────────────────────────────────────────────╮
│ ❯ ▮ your turn · tab:focus ↑↓ C-c │
╰────────────────────────────────────────────────────────────────╯
- Left: conversation + the agent call tree (sub-routines nested by depth).
- Right, top: the action stream. Each step's operations, color-coded, nested by frame.
- Right, bottom: state, live and in place. M cells with their sizes; C vs the fold budget (`~C/budget`, the foldable working set) and the real total sent (`in`); and a bar breakdown of where the input tokens go: `tools · system · recap · C·work` (only `C·work` is what folding shrinks).
- Input: user input + status.
Keys: type to talk · Tab cycles focus (input → left → right) · with a pane focused, ↑↓ /
PgUp PgDn scroll and G jumps to latest · Ctrl-C quits.
git clone https://github.com/turing-machines/agent-kernel
cd agent-kernel
npm install
# ANTHROPIC_API_KEY is read from your shell, or copy .env.example → .env
cp seed.json m.json # optional: load the bundled demo routines
npm start # run in a real terminal (the TUI needs a TTY)

/mem dump memory M (all cells, full values) to the stream
/c [n] dump last n messages of the main context
/wipe clear durable memory M (fresh machine)
/quit power off
Anything else is user input fed to the machine on the terminal channel.
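The split between slash commands and ordinary input can be pictured with a small parser. This is only a sketch: `Command` and `parseLine` are hypothetical names, and how the real input handler treats a malformed count for `/c` is an assumption (here it falls back to no count).

```typescript
// Hypothetical sketch: classify a typed line as a slash command or user input.
type Command =
  | { kind: "mem" }
  | { kind: "c"; n?: number }
  | { kind: "wipe" }
  | { kind: "quit" }
  | { kind: "input"; text: string };

function parseLine(line: string): Command {
  const trimmed = line.trim();
  const [head = "", arg] = trimmed.split(/\s+/);
  switch (head) {
    case "/mem": return { kind: "mem" };
    case "/wipe": return { kind: "wipe" };
    case "/quit": return { kind: "quit" };
    case "/c": {
      // Optional count; anything that is not a positive integer is ignored.
      const n = arg === undefined ? NaN : Number.parseInt(arg, 10);
      return Number.isInteger(n) && n > 0 ? { kind: "c", n } : { kind: "c" };
    }
    // Anything else goes to the machine on the terminal channel.
    default: return { kind: "input", text: trimmed };
  }
}

const examples = ["/mem", "/c 5", "/c", "make me a card for Anna"].map(parseLine);
```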
- Tell it your name, then restart (`/quit`, `npm start`): it `search`es M and greets you with continuity. Durable identity lives in `m.json`.
- "make me a card for Anna": watch it `search` M, find the `card` routine, and `invoke` it; `card` invokes `greet` and `signoff`, which invoke the code routine `upper`. A 3-deep call tree.
- Remember a new routine ("remember a routine `shout` that uppercases its input via exec"), then invoke it: extend the machine without leaving it.
- Just chat: it should not run routines reflexively; invocation is deliberate (search → invoke).
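The search-then-invoke flow and the prompt-to-code hardening exercised by these demos can be sketched like this. It is an illustration under assumptions: `search`, `invoke`, `RunPrompt`, and `EXEC_MARKER` are hypothetical names, the lexical scoring is a naive stand-in for whatever `Memory.search` does, and the sub-agent is a stub. The code routine is evaluated with `new Function` purely for brevity, which is not a sandbox.

```typescript
// Hypothetical sketch: routines live in M; a value starting with "exec:" is
// code, anything else is a prompt. Names and scoring are assumptions.

type RunPrompt = (prompt: string, args: string) => string; // stands in for a sub-agent frame

const EXEC_MARKER = "exec:";

// Naive lexical search: rank keys by how many query words appear in key + value.
function search(M: Map<string, string>, query: string): string[] {
  const words = query.toLowerCase().split(/\W+/).filter((w) => w.length > 2);
  return [...M.entries()]
    .map(([key, value]) => {
      const hay = `${key} ${value}`.toLowerCase();
      return { key, score: words.filter((w) => hay.includes(w)).length };
    })
    .filter((hit) => hit.score > 0)
    .sort((a, b) => b.score - a.score)
    .map((hit) => hit.key);
}

// One entry point for both kinds of routine, so callers never change.
function invoke(M: Map<string, string>, name: string, args: string, runPrompt: RunPrompt): string {
  const entry = M.get(name);
  if (entry === undefined) throw new Error(`no routine named ${name}`);
  if (entry.startsWith(EXEC_MARKER)) {
    // Code routine. The README describes a sandboxed coprocessor;
    // new Function is NOT a sandbox and is used here only to keep this short.
    const fn = new Function(`return (${entry.slice(EXEC_MARKER.length)});`)() as (a: string) => unknown;
    return String(fn(args));
  }
  return runPrompt(entry, args); // prompt routine: run as a sub-agent
}

const M = new Map<string, string>();
const fakeAgent: RunPrompt = (_prompt, args) => args.toUpperCase() + "!"; // stub, no model call

M.set("shout", "Uppercase the input and return it.");  // soft (prompt) routine
const soft = invoke(M, "shout", "hello", fakeAgent);   // goes through the stub agent

M.set("shout", "exec:(args) => args.toUpperCase()");   // hardened to code
const hard = invoke(M, "shout", "hello", fakeAgent);   // same call, now exact

const hits = search(M, "uppercase a string");
```

The caller's line is identical before and after hardening; only the entry stored under `shout` changed.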
| var | default | meaning |
|---|---|---|
| `MODEL` | `claude-sonnet-4-6` | the processor (the LLM) |
| `MAX_TOKENS` | `1024` | per-step output (context write) budget |
| `MAX_BURST` | `16` | max main-frame steps without yielding to the user (cost guard) |
| `MAX_STEPS` | `12` | per-subroutine step budget (runaway guard) |
| `MAX_DEPTH` | `5` | `invoke` recursion-depth guard |
| `FOLD_BUDGET` | `6000` | tokens of C alone (the working set) before its oldest part folds into an episode note |
| `FOLD_KEEP` | `8` | recent messages kept verbatim when folding |
| `RECAP_EPISODES` | `4` | recent episode notes surfaced as the session-so-far recap |
| `MEM_FILE` | `m.json` | durable disk image for M |
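Reading these settings with their defaults might look like the sketch below. `Config`, `Env`, `positiveInt`, and `loadConfig` are hypothetical names, and rejecting non-integer values with an error is an assumption about validation, not something this document states. The environment is passed in as a plain object so the example does not depend on Node typings; in a Node program the argument would be `process.env`.

```typescript
// Hypothetical sketch: resolve the settings above from an environment object.
type Env = Record<string, string | undefined>;

interface Config {
  MODEL: string;
  MAX_TOKENS: number;
  MAX_BURST: number;
  MAX_STEPS: number;
  MAX_DEPTH: number;
  FOLD_BUDGET: number;
  FOLD_KEEP: number;
  RECAP_EPISODES: number;
  MEM_FILE: string;
}

// Assumption: numeric settings must be positive integers; unset means default.
function positiveInt(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const n = Number(raw);
  if (!Number.isInteger(n) || n <= 0) {
    throw new Error(`${key} must be a positive integer, got "${raw}"`);
  }
  return n;
}

function loadConfig(env: Env): Config {
  return {
    MODEL: env["MODEL"] ?? "claude-sonnet-4-6",
    MAX_TOKENS: positiveInt(env, "MAX_TOKENS", 1024),
    MAX_BURST: positiveInt(env, "MAX_BURST", 16),
    MAX_STEPS: positiveInt(env, "MAX_STEPS", 12),
    MAX_DEPTH: positiveInt(env, "MAX_DEPTH", 5),
    FOLD_BUDGET: positiveInt(env, "FOLD_BUDGET", 6000),
    FOLD_KEEP: positiveInt(env, "FOLD_KEEP", 8),
    RECAP_EPISODES: positiveInt(env, "RECAP_EPISODES", 4),
    MEM_FILE: env["MEM_FILE"] ?? "m.json",
  };
}

const defaults = loadConfig({});
const tuned = loadConfig({ FOLD_BUDGET: "12000", MAX_DEPTH: "3" });
```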
llm.ts the processor (one LLM call = one step) — the only stochastic seam
frame.ts a running context; the step loop, op dispatch, invoke/return frames; World
memory.ts M — durable key→value store + search/forget (mirrored to m.json)
terminal.ts E — the terminal channel's input side
exec.ts the deterministic coprocessor (sandboxed JS) + code-routine marker
tools.ts the 10 primitives, as native tool definitions
system.ts the resident system text (the agent + how to use the machine)
render.ts pure formatters → ANSI strings (no I/O, no UI deps)
ui-store.ts bridge: machine logic pushes lines here; the React app subscribes
app.tsx the Ink (React) split-screen UI — the only module that knows Ink
index.ts wiring: world + frame 0 + the run loop + render(<App/>)
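The bridge role described for `ui-store.ts` (machine logic pushes lines, the UI subscribes) is a small publish/subscribe store. The sketch below is one way such a bridge could look, not the module's actual code: `createLineStore` and its method names are hypothetical. The `subscribe` / `getSnapshot` pair is shaped so it could be handed to React's `useSyncExternalStore`, though whether the repo does that is an assumption.

```typescript
// Hypothetical sketch of a line store bridging machine logic and a UI.
type Listener = () => void;

function createLineStore() {
  let lines: readonly string[] = [];
  const listeners = new Set<Listener>();
  return {
    // Called by machine logic. A new array per push lets subscribers
    // detect change by identity.
    push(line: string): void {
      lines = [...lines, line];
      listeners.forEach((notify) => notify());
    },
    // Called by the UI. Returns an unsubscribe function.
    subscribe(listener: Listener): () => void {
      listeners.add(listener);
      return () => { listeners.delete(listener); };
    },
    // Current immutable view of the lines.
    getSnapshot(): readonly string[] {
      return lines;
    },
  };
}

const store = createLineStore();
let notifications = 0;
const unsubscribe = store.subscribe(() => { notifications++; });
const before = store.getSnapshot();
store.push("invoke greet(Anna)");
const after = store.getSnapshot();
unsubscribe();
store.push("greet returned");
```

Keeping the store free of UI imports matches the layout's note that only `app.tsx` knows Ink.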