Kai Agent

An always-on codebase engineer that audits, optimizes, and keeps your code healthy — delivered with proof, not promises.

Quick start · What Kai does · Proof · Run anywhere · Extend · Architecture

/teleport: hand your live session to a cloud sandbox and keep the conversation going from any browser. Watch the full-resolution clip →

What Kai does

Kai is an autonomous AI engineer that runs on your codebase. Talk to it like you'd talk to a senior teammate; it reads your code, picks what matters, runs the work, and comes back with verified results.

Code audits — finds real vulnerabilities and proves them with working exploit code, then proposes verified fixes.
Code optimization — hunts the most expensive functions, runs evolutionary search against a custom fitness function, and ships benchmarked improvements as PRs.
Codebase hygiene — unifies patterns left by different AI tools, removes dead code, fixes naming drift, keeps the codebase coherent as it grows.
PR reviews that run the code — simulates edge cases instead of leaving comments.
Continuous monitoring — schedules recurring work via cron and reports back only when something matters.
Run anywhere — /teleport a session to a cloud sandbox and continue in the browser, or /hand-off a long task to a sandboxed agent while you keep working locally.

Everything Kai ships has been through an isolated verification harness. If it can't prove a finding or confirm a fix, it marks the result as unverified. No confidence theater.

Proof, not promises

Kai's two specialist harnesses don't just describe problems — they produce artifacts you can run and review. Both return a normalized report (harness, status, summary, findings, artifacts, metrics), so a finding always comes with the evidence attached.

Security — security_scan runs the kai-security pipeline: discover → verify → write a PoC exploit → patch. A finding only lands as verified once the exploit actually fires.

// security_scan on ./auth-service  (illustrative output)
{
  "harness": "kai_security",
  "status": "ok",
  "summary": "1 high-severity finding, verified with a working exploit",
  "findings": [{
    "severity": "high",
    "title": "JWT algorithm-confusion auth bypass",
    "location": "auth/verify_token.py:42",
    "verified": true,
    "exploit": "forge_token.py — signs an admin token with the RSA public key as an HMAC secret; the server accepts it",
    "patch": "pin algorithms=['RS256'] in jwt.decode(); reject 'none' and HS* on RSA keys"
  }],
  "artifacts": ["runs/poc/forge_token.py", "runs/patch.diff"]
}
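As a rough sketch of how a caller might consume a report of this shape, the snippet below parses a trimmed copy of the illustrative output and keeps only findings marked verified. The `verified_findings` helper and the second, unverified finding are assumptions made for illustration; they are not part of Kai's API or real scan output.

```python
import json

# Trimmed copy of the illustrative security_scan report shown above.
# The second finding is hypothetical, added only to show the filter at work.
SAMPLE_REPORT = """
{
  "harness": "kai_security",
  "status": "ok",
  "summary": "1 high-severity finding, verified with a working exploit",
  "findings": [
    {"severity": "high", "title": "JWT algorithm-confusion auth bypass",
     "location": "auth/verify_token.py:42", "verified": true},
    {"severity": "low", "title": "hypothetical unproven finding",
     "location": "auth/session.py:7", "verified": false}
  ],
  "artifacts": ["runs/poc/forge_token.py", "runs/patch.diff"]
}
"""


def verified_findings(report: dict) -> list[dict]:
    """Return only findings whose `verified` flag is true.

    Hypothetical helper: a non-"ok" status or missing keys yield an empty list.
    """
    if report.get("status") != "ok":
        return []
    return [f for f in report.get("findings", []) if f.get("verified") is True]


report = json.loads(SAMPLE_REPORT)
for finding in verified_findings(report):
    print(f"[{finding['severity']}] {finding['title']} at {finding['location']}")
```

Filtering on the `verified` flag rather than on severity mirrors the rule stated above: a finding counts only once its exploit has fired.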

Optimization — evolve_optimize runs AlphaEvolve-style evolutionary search (mutate → evaluate → select) against a fitness function you define, and returns the best program it found plus the metrics that beat the baseline.

// evolve_optimize on a hot path, scored by your evaluator  (illustrative output)
{
  "harness": "kai_evolve",
  "status": "ok",
  "summary": "best of 40 candidates over 12 generations",
  "metrics": { "score": 0.991, "p50_latency_ms": 31.4, "baseline_latency_ms": 88.7 },
  "artifacts": ["runs/best/best_program.py"]
}
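The mutate → evaluate → select loop can be illustrated with a deliberately small sketch. It is not kai-evolve's implementation: it swaps program mutation for random numeric perturbation of a single value, and the `evolve`, `toy_fitness`, and `toy_mutate` names are hypothetical.

```python
import random


def evolve(fitness, start, mutate, generations=12, population=8, rng=None):
    """Mutate -> evaluate -> select loop; returns (best_candidate, best_score)."""
    rng = rng or random.Random(0)
    best, best_score = start, fitness(start)
    for _ in range(generations):
        # Mutate: derive new candidates from the current best.
        candidates = [mutate(best, rng) for _ in range(population)]
        # Evaluate: score every candidate with the caller's fitness function.
        scored = [(fitness(c), c) for c in candidates]
        # Select: replace the incumbent only if a candidate beats it.
        top_score, top = max(scored, key=lambda pair: pair[0])
        if top_score > best_score:
            best, best_score = top, top_score
    return best, best_score


# Toy problem (assumption): tune one number x to maximise -(x - 3)^2.
def toy_fitness(x: float) -> float:
    return -(x - 3.0) ** 2


def toy_mutate(x: float, rng: random.Random) -> float:
    return x + rng.gauss(0.0, 0.5)


baseline = toy_fitness(0.0)
best, score = evolve(toy_fitness, start=0.0, mutate=toy_mutate)
print(f"baseline={baseline:.3f} best={score:.3f} x={best:.3f}")
```

Because the incumbent is replaced only when a candidate scores higher, the returned score can never be worse than the baseline, which is the property the `metrics` block above reports against.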

Numbers depend entirely on your repo and fitness function — the point is that every result is backed by a runnable artifact (a PoC, a patch diff, an evolved program), not a confidence score. See Extend Kai to enable these harnesses.

Quick start

Requirements: Python 3.12+ and an LLM API key (OpenRouter / Anthropic / OpenAI). Clone with submodules — the terminal tool depends on the mini-swe-agent submodule:

git clone --recurse-submodules https://github.com/firstbatchxyz/kai-agent.git
cd kai-agent
# already cloned without --recurse-submodules? run: git submodule update --init

Local mode — run entirely on your own machine (recommended for self-host)

No account or backend needed; bring your own LLM key. Use either the web UI or the terminal CLI:

pip install -e ".[ui]"     # FastAPI + the web UI extras
kai ui                     # opens http://127.0.0.1:3001 — walks you through setup

kai ui forces local mode (KAI_LOCAL_MODE=1): your key, tokens, and data live in ~/.kai-agent/ and nothing is sent to a Kai backend. See ui/README.md for make run / make dev and the local architecture. For the terminal REPL instead:

pip install -e ".[all]"
cp .env.example .env       # set OPENROUTER_API_KEY (or ANTHROPIC_API_KEY / OPENAI_API_KEY)
cp cli-config.yaml.example cli-config.yaml
python cli.py

Single-shot query instead of the interactive REPL:

python cli.py --query="Audit my top repo for exploits. Lead with the highest-impact finding."

Hosted mode (optional)

To run against the managed Kai platform (hosted audits, evolution, etc.), additionally set a KAI_JWT_TOKEN from the Kai platform in .env. The JWT is only needed for hosted features — local mode above never requires it.

More providers (z.ai / GLM, Kimi, MiniMax, Nous Portal) are configurable in cli-config.yaml. Terminal execution can run locally, in Docker, on Modal, or over SSH — see .env.example.

How it talks

It speaks in findings, not capabilities. After a brief intro, no more self-description.
It's concrete. "142K LOC across 3 services, 12 vulnerable deps, 47 changes to payment-service in 30 days" beats "I can see your repos."
It ranks by impact × confidence × visibility. The first thing it surfaces is the most impressive thing it can prove.
Read-only work (browsing repos, listing resources, checking infrastructure) happens without asking. Anything that costs credits or modifies external systems waits for your confirmation.
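The ranking rule above can be sketched as a simple product score. The 0 to 1 scales, the `Finding` class, and the sample entries are assumptions for illustration; the source does not say how Kai actually estimates each factor.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    title: str
    impact: float       # assumed 0..1 scale
    confidence: float   # assumed 0..1 scale
    visibility: float   # assumed 0..1 scale

    @property
    def priority(self) -> float:
        # Product of the three factors: impact × confidence × visibility.
        return self.impact * self.confidence * self.visibility


def rank(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest priority."""
    return sorted(findings, key=lambda f: f.priority, reverse=True)


# Hypothetical sample findings, not real output.
findings = [
    Finding("unused helper module", impact=0.2, confidence=0.9, visibility=0.3),
    Finding("auth bypass in token check", impact=0.9, confidence=0.8, visibility=0.9),
    Finding("slow report query", impact=0.6, confidence=0.5, visibility=0.7),
]
for f in rank(findings):
    print(f"{f.priority:.3f}  {f.title}")
```

A product, unlike a sum, pushes a finding to the bottom when any one factor is near zero, so a high-impact but unproven issue does not lead the report.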

Run anywhere — `/teleport` & `/hand-off`

Kai's session isn't tied to your terminal. Two commands move the work to a cloud sandbox over a provider-agnostic control plane:

/teleport — copies your workspace and the live conversation to a sandbox, then resumes the same session in your browser at the sandbox URL (that's the demo at the top). Close the laptop; pick it back up from the web. Your local TUI stays attached and can end the sandbox at any time.
/hand-off — dispatches a full autonomous Kai into a sandbox as a background sub-agent. Keep talking to your local session while it works; it reports back when finished. Check status with /handoffs.

Pick the backend with KAI_SANDBOX_PROVIDER:

Provider	Value	Notes
E2B	`e2b` (default)	hosted micro-VM sandboxes
Vercel Sandbox	`vercel`	ephemeral sandboxes on Vercel
Local	`local`	a working dir on your machine — no cloud, great for testing

KAI_SANDBOX_PROVIDER=local python cli.py   # then: /teleport  or  /hand-off "investigate TODO.md and report back"
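A minimal sketch of resolving the backend from the environment, assuming the three values in the table and `e2b` as the default. The `resolve_sandbox_provider` helper and its whitespace and case normalisation are hypothetical; Kai's own lookup may behave differently.

```python
import os

# Values taken from the provider table above; "e2b" is the documented default.
KNOWN_PROVIDERS = ("e2b", "vercel", "local")
DEFAULT_PROVIDER = "e2b"


def resolve_sandbox_provider(env=None) -> str:
    """Read KAI_SANDBOX_PROVIDER, falling back to the default when unset.

    Hypothetical helper; trimming and lower-casing the value is an assumption.
    """
    env = os.environ if env is None else env
    value = env.get("KAI_SANDBOX_PROVIDER", "").strip().lower() or DEFAULT_PROVIDER
    if value not in KNOWN_PROVIDERS:
        raise ValueError(
            f"unknown KAI_SANDBOX_PROVIDER {value!r}; expected one of {KNOWN_PROVIDERS}"
        )
    return value


print(resolve_sandbox_provider({}))                                  # e2b
print(resolve_sandbox_provider({"KAI_SANDBOX_PROVIDER": "local"}))   # local
```

Failing loudly on an unknown value avoids silently provisioning a cloud sandbox when a typo was meant to select `local`.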

What Kai can reach

Tools

Kai has two sets of tools:

Built-in: always available, run inside the agent process. These cover web search (Firecrawl), a full terminal (local / Docker / Modal / SSH), file operations, browser automation, session memory, skill management, cron scheduling, delegation, image generation, TTS, and code execution.

Kai platform (MCP): tools the backend exposes over the Model Context Protocol. A small core is always on (workspaces, repos, lifecycle actions, agent data); the rest are grouped into categories and activated on demand:

audit — start audits, list vulnerabilities, generate exploits
optimization — run evolutions, manage evaluators, monitor progress
repo_management — add GitHub repos, upload local code
budget — credits, billing, subscription usage
integrations — create GitHub issues, Jira tickets
artifacts — generate and share security reports

Skills

Skills are reusable workflow playbooks Kai can load mid-session. Out of the box Kai ships with skills for audit workflows, optimization runs, evaluator writing, vulnerability triage, continuous monitoring, serverless optimization, and more. Kai can also save new workflows it discovers as skills, so it gets better at working in your codebase over time.

Browse skills/ for the full catalog.

Integrations

Platform	Transport	Use case
CLI	interactive REPL or `--query`	local dev, scripting
Slack	Socket Mode / Events API	team workspace, shared assistant
Discord	Gateway	community research bot
Telegram	Bot API	personal assistant
Email	IMAP / SMTP	async reports

Gateway adapters live under gateway/platforms/.

Extend Kai — plugins & sub-harnesses

Kai has a small hook-based plugin system (the OSS extensibility seam). A plugin is a directory with a plugin.yaml and an __init__.py that registers tools and/or lifecycle hooks — pre_tool_call, post_tool_call, transform_tool_result, on_session_start, on_session_end. Drop one into ~/.kai-agent/plugins/ (or the bundled plugins/) and it's discovered at startup. plugins/example_echo/ is a minimal template.
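To make the hook lifecycle concrete, here is a self-contained sketch of how hooks like these could wrap a tool call. Only the five hook names come from the text above; the `HookRegistry` class, its `register` and `call_tool` methods, the hook argument shapes, and the order in which transform and post hooks run are all assumptions, not Kai's plugin API. See plugins/example_echo/ for the real template.

```python
from collections import defaultdict
from typing import Any, Callable

# Hook names as listed above; everything else is a hypothetical stand-in.
HOOKS = ("pre_tool_call", "post_tool_call", "transform_tool_result",
         "on_session_start", "on_session_end")


class HookRegistry:
    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[..., Any]]] = defaultdict(list)

    def register(self, name: str, fn: Callable[..., Any]) -> None:
        if name not in HOOKS:
            raise ValueError(f"unknown hook: {name}")
        self._hooks[name].append(fn)

    def call_tool(self, tool: Callable[..., Any], **kwargs: Any) -> Any:
        """Run a tool with pre, transform, and post hooks around it."""
        for fn in self._hooks["pre_tool_call"]:
            fn(tool.__name__, kwargs)
        result = tool(**kwargs)
        # Assumed order: transforms rewrite the result before post hooks see it.
        for fn in self._hooks["transform_tool_result"]:
            result = fn(tool.__name__, result)
        for fn in self._hooks["post_tool_call"]:
            fn(tool.__name__, result)
        return result


def echo(text: str) -> str:
    return text


calls: list[str] = []
registry = HookRegistry()
registry.register("pre_tool_call", lambda name, args: calls.append(f"pre:{name}"))
registry.register("transform_tool_result", lambda name, result: result.upper())
registry.register("post_tool_call", lambda name, result: calls.append(f"post:{name}"))

print(registry.call_tool(echo, text="hello"))  # HELLO
print(calls)                                   # ['pre:echo', 'post:echo']
```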

Two reference plugins ship in plugins/ — Kai's specialist harnesses. Each installs into its own isolated uv venv, so their heavy dependencies never touch the agent's environment, and each can run locally or in a remote sandbox through the same control plane as /hand-off:

kai-security (security_scan) — the security pipeline behind the example above: vulnerability discovery → verification → PoC exploit → patch.
kai-evolve (evolve_optimize) — AlphaEvolve-style evolutionary optimization against a fitness function, returning the best evolved program.

Enable them in ~/.kai-agent/config.yaml:

plugins:
  enabled: [kai_security, kai_evolve]

Then start with the matching toolset:

python cli.py --toolsets security    # security_scan available
python cli.py --toolsets evolve      # evolve_optimize available

Architecture

┌────────────────────────────────────────────────────────────┐
│                        Kai Agent                            │
│                                                             │
│   ┌────────────┐      ┌────────────────────────────────┐   │
│   │ Persona +  │      │  Tool registry                 │   │
│   │ guidance   │──────▶  built-in: web, terminal,       │   │
│   │ blocks     │      │  file, browser, memory, cron,   │   │
│   └────────────┘      │  execute_code, delegate …       │   │
│          │            │                                 │   │
│          ▼            │  MCP (kai_*): workspaces, repos,│   │
│   ┌────────────┐      │  audit, optimization, billing, │   │
│   │  Skills    │─────▶│  integrations, artifacts       │   │
│   │ (loaded on │      └────────────────────────────────┘   │
│   │  demand)   │                                            │
│   └────────────┘                                            │
└──────────────┬─────────────────────────────────────────────┘
               │
      ┌────────┼────────┐
      ▼        ▼        ▼
   CLI     Slack     Other gateways

run_agent.py — the AIAgent class; conversation loop, tool-call dispatch, context management.
agent/prompt_builder.py — Kai's identity and the guidance blocks that shape behavior.
tools/ — tool implementations + central registry.
toolsets.py — which tools belong to which toolset, filtering logic.
skills/ — workflow playbooks (each is a directory with a SKILL.md).
plugins/ — hook-based plugins + the bundled sub-harnesses (kai-security, kai-evolve).
deploy/sandbox/ — the provider-agnostic sandbox control plane behind /teleport and /hand-off.
gateway/ — messaging-platform adapters.
deploy/e2b-template/ — the E2B sandbox template Kai runs in when hosted.

For a deeper dive, see docs/AGENTS.md, a contributor-facing tour of the codebase.

Configuration

The essentials live in two files:

.env — secrets and provider keys. Copy .env.example.
cli-config.yaml — model defaults, provider routing, terminal backend, display preferences. Copy cli-config.yaml.example.

User-level settings (MCP server URLs, saved sessions, skills, memory, enabled plugins) live in ~/.kai-agent/. The CLI writes to this directory, so these settings persist across updates.

Contributing

See CONTRIBUTING.md for the dev setup, what to build (and what not to), and the PR process. Priorities: bug fixes first, then cross-platform compatibility and security hardening, then skills. New tools are rare — most capabilities should be skills.

Run the test suite:

pytest

License

MIT. See LICENSE.

Kai Agent was originally forked from Hermes Agent by Nous Research (MIT). The core conversation loop and tool-dispatch architecture trace back to that project; Kai has since grown its own identity, skill system, MCP integration with the Kai platform, and domain-specific toolset.
