firstbatchxyz/kai


Kai Agent

An always-on codebase engineer that audits, optimizes, and keeps your code healthy — delivered with proof, not promises.

License: MIT · Python 3.12+ · Discord

Quick start · What Kai does · Proof · Run anywhere · Extend · Architecture


/teleport — hand your live session to a cloud sandbox and keep the conversation going from any browser. Watch the full-resolution clip →


What Kai does

Kai is an autonomous AI engineer that runs on your codebase. Talk to it like you'd talk to a senior teammate; it reads your code, picks what matters, runs the work, and comes back with verified results.

  • Code audits — finds real vulnerabilities and proves them with working exploit code, then proposes verified fixes.
  • Code optimization — hunts the most expensive functions, runs evolutionary search against a custom fitness function, and ships benchmarked improvements as PRs.
  • Codebase hygiene — unifies patterns left by different AI tools, removes dead code, fixes naming drift, keeps the codebase coherent as it grows.
  • PR reviews that run the code — simulates edge cases instead of leaving comments.
  • Continuous monitoring — schedules recurring work via cron and reports back only when something matters.
  • Run anywhere: /teleport a session to a cloud sandbox and continue in the browser, or /hand-off a long task to a sandboxed agent while you keep working locally.

Everything Kai ships has been through an isolated verification harness. If it can't prove a finding or confirm a fix, it marks the result as unverified. No confidence theater.

Proof, not promises

Kai's two specialist harnesses don't just describe problems — they produce artifacts you can run and review. Both return a normalized report (harness, status, summary, findings, artifacts, metrics), so a finding always comes with the evidence attached.
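As a rough illustration of the report shape described above, the six fields could be modeled like this. This is a sketch, not Kai's actual code: the `HarnessReport` class, its method names, and the assumption that findings carry a boolean `verified` key are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class HarnessReport:
    """Hypothetical container mirroring the six report fields named above."""

    harness: str
    status: str
    summary: str
    findings: list[dict[str, Any]] = field(default_factory=list)
    artifacts: list[str] = field(default_factory=list)
    metrics: dict[str, float] = field(default_factory=dict)

    @classmethod
    def from_dict(cls, raw: dict[str, Any]) -> "HarnessReport":
        # Assumption: the three list/dict sections may be absent, so default them to empty.
        return cls(
            harness=raw["harness"],
            status=raw["status"],
            summary=raw["summary"],
            findings=list(raw.get("findings", [])),
            artifacts=list(raw.get("artifacts", [])),
            metrics=dict(raw.get("metrics", {})),
        )

    def verified_findings(self) -> list[dict[str, Any]]:
        # Keep only findings explicitly marked verified; a missing flag counts as unverified.
        return [f for f in self.findings if f.get("verified") is True]
```

Treating a missing `verified` flag as unverified matches the stated policy that anything Kai cannot prove is reported as unverified.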

Security — security_scan runs the kai-security pipeline: discover → verify → write a PoC exploit → patch. A finding only lands as verified once the exploit actually fires.

// security_scan on ./auth-service  (illustrative output)
{
  "harness": "kai_security",
  "status": "ok",
  "summary": "1 high-severity finding, verified with a working exploit",
  "findings": [{
    "severity": "high",
    "title": "JWT algorithm-confusion auth bypass",
    "location": "auth/verify_token.py:42",
    "verified": true,
    "exploit": "forge_token.py — signs an admin token with the RSA public key as an HMAC secret; the server accepts it",
    "patch": "pin algorithms=['RS256'] in jwt.decode(); reject 'none' and HS* on RSA keys"
  }],
  "artifacts": ["runs/poc/forge_token.py", "runs/patch.diff"]
}
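To make the patch idea in the illustrative finding concrete, here is a minimal standard-library sketch of algorithm pinning: read the `alg` a token declares and refuse anything outside an allow-list. The helper names are hypothetical, and this is only a pre-check, not signature verification; a real fix would pass a pinned algorithm list to the JWT library's decode call, as the finding suggests.

```python
import base64
import json


def jwt_header_alg(token: str) -> str:
    """Return the `alg` value from a compact JWT's header without verifying it."""
    header_b64 = token.split(".")[0]
    # JWTs use unpadded base64url; restore padding before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return str(header.get("alg", ""))


def is_alg_allowed(token: str, allowed: tuple[str, ...] = ("RS256",)) -> bool:
    """Reject tokens whose declared algorithm is outside the pinned allow-list."""
    return jwt_header_alg(token) in allowed


def _b64url(data: dict) -> str:
    # Helper for building demo tokens: JSON, then unpadded base64url.
    raw = json.dumps(data).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Unsigned demo tokens: only the header matters for this check.
rs256_token = f"{_b64url({'alg': 'RS256', 'typ': 'JWT'})}.{_b64url({'sub': 'u1'})}.sig"
hs256_token = f"{_b64url({'alg': 'HS256', 'typ': 'JWT'})}.{_b64url({'sub': 'admin'})}.sig"
none_token = f"{_b64url({'alg': 'none'})}.{_b64url({'sub': 'admin'})}."

print(is_alg_allowed(rs256_token), is_alg_allowed(hs256_token), is_alg_allowed(none_token))
```

The point of the allow-list is that the server, not the token, decides which algorithm is acceptable, which is what defeats the HS256-with-public-key forgery described in the finding.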

Optimization — evolve_optimize runs AlphaEvolve-style evolutionary search (mutate → evaluate → select) against a fitness function you define, and returns the best program it found plus the metrics that beat the baseline.

// evolve_optimize on a hot path, scored by your evaluator  (illustrative output)
{
  "harness": "kai_evolve",
  "status": "ok",
  "summary": "best of 40 candidates over 12 generations",
  "metrics": { "score": 0.991, "p50_latency_ms": 31.4, "baseline_latency_ms": 88.7 },
  "artifacts": ["runs/best/best_program.py"]
}

Numbers depend entirely on your repo and fitness function — the point is that every result is backed by a runnable artifact (a PoC, a patch diff, an evolved program), not a confidence score. See Extend Kai to enable these harnesses.

Quick start

Requirements: Python 3.12+ and an LLM API key (OpenRouter / Anthropic / OpenAI). Clone with submodules — the terminal tool depends on the mini-swe-agent submodule:

git clone --recurse-submodules https://github.com/firstbatchxyz/kai-agent.git
cd kai-agent
# already cloned without --recurse-submodules? run: git submodule update --init

Local mode — run entirely on your own machine (recommended for self-host)

No account or backend is needed; bring your own LLM key. Use either the web UI or the terminal CLI:

pip install -e ".[ui]"     # FastAPI + the web UI extras
kai ui                     # opens http://127.0.0.1:3001 — walks you through setup

kai ui forces local mode (KAI_LOCAL_MODE=1): your key, tokens, and data live in ~/.kai-agent/ and nothing is sent to a Kai backend. See ui/README.md for make run / make dev and the local architecture. For the terminal REPL instead:

pip install -e ".[all]"
cp .env.example .env       # set OPENROUTER_API_KEY (or ANTHROPIC_API_KEY / OPENAI_API_KEY)
cp cli-config.yaml.example cli-config.yaml
python cli.py

Single-shot query instead of the interactive REPL:

python cli.py --query="Audit my top repo for exploits. Lead with the highest-impact finding."

Hosted mode (optional)

To run against the managed Kai platform (hosted audits, evolution, etc.), additionally set a KAI_JWT_TOKEN from the Kai platform in .env. The JWT is only needed for hosted features — local mode above never requires it.

More providers (z.ai / GLM, Kimi, MiniMax, Nous Portal) are configurable in cli-config.yaml. Terminal execution can run locally, in Docker, on Modal, or over SSH — see .env.example.

How it talks

  • It speaks in findings, not capabilities. After a brief intro, no more self-description.
  • It's concrete. "142K LOC across 3 services, 12 vulnerable deps, 47 changes to payment-service in 30 days" beats "I can see your repos."
  • It ranks by impact × confidence × visibility. The first thing it surfaces is the most impressive thing it can prove.
  • Read-only work (browsing repos, listing resources, checking infrastructure) happens without asking. Anything that costs credits or modifies external systems waits for your confirmation.
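The ranking rule in the list above (impact × confidence × visibility) could look something like the following. The `Finding` class, the 0.0 to 1.0 scales, and the sample values are assumptions made for illustration; the README does not say how Kai represents or scores findings internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """Hypothetical finding with three factors, each assumed to lie in 0.0..1.0."""

    title: str
    impact: float
    confidence: float
    visibility: float

    @property
    def score(self) -> float:
        # The product means a weak factor drags the whole score down.
        return self.impact * self.confidence * self.visibility


def rank(findings: list[Finding]) -> list[Finding]:
    """Return findings ordered from highest to lowest combined score."""
    return sorted(findings, key=lambda f: f.score, reverse=True)


# Made-up sample data, for illustration only.
sample = [
    Finding("speculative RCE", impact=1.0, confidence=0.2, visibility=1.0),
    Finding("auth bypass", impact=0.7, confidence=0.9, visibility=0.9),
    Finding("slow endpoint", impact=0.9, confidence=0.5, visibility=0.8),
]
ranked = rank(sample)
print([f.title for f in ranked])
```

Multiplying rather than adding is what pushes a high-impact but barely provable item below a solid, well-evidenced one.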

Run anywhere — /teleport & /hand-off

Kai's session isn't tied to your terminal. Two commands move the work to a cloud sandbox over a provider-agnostic control plane:

  • /teleport — copies your workspace and the live conversation to a sandbox, then resumes the same session in your browser at the sandbox URL (that's the demo at the top). Close the laptop; pick it back up from the web. Your local TUI stays attached and can end the sandbox at any time.
  • /hand-off — dispatches a full autonomous Kai into a sandbox as a background sub-agent. Keep talking to your local session while it works; it reports back when finished. Check status with /handoffs.

Pick the backend with KAI_SANDBOX_PROVIDER:

| Provider | Value | Notes |
|---|---|---|
| E2B | e2b (default) | hosted micro-VM sandboxes |
| Vercel Sandbox | vercel | ephemeral sandboxes on Vercel |
| Local | local | a working dir on your machine; no cloud, great for testing |

KAI_SANDBOX_PROVIDER=local python cli.py   # then: /teleport  or  /hand-off "investigate TODO.md and report back"

What Kai can reach

Tools

Kai has two sets of tools:

Built-in: always available and run inside the agent process. They cover web search (Firecrawl), a full terminal (local / Docker / Modal / SSH), file operations, browser automation, session memory, skill management, cron scheduling, delegation, image generation, TTS, and code execution.

Kai platform (MCP): tools the backend exposes over the Model Context Protocol. A small core is always on (workspaces, repos, lifecycle actions, agent data); the rest are grouped into categories and activated on demand:

  • audit — start audits, list vulnerabilities, generate exploits
  • optimization — run evolutions, manage evaluators, monitor progress
  • repo_management — add GitHub repos, upload local code
  • budget — credits, billing, subscription usage
  • integrations — create GitHub issues, Jira tickets
  • artifacts — generate and share security reports
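The "always-on core plus categories activated on demand" idea can be pictured with a small catalog object. This is a conceptual sketch only: `ToolCatalog` and the tool names in the demo are hypothetical and do not correspond to the real MCP tool identifiers.

```python
from collections.abc import Iterable, Mapping


class ToolCatalog:
    """Hypothetical catalog: core tools are always visible, categories opt in."""

    def __init__(self, core: Iterable[str], categories: Mapping[str, Iterable[str]]) -> None:
        self._core = set(core)
        self._categories = {name: set(tools) for name, tools in categories.items()}
        self._active: set[str] = set()

    def activate(self, category: str) -> None:
        # Unknown categories fail loudly instead of silently exposing nothing.
        if category not in self._categories:
            raise KeyError(category)
        self._active.add(category)

    def available(self) -> set[str]:
        # Start from the core, then add the tools of every activated category.
        tools = set(self._core)
        for name in self._active:
            tools |= self._categories[name]
        return tools


# Made-up tool names, for illustration only.
catalog = ToolCatalog(
    core=["list_repos"],
    categories={"audit": ["start_audit"], "budget": ["get_credits"]},
)
before = catalog.available()
catalog.activate("audit")
after = catalog.available()
print(sorted(before), sorted(after))
```

Keeping inactive categories out of the available set is one way to keep the tool list presented to the model short until a task actually needs more.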

Skills

Skills are reusable workflow playbooks Kai can load mid-session. Out of the box Kai ships with skills for audit workflows, optimization runs, evaluator writing, vulnerability triage, continuous monitoring, serverless optimization, and more. Kai can also save new workflows it discovers as skills, so it improves at your codebase over time.

Browse skills/ for the full catalog.

Integrations

| Platform | Transport | Use case |
|---|---|---|
| CLI | interactive REPL or --query | local dev, scripting |
| Slack | Socket Mode / Events API | team workspace, shared assistant |
| Discord | Gateway | community research bot |
| Telegram | Bot API | personal assistant |
| Email | IMAP / SMTP | async reports |

Gateway adapters live under gateway/platforms/.

Extend Kai — plugins & sub-harnesses

Kai has a small hook-based plugin system (the OSS extensibility seam). A plugin is a directory with a plugin.yaml and an __init__.py that registers tools and/or lifecycle hooks — pre_tool_call, post_tool_call, transform_tool_result, on_session_start, on_session_end. Drop one into ~/.kai-agent/plugins/ (or the bundled plugins/) and it's discovered at startup. plugins/example_echo/ is a minimal template.
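To give a feel for how hook registration might work, here is a self-contained sketch. Only the five hook names come from the paragraph above; `HookRegistry`, its methods, and the `register` entry point are hypothetical stand-ins, so treat plugins/example_echo/ as the authoritative template for the real interface.

```python
from collections import defaultdict
from typing import Any, Callable

HOOK_NAMES = (
    "pre_tool_call",
    "post_tool_call",
    "transform_tool_result",
    "on_session_start",
    "on_session_end",
)


class HookRegistry:
    """Hypothetical stand-in for whatever object Kai hands to a plugin."""

    def __init__(self) -> None:
        self._hooks: dict[str, list[Callable[..., Any]]] = defaultdict(list)

    def on(self, name: str, fn: Callable[..., Any]) -> None:
        if name not in HOOK_NAMES:
            raise ValueError(f"unknown hook: {name}")
        self._hooks[name].append(fn)

    def transform(self, tool: str, result: str) -> str:
        # Chain every transform_tool_result hook, feeding each the previous output.
        for fn in self._hooks["transform_tool_result"]:
            result = fn(tool, result)
        return result


def truncate_result(tool: str, result: str, limit: int = 20) -> str:
    # Example hook: cap long tool output and mark the cut with an ellipsis.
    return result if len(result) <= limit else result[:limit] + "..."


def register(registry: HookRegistry) -> None:
    """What a plugin's __init__.py might do: attach one hook."""
    registry.on("transform_tool_result", truncate_result)


registry = HookRegistry()
register(registry)
print(registry.transform("terminal", "x" * 30))
```

A chained transform hook like this is a natural place for redaction or truncation, since it sees each tool result before the agent does.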

Two reference plugins ship in plugins/ — Kai's specialist harnesses. Each installs into its own isolated uv venv, so their heavy dependencies never touch the agent's environment, and each can run locally or in a remote sandbox through the same control plane as /hand-off:

  • kai-security (security_scan) — the security pipeline behind the example above: vulnerability discovery → verification → PoC exploit → patch.
  • kai-evolve (evolve_optimize) — AlphaEvolve-style evolutionary optimization against a fitness function, returning the best evolved program.

Enable them in ~/.kai-agent/config.yaml:

plugins:
  enabled: [kai_security, kai_evolve]

Then start with the matching toolset:

python cli.py --toolsets security    # security_scan available
python cli.py --toolsets evolve      # evolve_optimize available

Architecture

┌────────────────────────────────────────────────────────────┐
│                        Kai Agent                            │
│                                                             │
│   ┌────────────┐      ┌────────────────────────────────┐   │
│   │ Persona +  │      │  Tool registry                 │   │
│   │ guidance   │──────▶  built-in: web, terminal,       │   │
│   │ blocks     │      │  file, browser, memory, cron,   │   │
│   └────────────┘      │  execute_code, delegate …       │   │
│          │            │                                 │   │
│          ▼            │  MCP (kai_*): workspaces, repos,│   │
│   ┌────────────┐      │  audit, optimization, billing, │   │
│   │  Skills    │─────▶│  integrations, artifacts       │   │
│   │ (loaded on │      └────────────────────────────────┘   │
│   │  demand)   │                                            │
│   └────────────┘                                            │
└──────────────┬─────────────────────────────────────────────┘
               │
      ┌────────┼────────┐
      ▼        ▼        ▼
   CLI     Slack     Other gateways

  • run_agent.py: the AIAgent class; conversation loop, tool-call dispatch, context management.
  • agent/prompt_builder.py: Kai's identity and the guidance blocks that shape behavior.
  • tools/: tool implementations + central registry.
  • toolsets.py: which tools belong to which toolset, filtering logic.
  • skills/: workflow playbooks (each is a directory with a SKILL.md).
  • plugins/: hook-based plugins + the bundled sub-harnesses (kai-security, kai-evolve).
  • deploy/sandbox/: the provider-agnostic sandbox control plane behind /teleport and /hand-off.
  • gateway/: messaging-platform adapters.
  • deploy/e2b-template/: the E2B sandbox template Kai runs in when hosted.

Deeper dive: docs/AGENTS.md for a contributor-facing tour of the codebase.

Configuration

The essentials live in two files:

  • .env — secrets and provider keys. Copy .env.example.
  • cli-config.yaml — model defaults, provider routing, terminal backend, display preferences. Copy cli-config.yaml.example.

User-level settings (MCP server URLs, saved sessions, skills, memory, enabled plugins) live in ~/.kai-agent/. The CLI writes to this location, so it persists across updates.

Contributing

See CONTRIBUTING.md for the dev setup, what to build (and what not to), and the PR process. Priorities: bug fixes first, then cross-platform compatibility and security hardening, then skills. New tools are rare — most capabilities should be skills.

Run the test suite:

pytest

License

MIT. See LICENSE.


Kai Agent was originally forked from Hermes Agent by Nous Research (MIT). The core conversation loop and tool-dispatch architecture trace back to that project; Kai has since grown its own identity, skill system, MCP integration with the Kai platform, and domain-specific toolset.

About

Open-source autonomous coding agent that works your codebase end-to-end leveraging multiple sub-harnesses for codebase security, code & algorithm optimization, and more. Run it locally or in sandboxes in one command.
