A multi-model consensus coding assistant for your terminal.
Alpha software. Polycode is under active development and things will break. APIs, config formats, and behavior may change between releases. Bug reports and feedback are welcome — but don't rely on it for production workflows yet.
Polycode queries multiple LLMs in parallel — Anthropic Claude, OpenAI, Google Gemini, local models, or any OpenAI-compatible endpoint — and synthesizes their responses into a single authoritative answer through a designated primary model. Think of it as Claude Code, Codex, or opencode, but backed by the collective intelligence of every AI model you have access to.
Every AI model has blind spots. Claude might excel at architecture, GPT at debugging, Gemini at specific frameworks. Today you pick one and hope it's right. Polycode eliminates the trade-off: ask once, get answers from all of them, and let your strongest model synthesize the best response.
- Better answers — consensus catches errors that single models miss
- No vendor lock-in — works with any combination of providers
- Same workflow — continuous dialogue with tool execution, just like single-model assistants
- Full visibility — see what each model said before consensus
Query every configured LLM simultaneously. Responses fan out in parallel (latency = slowest provider, not sum of all), and your designated primary model synthesizes them into a single answer — identifying areas of agreement, unique insights, and errors. Providers can read files during fan-out to give codebase-aware answers.
| Provider | Type | Streaming | Tool Use | Reasoning | Auth |
|---|---|---|---|---|---|
| Anthropic Claude | anthropic |
SSE | Function calling | Extended thinking | API key |
| OpenAI (GPT, o-series) | openai |
SSE | Function calling | reasoning_effort | API key |
| Google Gemini | google |
SSE | Function calling | thinkingBudget | API key, OAuth |
| OpenAI-compatible | openai_compatible |
SSE | Function calling | reasoning_effort | API key, none |
OpenAI-compatible covers OpenRouter, Ollama, vLLM, LM Studio, Together AI, and any endpoint that speaks the OpenAI chat completions API.
Built with Bubble Tea and Lip Gloss:
- ASCII art splash screen on launch
- Scrollable chat conversation with full multi-turn dialogue
- Real-time streaming from each provider in labeled panels
- Provider activity traces — each provider tab shows the full participation trace (fan-out, synthesis, tool execution, verification) with labeled phase boundaries
- Consensus output panel with visual emphasis
- Command palette — type
/to see all commands with descriptions, filter by typing, Tab to accept - Input history — Up/Down arrows cycle through previously submitted prompts
- Consensus provenance panel — press
pto see confidence, agreements, minority reports, and routing decisions - Status bar with provider health indicators, token usage, and per-provider cost
- Toggle between consensus-only and all-panels view with
Tab
No need to edit YAML — configure everything from within the TUI:
Ctrl+Sopens the settings screen- Add providers with a step-by-step wizard (type, name, auth, model, primary)
- Edit existing providers inline
- Delete providers with confirmation (primary removal blocked until reassigned)
- Test provider connections with a single keypress
- Changes apply immediately — no restart required
Real-time visibility into how much context and budget you're consuming:
- Per-provider token counts displayed in the tab bar (
12.4K/200K) - Estimated cost per provider from litellm pricing data (e.g.,
$0.12) - Color-coded warnings at 80% (amber) and 95% (red) of context limits
- Automatic provider exclusion when context limit is reached
- Context auto-summarization — when nearing 80% of context limit, early conversation turns are compressed into a summary to free tokens
- Consensus synthesis usage tracked separately
Polycode fetches model information from litellm's model database — a community-maintained registry of 1,000+ models with context window sizes, pricing, and capability flags. No manual updates needed when new models ship.
- Cached locally with configurable TTL (default 24h)
- Three-tier limit resolution: config override > litellm data > built-in fallback
- Works offline using cached or hardcoded data
- Capability awareness: function calling, vision, reasoning support
- Pricing data used for per-provider cost estimation (
input_cost_per_token,output_cost_per_token) - Endpoint discovery for OpenAI-compatible providers: queries
GET /modelson the configured base URL to list available models, cross-referenced with litellm for capability metadata
The consensus output can drive coding actions, just like single-model assistants:
- File read — read file contents with optional line ranges (
start_line/end_line). Directories return a listing. No confirmation needed. - File edit — targeted search-and-replace with unified diff preview. Models send just the changed text instead of rewriting whole files.
- File write — create or overwrite files with unified diff preview, user confirms before applying
- File delete — delete files or empty directories with confirmation. Non-recursive by design.
- File rename — move or rename files with confirmation. Prevents accidental overwrites.
- File info — get file metadata (size, line count, text/binary type) without reading contents. Available during fan-out.
- Find files — glob-based file search (e.g.,
**/*_test.go). Returns paths only. Available during fan-out. - List directory — list directory contents, optionally recursive (up to 3 levels). Available during fan-out.
- Grep search — regex/text search with context lines, case-insensitive mode, include/exclude filters, files-only mode, configurable result limits. Available during fan-out.
- Shell exec — run commands with confirmation, hardened destructive command detection (
rm,sudo, pipe-to-shell,/dev/paths, clobber operators, and more) - Tool-use loop — the primary model can chain tool calls until done (no iteration limit, bounded by the 5-minute context timeout)
- Auto-verification — after file writes, auto-detected test suites (
go test,npm test,cargo test,pytest,make test) run automatically and report pass/fail
- API keys stored in your OS keyring (macOS Keychain, Linux secret-service)
- Encrypted file fallback when keyring is unavailable
- OAuth 2.0 device flow for providers that support it (Gemini)
- Automatic token refresh — expired OAuth tokens are refreshed using stored refresh tokens before queries fail
- No-auth mode for local models (Ollama, etc.)
Sessions persist across restarts and can be named, listed, and switched:
- Auto-saved after each query with full conversation state
- Named sessions —
/name my-featureto tag the current session - List sessions —
/sessionsshows all saved sessions with exchange counts and timestamps - Session export —
/export [path]writes the session as JSON - Consensus traces — each exchange records a full trace: routing mode, provider latencies, token usage, errors, and synthesis model
- CLI management —
polycode session list|show|deletefor headless workflows
Extend polycode with installable skills that add slash commands, system prompts, and tool definitions:
polycode skill install ./my-skill # install from local directory
polycode skill list # list installed skills
polycode skill remove my-skill # uninstallSkills live in ~/.config/polycode/skills/ with a skill.yaml manifest:
name: git-review
version: "1.0"
description: Automated git diff review
command: review # registers /review slash command
tools:
- name: diff
description: Get the current git diff
handler: git diff --cachedThree canonical example skills are included in examples/skills/:
- git-review —
/reviewfor automated diff review with structured output - test-runner —
/testfor detecting and running project test suites - security-audit —
/auditfor scanning secrets, vulnerable deps, and injection patterns
All modes query every configured provider. The mode controls synthesis depth and reasoning effort:
| Mode | Synthesis | Reasoning |
|---|---|---|
quick |
Concise, direct answer — no structured sections | Low |
balanced |
Structured synthesis — confidence, agreements, minority reports, evidence | Medium |
thorough |
Deep analysis — step-by-step reasoning, trade-offs, verification, alternatives | High |
Switch modes with /mode quick, /mode balanced, or /mode thorough, or open the mode picker from the tab bar. Reasoning effort maps to each provider's native parameter:
| Provider | Low | Medium | High |
|---|---|---|---|
| Anthropic | thinking.budget_tokens: 4096 |
10000 |
32000 |
| OpenAI | reasoning_effort: "low" |
"medium" |
"high" |
| Gemini | thinkingBudget: 4096 |
10000 |
32000 |
Models without reasoning support silently ignore the parameter.
- Hooks — Run shell commands at lifecycle events:
pre_query,post_query,post_tool,on_error - Permissions — Per-tool approval policies (
allow,ask,deny) scoped by repo or user inpermissions.yaml - MCP — Connect to Model Context Protocol servers for external tools, resources, and prompts. Full TUI wizard with curated server registry, auto-reconnect, per-call timeouts, debug logging, and runtime reconfiguration. Manage via
/mcpcommand or settings view. - Repo memory — Persistent notes in
~/.config/polycode/memory/injected into the system prompt - Instructions — Instruction hierarchy:
.polycode/instructions.md>~/.config/polycode/instructions.md> default
| Command | Description |
|---|---|
polycode |
Launch interactive TUI |
polycode init |
Setup wizard |
polycode review [--pr N] |
Review code changes via consensus |
polycode ci --pr N |
Headless PR review for CI pipelines |
polycode serve [--port N] |
Editor bridge HTTP server (default 9876) |
polycode export [--format md|json] |
Export session |
polycode import <file> |
Import session |
polycode mcp list|add|remove|test |
Manage MCP servers |
polycode mcp search <query> |
Search the MCP server registry |
polycode mcp browse |
Browse registry and install a server interactively |
polycode skill list|install|remove |
Manage skills |
polycode session list|show|delete |
Manage saved sessions |
polycode auth login|logout|status |
Manage credentials |
polycode config edit|show|path |
Manage configuration |
- Go 1.21 or later
go install github.com/izzoa/polycode/cmd/polycode@latestgit clone https://github.com/izzoa/polycode.git
cd polycode
go build -o polycode ./cmd/polycode/Polycode includes a GoReleaser config for building binaries for macOS (amd64/arm64), Linux (amd64/arm64), and Windows (amd64/arm64):
goreleaser build --snapshot --cleanpolycode initThe interactive wizard walks you through configuring providers with selectable lists for provider type, auth method, and model. For OpenAI-compatible endpoints (Ollama, vLLM, OpenRouter, etc.), the wizard automatically discovers available models from the server. Other providers show models from litellm's database with capability info. After each provider, the wizard asks if you'd like to add another.
polycodeYou'll see the polycode splash screen, then the chat interface. Start typing and press Enter. Type / to open the command palette with all available commands.
polycode provider addOr press Ctrl+S in the TUI to open settings, then a. The more providers you configure, the better the consensus.
Type / to open the command palette, or type commands directly:
| Command | Action |
|---|---|
/help |
Show keyboard shortcuts and commands |
/clear |
Clear conversation and reset context |
/save |
Save session to disk |
/copy |
Copy active tab's response to clipboard |
/share |
Copy all provider responses to clipboard as labeled markdown |
/export [path] |
Export session as JSON |
/mode <name> |
Switch mode: quick, balanced, thorough |
/plan <request> |
Run multi-model agent team pipeline |
/sessions |
List all saved sessions |
/name <name> |
Name the current session |
/memory |
View repo memory |
/skill [list|install|remove] |
Manage installed skills/plugins |
/mcp [list|status|reconnect|tools|resources|prompts|search|add|remove] |
Manage MCP servers |
/settings |
Open provider settings (+ MCP servers with Tab) |
/yolo |
Toggle auto-approve for all tool actions |
/exit |
Quit polycode |
Configuration lives at ~/.config/polycode/config.yaml (follows XDG base directory spec):
providers:
- name: claude
type: anthropic
auth: api_key
model: claude-sonnet-4-20250514
primary: true # this model synthesizes consensus
- name: gpt4
type: openai
auth: api_key
model: gpt-4o
- name: gemini
type: google
auth: api_key
model: gemini-2.5-pro
- name: deepseek
type: openai_compatible
base_url: https://openrouter.ai/api/v1
auth: api_key
model: deepseek/deepseek-r1
- name: local-llama
type: openai_compatible
base_url: http://localhost:11434/v1
model: llama3
auth: none
consensus:
timeout: 60s # max wait per provider before proceeding
min_responses: 2 # minimum responses needed before synthesizing
verify_command: "" # optional: override auto-detected test command
metadata:
cache_ttl: 24h # how often to refresh litellm model data
# url: https://... # optional: override metadata source URL
tui:
theme: dark
show_individual: true # show individual provider panels by default
mcp:
servers:
- name: filesystem
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "/path"]| Field | Required | Description |
|---|---|---|
name |
Yes | Unique identifier for the provider |
type |
Yes | anthropic, openai, google, or openai_compatible |
model |
Yes | Model identifier (e.g., claude-sonnet-4-20250514, gpt-4o) |
auth |
Yes | api_key, oauth, or none |
primary |
No | Set to true on exactly one provider — the consensus synthesizer |
base_url |
Conditional | Required for openai_compatible type |
max_context |
No | Override the detected context window limit (in tokens) |
| Field | Default | Description |
|---|---|---|
timeout |
60s |
Maximum time to wait for each provider's response |
min_responses |
2 |
Minimum successful responses before synthesis proceeds. Set to 1 if you only have one provider. |
verify_command |
auto-detected | Test command to run after file writes. Auto-detects go test, npm test, cargo test, pytest, make test. |
Add MCP servers under the mcp key in config.yaml, or use the TUI wizard (/mcp add or a in the MCP settings section):
mcp:
debug: false # log JSON-RPC traffic to ~/.config/polycode/mcp-debug.log
servers:
- name: filesystem
command: npx
args: ["-y", "@modelcontextprotocol/server-filesystem", "/path/to/project"]
read_only: true # skip confirmation for this server's tools
- name: github
command: npx
args: ["-y", "@modelcontextprotocol/server-github"]
env:
GITHUB_TOKEN: $KEYRING:mcp_github_GITHUB_TOKEN
- name: remote-server
url: http://localhost:3000/mcp # HTTP/SSE transport
timeout: 60 # per-call timeout in seconds (default 30)| Field | Required | Description |
|---|---|---|
name |
Yes | Unique server identifier |
command |
Conditional | Command to spawn (stdio transport). Required if url not set. |
args |
No | Arguments for the command |
url |
Conditional | Server URL (HTTP/SSE transport). Required if command not set. |
env |
No | Environment variables. Use $KEYRING:key_name for secrets. |
read_only |
No | Skip confirmation for this server's tools (default false) |
timeout |
No | Per-call timeout in seconds (default 30) |
MCP tools follow a split allocation based on safety:
-
Primary model (the consensus synthesizer) gets all MCP tools — both read-only and mutating. Mutating tool calls go through the confirmation gate (permission policies, yolo mode, or user prompt). This is the model that calls tools during synthesis and tool-loop execution.
-
Fan-out providers (non-primary models queried in parallel) get only read-only MCP tools. A tool is read-only if the server is configured with
read_only: trueor the tool hasreadOnlyHintin its MCP annotation. This lets secondary models safely query databases, read files, or search via MCP during the fan-out phase without side effects.
The consensus flow:
- Fan-out — all providers query in parallel, each with read-only MCP tools available
- Collect — responses gathered from all providers
- Synthesize — primary model gets the full tool set (including mutating MCP tools) and can call them during synthesis/tool loops
# Authenticate with a provider (prompts for API key or starts OAuth flow)
polycode auth login <provider-name>
# Check authentication status for all providers
polycode auth status
# Remove stored credentials
polycode auth logout <provider-name>When adding a provider via the settings wizard (Ctrl+S → a), you'll be prompted for the API key as part of the flow. Keys are stored securely in your OS keyring.
- OS keyring (preferred) — macOS Keychain, Linux secret-service/libsecret
- Encrypted file (fallback) —
~/.config/polycode/credentials.json(AES-256-GCM) when keyring is unavailable - No credentials — for
auth: noneproviders (local models)
OAuth tokens are automatically refreshed when they expire — no manual re-authentication needed.
| Key | Action |
|---|---|
Enter |
Submit prompt |
Shift+Enter |
New line in input |
↓ |
Clear input (on last line) |
Tab |
Accept command palette suggestion / toggle provider panels |
↑ / ↓ |
Cycle through input history (when input is empty) |
Ctrl+P |
Open command palette (search commands and files) |
Ctrl+E |
Open external editor to compose prompt |
Ctrl+T |
Switch color theme |
Ctrl+G |
Toggle auto-scroll lock |
Ctrl+H |
Toggle tool call concealment |
Ctrl+S |
Open settings |
@ |
Attach file to prompt (fuzzy search) |
!command |
Run shell command, inject output as context |
j/k (tab bar) |
Scroll one line |
d/u (tab bar) |
Half-page scroll |
g/G (tab bar) |
Jump to top/bottom |
y (tab bar) |
Copy last response to clipboard |
t (tab bar) |
Toggle trace expansion |
c (tab bar) |
Cancel selected provider during query |
m (tab bar) |
Toggle MCP dashboard |
p (tab bar) |
Toggle consensus provenance panel |
? |
Show help overlay |
Ctrl+C |
Quit |
| Key | Action |
|---|---|
y |
Approve |
n / Esc |
Reject |
a |
Allow for session (blocked for destructive tools) |
e |
Edit command/content before execution |
| Key | Action |
|---|---|
j / ↓ |
Move cursor down |
k / ↑ |
Move cursor up |
a |
Add new provider |
e |
Edit selected provider |
d |
Delete selected provider |
x |
Disable/enable selected provider |
t |
Test selected provider connection |
Esc |
Return to chat |
| Key | Action |
|---|---|
Enter |
Advance to next step / confirm |
j / k |
Navigate selection lists |
Esc |
Cancel wizard |
User prompt
│
├──→ Provider A (goroutine) ──→ Response A ─┐
├──→ Provider B (goroutine) ──→ Response B ──┤
├──→ Provider C (goroutine) ──→ Response C ──┤
└──→ Primary (goroutine) ──→ Response P ──┤
│
┌────────────┘
▼
Truncate to fit context
│
▼
Primary model synthesizes
(mode-aware prompt)
│
▼
Consensus response
│
▼
Tool execution (if any)
│
▼
Verification (if files written)
The primary model receives a structured prompt containing every provider's response. The prompt varies by mode — quick asks for a concise direct answer, balanced (shown below) asks for structured analysis, and thorough adds trade-off analysis, step-by-step reasoning, and cross-model verification:
You are synthesizing responses from multiple AI models to produce
the best possible answer.
Original question: How should I implement caching here?
Model responses:
---
[Model: claude]
Use an LRU cache with a TTL...
---
[Model: gpt4]
Consider Redis for distributed caching...
---
[Model: gemini]
A simple in-memory map with mutex...
---
Analyze all responses and produce a synthesis with this structure:
## Recommendation
[Your synthesized answer]
## Confidence: [high/medium/low]
## Agreement
[Points where all or most models agree]
## Minority Report
[Dissenting views worth considering]
## Evidence
[Key facts, code references, or documentation cited]
- Single provider responds — if only the primary responds, its answer is used directly (no synthesis step)
- Provider failure — failed/timed-out providers are excluded; consensus proceeds with available responses
- Below minimum threshold — if fewer than
min_responsesproviders respond, an error is shown - Context overflow — if combined responses exceed the primary's context window, they're proportionally truncated with
[truncated]markers - Auto-summarization — if conversation context reaches 80% of the primary model's limit, early turns are compressed into a summary
polycode/
├── cmd/polycode/ # CLI entry point, app wiring
│ ├── main.go # Cobra commands (root, auth, skill, session, init, review, ci, serve)
│ ├── app.go # TUI startup, subsystem wiring, conversation loop
│ ├── setup.go # Interactive setup wizard
│ ├── review.go # polycode review subcommand
│ ├── ci.go # polycode ci (headless PR review)
│ ├── serve.go # Editor bridge HTTP server
│ └── sharing.go # Export/import sessions
├── examples/ # Example skills and config templates
│ ├── skills/ # Canonical skills (git-review, test-runner, security-audit)
│ ├── permissions.yaml # Example permission policies
│ ├── instructions.md # Example project instructions
│ └── skill-manifest.yaml # Annotated skill manifest template
├── evals/ # Evaluation framework
│ ├── golden_tasks_test.go # End-to-end execution pipeline tests
│ └── review_benchmark_test.go # Seeded review quality benchmarks
├── internal/
│ ├── config/ # YAML config types, loading, validation, session persistence
│ ├── provider/ # Provider interface + 4 adapters (Anthropic, OpenAI, Gemini, OpenAI-compat)
│ ├── consensus/ # Fan-out, truncation, mode-aware synthesis pipeline
│ ├── tokens/ # Token tracking, cost estimation, model metadata (litellm)
│ ├── action/ # Tool execution (10 tools), safety guardrails, project context
│ ├── auth/ # Credential management (keyring, file, OAuth + auto-refresh)
│ ├── tui/ # Bubble Tea terminal UI (command palette, provenance, input history)
│ ├── hooks/ # Lifecycle hooks (pre_query, post_query, etc.)
│ ├── permissions/ # Per-tool approval policies (allow/ask/deny)
│ ├── routing/ # Mode-based routing with telemetry scoring
│ ├── memory/ # Repo memory + instruction hierarchy
│ ├── mcp/ # Model Context Protocol client
│ ├── skill/ # Skills/plugin system (manifest, registry, execution)
│ ├── agent/ # Agent teams (task graph, role-based workers, checkpoints)
│ └── telemetry/ # Event logging (latency, errors) for routing calibration
├── .goreleaser.yaml # Cross-platform build config
└── openspec/ # Change management artifacts
providers:
- name: claude
type: anthropic
auth: api_key
model: claude-sonnet-4-20250514
primary: true
consensus:
min_responses: 1 # no consensus needed with one providerproviders:
- name: claude
type: anthropic
auth: api_key
model: claude-sonnet-4-20250514
primary: true
- name: gpt4
type: openai
auth: api_key
model: gpt-4o
consensus:
min_responses: 2providers:
- name: claude
type: anthropic
auth: api_key
model: claude-sonnet-4-20250514
primary: true
- name: ollama
type: openai_compatible
base_url: http://localhost:11434/v1
model: llama3
auth: none
consensus:
min_responses: 2providers:
- name: claude-via-or
type: openai_compatible
base_url: https://openrouter.ai/api/v1
auth: api_key
model: anthropic/claude-sonnet-4-20250514
primary: true
- name: gpt4-via-or
type: openai_compatible
base_url: https://openrouter.ai/api/v1
auth: api_key
model: openai/gpt-4o
- name: gemini-via-or
type: openai_compatible
base_url: https://openrouter.ai/api/v1
auth: api_key
model: google/gemini-2.5-pro
consensus:
min_responses: 2Does polycode cost more than using a single model?
Yes — every query hits N providers plus a consensus synthesis call. Cost scales linearly with the number of providers. Per-provider costs are shown in the tab bar (from litellm pricing data). Use max_context overrides to limit token consumption.
Can I use it with just one provider?
Yes. Set min_responses: 1 and polycode works like a standard single-model coding assistant — no consensus step.
What if a provider is down?
Failed providers are excluded from consensus. As long as the primary and at least min_responses providers respond, the query succeeds.
Does the primary model see its own response in the consensus prompt? Yes. The primary participates in fan-out (generating its own independent response), then receives all responses (including its own) for synthesis. This lets it evaluate whether its initial answer was better or worse than alternatives.
Can I use local models?
Yes. Any model served via an OpenAI-compatible API (Ollama, vLLM, LM Studio, llama.cpp server) works with type: openai_compatible and auth: none.
Where are my API keys stored?
In your OS keyring (macOS Keychain, Linux secret-service). If the keyring isn't available, they fall back to ~/.config/polycode/credentials.json, where they are encrypted with AES-256-GCM using a locally-generated key. Keys are never stored in the config file.
Do all modes query all providers? Yes. Quick, balanced, and thorough modes all fan out to every configured provider. The mode controls how deeply the primary model analyzes and synthesizes the responses, and how much reasoning effort each provider applies.
Polycode uses OpenSpec for change management. Each feature goes through a proposal → design → specs → tasks workflow in openspec/changes/.
# Run tests
go test ./... -count=1
# Build
go build ./cmd/polycode/AGPLv3 — see LICENSE for details.