LegionForge

A local-first, security-native AI agent framework built on LangGraph.

Security is enforced in the execution path — not layered on afterward.

What Is This?

LegionForge is an open-source framework for building hardened AI agent systems on local hardware. It runs local LLMs (via Ollama) or cloud APIs (OpenAI, Anthropic), with a full security stack baked into every layer of the execution pipeline.

The one-line pitch: The hardened, self-hosted alternative to cloud agent platforms — and a security layer that other agent frameworks can plug into.

What Can It Do?

Submit any task — research, summarization, code execution, data analysis — via the web UI, a REST API, or a messaging app. Watch it execute in real time as the agent reasons, calls tools, and streams results token by token.

User interfaces:

Web UI at http://localhost:8080/ui — browser-based, no client install needed
Discord — !<task> in any channel the bot can see
Telegram — /<task> to your bot
Slack — !<task> (Socket Mode, no public URL needed)
Webhook — POST :8081/inbound from n8n, Zapier, or any HTTP client
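
As a rough illustration of how an HTTP client might call the inbound webhook, the sketch below builds a JSON body and an HMAC-SHA256 signature over it. The header name, the shared secret, and the `{"task": ...}` payload shape are assumptions made for this example and are not taken from the connector's documentation; check docs/quick-start.md for the real contract.

```python
import hashlib
import hmac
import json
from typing import Dict, Tuple

# Hypothetical values: the real secret and header name come from the
# connector's configuration, which is not shown in this README.
WEBHOOK_SECRET = b"change-me"
SIGNATURE_HEADER = "X-Signature-SHA256"


def sign_body(body: bytes, secret: bytes = WEBHOOK_SECRET) -> str:
    """Return the hex HMAC-SHA256 of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def build_request(task: str) -> Tuple[bytes, Dict[str, str]]:
    """Serialize a task and attach its signature as a header."""
    body = json.dumps({"task": task}, separators=(",", ":")).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        SIGNATURE_HEADER: sign_body(body),
    }
    return body, headers


def verify(body: bytes, signature: str, secret: bytes = WEBHOOK_SECRET) -> bool:
    """Receiver side: constant-time comparison against a fresh HMAC."""
    return hmac.compare_digest(sign_body(body, secret), signature)


body, headers = build_request("summarize today's alerts")
print(verify(body, headers[SIGNATURE_HEADER]))          # True
print(verify(body + b" ", headers[SIGNATURE_HEADER]))   # False
```

Signing the raw bytes rather than the parsed JSON means any change to the body in transit invalidates the signature.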

Core Design Principles

Principle	Implementation
Fail-safe tiering	`halt → sandbox/retry → degrade` — never silent failure
Human gates on all mutations	No component autonomously changes security rules, promotes tools, or escalates privileges
Replace AI with determinism	Repeated deterministic tasks are crystallized into signed, containerized tools with zero LLM overhead
Validate at trust boundaries	Guardian enforces checks at every inbound/outbound boundary — agents are processing nodes, not trust boundaries
Privilege tied to tasks	Short-lived task tokens scoped to exactly what the current task requires
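
The "Privilege tied to tasks" row can be pictured with a small sketch. This is not the project's JWT implementation; it is a simplified HMAC-signed token built from the standard library, and the secret, claim names, and tool names are all hypothetical. It only shows the idea of a token that names one task, lists the tools that task may call, and expires quickly.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import List, Optional

SECRET = b"demo-only-secret"  # hypothetical; a real secret would come from a keystore


def _b64(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _sign(payload: str) -> str:
    return _b64(hmac.new(SECRET, payload.encode("ascii"), hashlib.sha256).digest())


def issue_token(task_id: str, tools: List[str], ttl_s: int = 60,
                now: Optional[float] = None) -> str:
    """Mint a token naming one task, its allowed tools, and an expiry."""
    issued = time.time() if now is None else now
    claims = {"task": task_id, "tools": sorted(tools), "exp": issued + ttl_s}
    payload = _b64(json.dumps(claims, sort_keys=True).encode("utf-8"))
    return f"{payload}.{_sign(payload)}"


def authorize(token: str, tool: str, now: Optional[float] = None) -> bool:
    """Allow a tool call only if signature, expiry, and scope all check out."""
    try:
        payload, sig = token.split(".")
    except ValueError:
        return False
    if not hmac.compare_digest(_sign(payload).encode(), sig.encode()):
        return False
    claims = json.loads(_unb64(payload))
    current = time.time() if now is None else now
    return current < claims["exp"] and tool in claims["tools"]


token = issue_token("task-1", ["web_fetch"], ttl_s=60, now=1000.0)
print(authorize(token, "web_fetch", now=1010.0))   # True: in scope, not expired
print(authorize(token, "shell_exec", now=1010.0))  # False: tool not in scope
print(authorize(token, "web_fetch", now=1100.0))   # False: expired
```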

Security Architecture

Guardian Sidecar

A standalone FastAPI process (:9766) that runs a deterministic-only 7-check pipeline — no LLM calls in the hot path. Fast, auditable, unpoisonable.

Check 0: Tool revocation  (immediate halt if REVOKED)
Check 1: Tool registry + SHA-256 hash validation
Check 2: Capability boundary enforcement (negative capability list)
Check 3: Destructive pattern detection in tool args
Check 4: Agent sequence contract validation
Check 5: Ed25519 signature verification
Check 6: Adaptive threat rules (hot-reloaded every 10s — no restart needed)
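
The shape of such a pipeline can be sketched as an ordered list of pure functions where the first failure halts the call. The code below is an illustration only and not the Guardian source: the `ToolCall` and `Policy` types, the function names, and the destructive-pattern regex are hypothetical, and only Checks 0 to 3 are modelled (sequence contracts, signatures, and adaptive rules are omitted).

```python
import hashlib
import re
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical pattern list; the real rule set is not shown in this README.
DESTRUCTIVE = re.compile(r"rm\s+-rf|DROP\s+TABLE", re.IGNORECASE)


@dataclass
class ToolCall:
    tool_id: str
    source: bytes            # tool code as loaded at call time
    capabilities: Set[str]
    args: str


@dataclass
class Policy:
    revoked: Set[str] = field(default_factory=set)
    registry: Dict[str, str] = field(default_factory=dict)  # tool_id -> SHA-256 hex
    denied_capabilities: Set[str] = field(default_factory=set)


def check_revocation(call: ToolCall, policy: Policy) -> Optional[str]:
    return "tool revoked" if call.tool_id in policy.revoked else None


def check_registry_hash(call: ToolCall, policy: Policy) -> Optional[str]:
    expected = policy.registry.get(call.tool_id)
    if expected is None:
        return "tool not registered"
    actual = hashlib.sha256(call.source).hexdigest()
    return None if actual == expected else "hash mismatch"


def check_capabilities(call: ToolCall, policy: Policy) -> Optional[str]:
    blocked = call.capabilities & policy.denied_capabilities
    return f"denied capability: {sorted(blocked)[0]}" if blocked else None


def check_destructive_args(call: ToolCall, policy: Policy) -> Optional[str]:
    return "destructive pattern in args" if DESTRUCTIVE.search(call.args) else None


Check = Callable[[ToolCall, Policy], Optional[str]]
PIPELINE: List[Check] = [
    check_revocation, check_registry_hash, check_capabilities, check_destructive_args,
]


def evaluate(call: ToolCall, policy: Policy) -> Tuple[bool, str]:
    """Run checks in order; the first one that returns a reason halts the call."""
    for index, check in enumerate(PIPELINE):
        reason = check(call, policy)
        if reason is not None:
            return False, f"check {index}: {reason}"
    return True, "allowed"


source = b"def run(): return 1"
policy = Policy(registry={"calc": hashlib.sha256(source).hexdigest()},
                denied_capabilities={"network"})
print(evaluate(ToolCall("calc", source, {"cpu"}, "2+2"), policy))
print(evaluate(ToolCall("calc", b"tampered", {"cpu"}, "2+2"), policy))
```

Because every check is a plain function of the call and the policy, the same input always produces the same verdict, which is what makes the decision auditable.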

Crystallization Pipeline

When agents solve the same deterministic problem repeatedly, the Observer nominates it for crystallization. The Crystallizer generates a deterministic function + test suite. A Pre-HITL Analyzer (AST guards + behavioral diff) runs before human approval. Once signed, the tool has zero LLM runtime cost.

Observer → Crystallizer → Pre-HITL Analyzer → Human gate → Ed25519-signed tool
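
To give a feel for what an AST guard in the Pre-HITL Analyzer stage could look like, here is a minimal sketch using Python's standard `ast` module. The deny lists and the `ast_violations` function are hypothetical and far simpler than a production analyzer; a deny list like this can be bypassed (for example through attribute access), which is one reason a human gate still follows.

```python
import ast
from typing import List

# Hypothetical deny lists for illustration only.
FORBIDDEN_CALLS = {"eval", "exec", "compile", "__import__", "open"}
FORBIDDEN_MODULES = {"os", "subprocess", "socket"}


def ast_violations(source: str) -> List[str]:
    """Return a list of findings for generated code; empty means no match."""
    findings: List[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in FORBIDDEN_MODULES:
                    findings.append(f"import {alias.name}")
        elif isinstance(node, ast.ImportFrom):
            if (node.module or "").split(".")[0] in FORBIDDEN_MODULES:
                findings.append(f"from {node.module} import ...")
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            # Only direct calls by bare name are caught here.
            if node.func.id in FORBIDDEN_CALLS:
                findings.append(f"call {node.func.id}()")
    return findings


print(ast_violations("def add(a, b):\n    return a + b\n"))
print(ast_violations("import os\nos.system('ls')\n"))
```

The candidate code is parsed but never executed, so the guard itself cannot be triggered into running untrusted logic.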

Multi-Provider Authentication

The gateway supports five auth backends — swap without touching agent code:

Backend	Scheme	Use case
`ApiKeyBackend`	Bearer	Default — bcrypt API keys in PostgreSQL
`OIDCBackend`	Bearer	Google, Okta, Auth0, Azure AD, Keycloak, Cognito
`GitHubOAuthBackend`	Bearer	GitHub OAuth app tokens
`LDAPBackend`	Basic	OpenLDAP, Active Directory
`KerberosBackend`	Negotiate	Kerberos/GSSAPI (requires KDC + keytab)

Set gateway.auth_provider in config/hardware_profiles/mac_m4_mini_16gb.yaml.
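
One way to picture "swap without touching agent code" is a protocol plus a registry keyed by the configured provider name. The sketch below is an assumption-laden illustration: the method signature on `AuthBackend`, the signature of `require_user`, the `"api_key"` registry key, and the in-memory `StaticKeyBackend` are all invented for the example (the real default backend stores bcrypt hashes in PostgreSQL).

```python
from typing import Dict, Optional, Protocol


class AuthBackend(Protocol):
    """Assumed shape: a scheme name plus a credential check."""
    scheme: str

    def authenticate(self, credentials: str) -> Optional[str]:
        ...


class StaticKeyBackend:
    """Toy stand-in for an API-key backend; keeps keys in memory."""
    scheme = "Bearer"

    def __init__(self, keys: Dict[str, str]) -> None:
        self._keys = keys  # api key -> user id

    def authenticate(self, credentials: str) -> Optional[str]:
        return self._keys.get(credentials)


# Registry keyed by the configured provider name (hypothetical key).
BACKENDS: Dict[str, AuthBackend] = {
    "api_key": StaticKeyBackend({"k-123": "alice"}),
}


def require_user(auth_provider: str, authorization: str) -> str:
    """Resolve the backend from config, then validate the Authorization header."""
    backend = BACKENDS[auth_provider]
    scheme, _, credentials = authorization.partition(" ")
    if scheme != backend.scheme or not credentials:
        raise PermissionError("unsupported or missing credentials")
    user = backend.authenticate(credentials)
    if user is None:
        raise PermissionError("authentication failed")
    return user


print(require_user("api_key", "Bearer k-123"))
```

Callers depend only on `require_user`, so adding a backend means registering one more object rather than editing request handlers.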

Threat Coverage

Threat	Defense
Tool Poisoning	Hash validation at registration + cryptographic signing
Rug-Pull	Hash mismatch detection + signed tool versions
Prompt Injection (direct + indirect)	Input/output sanitizer (29 patterns, 2-tier) + RAG provenance scoring
Capability Amplification	Negative capability list enforced by Guardian
Resource Bomb / Economic DOS	Pre-execution token cost estimator + per-user daily budgets
Credential Theft	macOS Keychain storage + PII redaction from all outbound calls
RAG / Memory Poisoning	Document provenance + embedding trust scoring
Multi-Agent Cascade	Orchestrator-only routing + signed inter-agent messages
Supply Chain	AI-BOM + signed tool library
TOCTOU	`approved_snapshot` verified post-execution in `SecureToolNode`
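
The prompt-injection row, together with the NFKC normalization mentioned in the roadmap, can be illustrated with a short sketch. The two patterns below are hypothetical examples and not the project's 29-pattern list; the point is only that normalizing first lets a plain regex catch look-alike Unicode spellings.

```python
import re
import unicodedata
from typing import List

# Hypothetical patterns for illustration only.
PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"reveal\s+(the\s+)?system\s+prompt", re.IGNORECASE),
]


def scan(text: str) -> List[str]:
    """Normalize to NFKC, then return the patterns that match."""
    normalized = unicodedata.normalize("NFKC", text)
    return [p.pattern for p in PATTERNS if p.search(normalized)]


plain = "Please ignore previous instructions."
# Same phrase with "ignore" written in fullwidth Latin letters.
disguised = "\uff49\uff47\uff4e\uff4f\uff52\uff45 previous instructions"

print(bool(scan(plain)))                       # True
print(bool(PATTERNS[0].search(disguised)))     # False: raw text evades the regex
print(bool(scan(disguised)))                   # True: caught after normalization
```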

Phase Roadmap

Phase	What Was Built	Status
0	PostgreSQL + pgvector, async LLM factory, health server	✅ Complete
1	Researcher agent, tool registry + hash validation, capability boundaries, threat event logging	✅ Complete
2	Docker containerization, Guardian security sidecar, immutable audit log (SHA-256 hash chain), RAG provenance	✅ Complete
3	JWT task tokens + ACLs, sub-agent orchestrator, sandbox retry tier	✅ Complete
4	Threat Analyst agent, adaptive Guardian rules, AI Bill of Materials	✅ Complete
5	Crystallization Pipeline — Observer + Crystallizer agents, pre-HITL analyzer, Ed25519-signed tools	✅ Complete
5.5	Security hardening: DB RBAC, AST bypass guards, tool revocation, TOCTOU mitigation, model integrity	✅ Complete
6	PentestAgent — air-gapped red-team bot, 8 attack classes × 3 variants, stop-at-proof	✅ Complete
7	Guardian feedback loop, SECURITY.md, v1.0 readiness hardening	✅ Complete
8	Gateway service (:8080), task queue, SSE streaming, web UI, A2A + MCP, Discord connector	✅ Complete
9	langchain 1.x migration, tool library (5 tools), parallel fan-out, Phase 9.5 hardening sprint	✅ Complete
10	Multi-user auth — DB-backed stream tokens, per-user daily budgets, `/usage/me`, user CLI	✅ Complete
11	SecureToolNode security fix, 38 integration tests, `AuthBackend` protocol, `Dockerfile.gateway`, `docs/SCALING.md`	✅ Complete
12	Multi-provider auth registry — `OIDCBackend`, `GitHubOAuthBackend`, `LDAPBackend`, `KerberosBackend`; multi-scheme `require_user`	✅ Complete
13	Kerberos GSSAPI real implementation, Redis-backed stream tokens, multi-instance docker-compose + Nginx	✅ Complete
14	Redis global budget counters, Prometheus `/metrics` endpoint, `X-Request-ID` middleware	✅ Complete
15	Polished web UI — localStorage key+history, cancel, tool call blocks, live timer, copy, keyboard shortcut	✅ Complete
16	Channel connectors — Telegram (polling), Slack (Socket Mode), generic Webhook (HMAC + async callback)	✅ Complete
v1.0.1	schema fix (user_id TEXT), live Kerberos KDC, all integration tests real, MODEL_INTEGRITY_STRICT env var	✅ Complete
Phase 60–381	381-tool UI library (Phases 60–381, PRs #117–#201) — full operator dashboard built as JS functions over the gateway REST API	✅ Complete
Security hardening	Extended exfiltration detection + NFKC normalization; DESTRUCTIVE_PATTERN async DB logging; PostgreSQL scram-sha-256 migration (PRs #208–#214)	✅ Complete
Web + Browser tools	`web_fetch_js` Playwright headless browser (PR #218); two-layer SSRF guard; private-IP PII regex fix; 50 tool-accuracy tests	✅ Complete
Lazy-load Dashboard	296 operator tool cards in `<template>` — injected on first click; eliminates startup parse cost (PR #217)	✅ Complete
Guardian G1–G3	`packages/legionforge_guardian` standalone package; backward-compat shim in `src/security/guardian.py`; `python -m legionforge_guardian` entry point (PR #219)	✅ Complete
Guardian G4	Public repo LegionForge/LegionForge-Guardian live; `pip install legionforge-guardian` on PyPI; auto-sync Action; Docker smoke verified (PRs #221–#232)	✅ Complete
Agent Memory — all 5 gaps	Persona bootstrap (Gap 1, DB-backed SOUL.md), user prefs (Gap 5), `memory_write`/`memory_recall` tools (Gap 3), daily episodic summaries (Gap 2), pre-compaction flush (Gap 4)	✅ Complete
Dual License	AGPLv3 open source + commercial license; `COMMERCIAL_LICENSE.md` + `CLA.md` added (PR #229)	✅ Complete

DB security hardening	RLS fail-closed (empty `user_id` sees zero rows), pool hard-fail (no silent privilege escalation), log-bomb cap on threat events, Prometheus label normalization, rate-limit memory leak fix; 17 regression tests	✅ Complete
Phase H	Session continuity UI: persistent conversation sidebar, turn count badge, New Conversation button, session resume across page reloads	✅ Complete
Phase I	Multi-modal image input: paste/drag image into prompt, MIME + magic-byte validation, vision API routing, Ollama text-only fallback	✅ Complete
HITL approval flow	LangGraph `interrupt_before` operator gate: destructive tasks pause for human approval; `GET /hitl/pending` + `POST /hitl/{id}/approve` (PR #253)	✅ Complete
Phase J (WhatsApp)	WhatsApp Business Cloud API connector: webhook ingestion, HMAC verification, message routing to gateway (PR #254)	✅ Complete

2247/2247 smoke tests passing. 41/41 integration tests. 5/5 Kerberos live-KDC tests. 40/40 UI tests. 104/104 TestLab tests. 79/79 tool accuracy tests. 114/114 crystallization tests. Smoke suite runs in ~21 seconds (no external services required).

Requirements

Component	Version	Notes
Python	3.11+	via pyenv recommended
PostgreSQL	16 or 17	with pgvector extension
Ollama	latest	for local LLM inference
Docker	24+	for Guardian sidecar + analyzer container
macOS	14+ (Apple Silicon)	primary target; Linux support planned

Open Security Issues

These are accepted risks for local development, tracked here as pre-1.0 release blockers. See SECURITY.md — Known Security Gaps for full details and remediation steps.

Issue	Severity	Status
~~PostgreSQL `trust` auth~~ — any local process can connect to the DB without a password	Medium (local dev) / High (shared/remote)	✅ Closed — PR #212. Now uses `peer` (Unix socket) + `scram-sha-256` (TCP). Passwords in `~/.pgpass`.
~~RLS escape hatch~~ — `app.user_id = ''` passed RLS for all rows	High	✅ Closed 2026-03-11. Empty `user_id` now sees zero rows (fail-closed).
~~Pool privilege escalation~~ — gateway/readonly pools fell back to BYPASSRLS worker on failure	High	✅ Closed 2026-03-11. All pools now raise `RuntimeError` — no silent privilege escalation.
~~Worker pool → admin fallback~~ — worker pool failure fell back to superuser DB credentials	High	✅ Closed 2026-03-11. Hard-fails on missing `legionforge_worker` role.
Key rotation stream token invalidation	Medium	Open (DB-3)
Threat rule HITL gate — worker can write `threat_rules` without approval	High	Open (SEC-1)

Demo

Screenshot (add GIF or screenshot here before v1.0 release — docs/assets/demo.gif)

The web UI at http://localhost:8080/ui streams agent output token-by-token as it reasons, calls tools, and returns results. A persistent session sidebar tracks conversation history and turn counts. Paste or drag an image directly into the prompt for multi-modal tasks.

Key UI surfaces:

Task panel — submit any prompt; watch tool calls expand inline as they execute
Session sidebar — resume past conversations; turn count badge per session
Operator dashboard — 381 live API tools for admin/monitoring (gear icon)
Chat mode — toggle to a conversational layout for back-and-forth exchanges

Quick Start

→ Full setup guide: docs/quick-start.md

# 1. Clone and set up the virtual environment
git clone https://github.com/LegionForge/LegionForge.git
cd LegionForge
python -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# 2. Set your hardware profile
export AGENT_HARDWARE_PROFILE=mac_m4_mini_16gb

# 3. Store your PostgreSQL admin password in ~/.pgpass
#    (or set POSTGRES_TRUST_AUTH=true for a default Homebrew trust-auth install)
echo "localhost:5432:*:$(whoami):yourpassword" >> ~/.pgpass && chmod 0600 ~/.pgpass

# 4. Initialize the database and generate secrets
make db-init
make setup-task-token-secret
make setup-signing-key

# 5. Run smoke tests (no services required)
make test-smoke
# Expected: 2247 passed in ~21s

# 6. Start services (three separate terminals)
make health-server   # Operator API at :8765
make gateway-start   # User API at :8080
make guardian-start  # Security sidecar at :9766 (requires Docker)

# 7. Create a user and open the web UI
make create-user USERNAME=myname
open http://localhost:8080/ui

Documentation

Document	What It Covers
`docs/quick-start.md`	Step-by-step setup, connecting channels, first task
`docs/architecture.md`	All components, ports, ASCII diagram, connection rationale
`docs/SCALING.md`	Horizontal scaling, Redis, Kerberos KDC, multi-instance Docker
`TLDR.md`	Project orientation — what was built and why
`SECURITY.md`	Threat model, HITL policy, injection detection
`VERIFICATION.md`	Verification steps for each phase
`docs/VISION.md`	Product vision, architecture rationale, design decisions

Key Files

File	Purpose
`src/base_graph.py`	LangGraph agent template — copy to create new agents
`packages/guardian/src/legionforge_guardian/app.py`	Guardian sidecar — canonical source; deterministic 7-check security pipeline
`src/security/guardian.py`	Backward-compat shim — re-exports all names from `legionforge_guardian.app`
`src/security/core.py`	Keychain loader, PII redaction (8 patterns), injection detection, I/O sanitizer
`src/database.py`	Async PostgreSQL pool, LangGraph checkpointer, pgvector, audit log hash chain
`src/safeguards.py`	Three-layer loop protection (step counter, action history, token budget)
`src/gateway/app.py`	FastAPI gateway (:8080) — task queue, SSE, web UI, A2A, MCP
`src/gateway/backends/`	Auth backend package — ApiKey, OIDC, GitHub, LDAP, Kerberos
`src/connectors/`	Channel connectors — Discord, Telegram, Slack, Webhook
`src/tools/signing.py`	Ed25519 keypair management + tool manifest signing
`src/tools/crystallization_analyzer.py`	Pre-HITL AST + behavioral diff analyzer
`config/settings.py`	Pydantic settings singleton (loaded from hardware YAML profile)
`Makefile`	All development, test, and operational commands
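
The three-layer loop protection listed for `src/safeguards.py` (step counter, action history, token budget) could look roughly like the sketch below. The class name, default limits, and error type are hypothetical; this is not the module's code, only a compact picture of three independent limits checked on every step.

```python
from collections import Counter
from dataclasses import dataclass, field


class LoopGuardError(RuntimeError):
    """Raised when any of the three limits is exceeded."""


@dataclass
class LoopGuard:
    max_steps: int = 25          # layer 1: total steps
    max_repeats: int = 3         # layer 2: identical actions
    token_budget: int = 10_000   # layer 3: cumulative tokens
    steps: int = 0
    tokens_used: int = 0
    history: Counter = field(default_factory=Counter)

    def record(self, action: str, tokens: int) -> None:
        """Account for one agent step and fail loudly if a limit is hit."""
        self.steps += 1
        self.tokens_used += tokens
        self.history[action] += 1
        if self.steps > self.max_steps:
            raise LoopGuardError("step limit exceeded")
        if self.history[action] > self.max_repeats:
            raise LoopGuardError(f"action repeated too often: {action}")
        if self.tokens_used > self.token_budget:
            raise LoopGuardError("token budget exceeded")


guard = LoopGuard(max_repeats=2)
guard.record("search:foo", tokens=120)
guard.record("search:foo", tokens=120)
try:
    guard.record("search:foo", tokens=120)
except LoopGuardError as exc:
    print(exc)
```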

Makefile Reference

make check           # Verify environment before starting
make start           # Full startup (drive → Ollama → PostgreSQL → model warmup)
make test-smoke      # 2247 smoke tests, ~21s, no services required
make test-integration  # 41 integration tests (requires PostgreSQL)
make test-kerberos   # 5 Kerberos live-KDC tests (requires KDC)
make test-ui         # 40 UI tests (Playwright)
make lint            # Black formatter check
make health-server   # Start operator health API at :8765
make gateway-start   # Start user-facing gateway at :8080
make guardian-start  # Build + start Guardian sidecar via Docker at :9766
make create-user USERNAME=<name>     # Create gateway user (prints API key)
make discord-start   # Start Discord bot connector
make telegram-start  # Start Telegram bot connector
make slack-start     # Start Slack Socket Mode connector
make webhook-start   # Start generic inbound/outbound webhook connector (:8081)
make security-audit  # Smoke tests + bandit + secret scan
make pentest         # Run red-team attack suite in verify mode (stop-at-proof)
make pentest-report  # Print most recent pentest report
make audit-log-verify # Verify SHA-256 hash chain integrity on audit_log
make revoke-tool TOOL_ID=<id>  # Emergency tool revocation
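
For readers unfamiliar with hash-chained logs, the sketch below shows the general technique that a command like `make audit-log-verify` would rely on: each row stores the hash of the previous row, so editing any earlier row breaks every later link. The row layout, the genesis value, and the function names are assumptions for illustration and do not reflect the actual `audit_log` schema.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # hypothetical starting value for the first row


def entry_hash(prev_hash: str, event: Dict) -> str:
    """SHA-256 over the previous hash plus a canonical JSON of the event."""
    payload = prev_hash + json.dumps(event, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(log: List[Dict], event: Dict) -> None:
    """Add a row linked to the hash of the current last row."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev, "hash": entry_hash(prev, event)})


def verify(log: List[Dict]) -> bool:
    """Recompute every link from the start; any edit shows up as a mismatch."""
    prev = GENESIS
    for row in log:
        if row["prev_hash"] != prev or row["hash"] != entry_hash(prev, row["event"]):
            return False
        prev = row["hash"]
    return True


log: List[Dict] = []
append(log, {"actor": "guardian", "action": "block"})
append(log, {"actor": "operator", "action": "approve"})
print(verify(log))                      # True
log[0]["event"]["action"] = "allow"     # tamper with history
print(verify(log))                      # False
```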

Known Gaps (Accepted Residual Risk)

Embedding-level anomaly detection — RAG poisoning at the semantic vector level is an open research problem. Provenance scoring and trust flagging exist; embedding-level detection is deferred.
pip-audit / dependency hash pinning — pip-audit reports no known CVEs as of v1.0.1; transitive hash pinning is accepted residual risk.

Acknowledgements

LegionForge exists in a space shaped by several projects and thinkers worth calling out directly.

OpenClaw (Peter Steinberger, née Clawd → Clawdbot → Moltbot → OpenClaw) is the primary inspiration for LegionForge. OpenClaw proved the demand: 60,000 GitHub stars in 72 hours in January 2026. Its six-component architecture, workspace-as-files memory model, and "agent as messaging contact" UX are genuinely well-designed and directly influenced LegionForge's architecture. After release, Kaspersky found 512 vulnerabilities in it and Cisco found active data exfiltration in third-party skills. LegionForge is building in the opposite order: security first, product on top. Everything in Guardian, SecureToolNode, and the security stack exists because OpenClaw made the cost of skipping that work concrete.

LangGraph — the graph execution engine underneath everything. The checkpoint-based state persistence and the recursion-limit loop protection are LangGraph primitives that LegionForge builds on heavily.

LangChain and the broader open-source LLM tooling ecosystem made this project possible: without open-weight models, open inference runtimes (Ollama), and open tooling, a project like this on consumer hardware could not exist.

LATM — Learning to Use Tools by Making Them (Cai et al., ICLR 2024) and Voyager (Wang et al., NVIDIA 2023) — the closest published academic work to LegionForge-Anneal's crystallization pipeline. Both explore converting LLM-generated actions into reusable tools. LegionForge's contribution is the production-hardening layer: sandboxed execution, adversarial testing, Ed25519 signing, and HITL gate.

Anchor Engine by Robert S. Balch II — a deterministic semantic memory system using graph traversal (the STAR algorithm) instead of vector embeddings. Anchor Engine's core insight — that agent memory should be deterministic and explainable, not statistically fuzzy — directly informed LegionForge's temporal decay weighting in memory recall. The STAR gravity formula (similarity × e^(-λ·age)) is adapted from Anchor's whitepaper for the similarity_search temporal decay path in src/database.py.
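
The decay formula quoted above, similarity × e^(-λ·age), is small enough to show directly. In this sketch the function name, the choice of days as the age unit, and the default λ of 0.05 are assumptions; the values used in src/database.py are not stated here.

```python
import math


def decayed_score(similarity: float, age_days: float, decay_rate: float = 0.05) -> float:
    """Weight a similarity score by exponential decay in the memory's age."""
    return similarity * math.exp(-decay_rate * age_days)


# Hypothetical memories: (label, raw similarity, age in days)
memories = [
    ("old but very similar", 0.90, 30.0),
    ("recent and fairly similar", 0.70, 1.0),
]

ranked = sorted(memories, key=lambda m: decayed_score(m[1], m[2]), reverse=True)
for label, sim, age in ranked:
    print(f"{label}: {decayed_score(sim, age):.3f}")
```

With these assumed numbers the recent memory outranks the older one despite its lower raw similarity, which is the intended effect of the decay term. A larger λ forgets faster; λ = 0 reduces to plain similarity ranking.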

The AI-Human Engineering Stack by Hayen Mill and Henrique Jr. Sanchez (March 2026) — a framework for thinking about the five cognitive layers of AI engineering (Prompt, Context, Intent, Judgment, Coherence) plus Evaluation and Harness as cross-cutting meta-functions. The Manus Insight from this paper — that KV-cache hit rate is the single most important production agent metric, and that context should be ordered stable-first — directly motivated the message assembly reordering in src/base_graph.py.

The security-first design of LegionForge is a direct response to watching these ecosystems grow fast and ship security as an afterthought. That's not a criticism; it's the reality of how open source evolves. This project is an attempt to show what the stack looks like when security is the first constraint, not the last.

For the full canonical record of design influences, academic inspirations, and third-party attributions, see CREDITS.md. For machine-readable citation data, see CITATION.cff.

License

Dual License:

Open Source — AGPLv3 with Section 7(b) attribution requirement. Free for open-source use.
Commercial — proprietary license for organizations that cannot comply with AGPLv3. See COMMERCIAL_LICENSE.md.

The legionforge-guardian sidecar package is licensed separately under MIT.

Contributors must agree to the Contributor License Agreement (agreement implied by submitting a PR).

Status

v0.7.1-alpha: Phases 0–381 + H + I + J (WhatsApp) + HITL approval flow + all 5 agent memory gaps + Guardian G4 (published to PyPI) complete. 2247/2247 smoke tests. 41/41 integration tests. 5/5 Kerberos live-KDC tests. 40/40 UI tests. 114/114 crystallization tests. Two pre-1.0 security blockers remain open (DB-3 and SEC-1, listed under Open Security Issues). Dual-licensed AGPLv3 + commercial.

Contributions, issues, and commercial licensing inquiries are welcome via GitHub Issues.
