A formally verified ReAct agent implemented in ACL2 with FTY types. ACL2 proves safety, termination, and invariant-preservation properties of the agent's decision logic, while external tools (LLMs, code execution) are accessed via the Model Context Protocol (MCP).
This project demonstrates how to build AI agents with formally verified decision logic:
- Proven Safety: ACL2 proves that the agent respects permissions, stays within budget, and terminates
- Proven Correctness: State transitions preserve invariants; context management preserves system prompts
- Practical Integration: LLM integration via local (LM Studio) or cloud (OpenAI) providers, code execution via MCP
- ✅ FTY-typed agent state with step counters, budgets, permissions, and conversation history
- ✅ Permission model with file access levels and code execution controls
- ✅ Budget tracking for tokens and time
- ✅ Context management with sliding window truncation that preserves system prompts
- ✅ Proven termination via max-steps bound
- ✅ MCP integration for ACL2 code execution with persistent sessions
- ✅ LLM integration via local (LM Studio) or cloud (OpenAI) providers
- ✅ Cloud provider support for OpenAI and custom OpenAI-compatible endpoints (Anthropic and Azure planned)
- ✅ Parinfer integration to auto-fix unbalanced parens in LLM-generated code
All theorems are in verified-agent.lisp unless noted otherwise.
| Theorem | What It Proves |
|---|---|
| permission-safety | Tool invocation requires permission |
| budget-bounds-after-deduct | Budgets remain non-negative after deduction |
| termination-by-max-steps | Reaching max-steps forces agent to respond |
| remaining-steps-decreases-after-increment | Step counter progress guarantees termination |
| error-state-forces-must-respond | Internal errors halt the agent |
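
The safety properties above can be explored with a small executable model. The sketch below is Python, not the ACL2 source: it mirrors the must-respond-p and can-invoke-tool-p definitions given in this README, while the `ToolSpec` fields, the `has_error` flag, and the saturating `deduct` are assumptions made for illustration.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentState:
    step_counter: int = 0
    max_steps: int = 100
    token_budget: int = 10000
    time_budget: int = 3600
    file_access: int = 0          # 0=none, 1=read, 2=write
    execute_allowed: bool = False
    done: bool = False
    has_error: bool = False       # stands in for a non-:none error-state


@dataclass(frozen=True)
class ToolSpec:
    # Hypothetical fields; the real tool-spec type may differ.
    required_access: int = 0
    requires_execute: bool = False
    token_cost: int = 0
    time_cost: int = 0


def tool_permitted_p(tool, st):
    """The state's access level and execute flag must cover the tool's needs."""
    return (st.file_access >= tool.required_access
            and (st.execute_allowed or not tool.requires_execute))


def tool_budget_sufficient_p(tool, st):
    return (st.token_budget >= tool.token_cost
            and st.time_budget >= tool.time_cost)


def can_invoke_tool_p(tool, st):
    return tool_permitted_p(tool, st) and tool_budget_sufficient_p(tool, st)


def must_respond_p(st):
    return (st.done or st.has_error
            or st.step_counter >= st.max_steps
            or st.token_budget == 0 or st.time_budget == 0)


def deduct(st, tokens, seconds):
    """Saturating subtraction keeps both budgets at or above zero."""
    return replace(st,
                   token_budget=max(0, st.token_budget - tokens),
                   time_budget=max(0, st.time_budget - seconds))


def increment_step(st):
    return replace(st, step_counter=st.step_counter + 1)
```

In this model, a state without execute permission can never invoke a tool that requires it, an oversized deduction leaves the budget at zero (which in turn makes `must_respond_p` true), and repeated `increment_step` calls eventually reach `max_steps`. The ACL2 theorems establish the corresponding facts for all states rather than for the handful a test can try.
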
| Theorem | What It Proves |
|---|---|
| continue-respond-partition | Agent is always in exactly one of: must-respond, should-continue, or satisfied |
| step-increases-after-increment | Step counter strictly increases each iteration |
| Theorem | What It Proves |
|---|---|
| add-tool-result-preserves-error-state | Tool results don't change internal error state |
| add-tool-result-preserves-has-error-p | Tool results don't affect error status |
| add-tool-result-preserves-done | Tool results don't change done flag |
| add-assistant-msg-preserves-must-respond-p | Assistant messages don't change termination status |
Context Management (context-manager.lisp)
| Theorem | What It Proves |
|---|---|
| truncate-preserves-system-prompt | System message survives context truncation |
| truncate-to-fit-length-bound | Truncated list never exceeds original length |
| drop-oldest-until-fit-is-sublist | Dropped messages are a sublist of original |
| add-message-returns-list | Adding messages preserves list type |
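
To make the truncation properties concrete, here is a minimal Python model of sliding-window truncation. It is a sketch under stated assumptions, not a transcription of context-manager.lisp: messages are plain dicts, cost is measured in characters, and the function names are borrowed from the theorem names above.

```python
def message_cost(msg):
    # Assumption: cost is the character count; a real system would count tokens.
    return len(msg["content"])


def drop_oldest_until_fit(messages, budget):
    """Drop messages from the front until the total cost fits the budget."""
    kept = list(messages)
    while kept and sum(message_cost(m) for m in kept) > budget:
        kept.pop(0)  # the oldest message goes first
    return kept


def truncate_to_fit(messages, budget):
    """Keep a leading system message unconditionally, then fit the rest."""
    if messages and messages[0]["role"] == "system":
        system, rest = messages[0], messages[1:]
        remaining = max(0, budget - message_cost(system))
        return [system] + drop_oldest_until_fit(rest, remaining)
    return drop_oldest_until_fit(messages, budget)
```

The model keeps the system message even when it alone exceeds the budget. That is one possible design choice, and it is what makes the "system message survives" property hold without any side condition on the budget.
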
| Axiom | What It Guarantees |
|---|---|
| external-tool-call-returns-list | Tool calls return a proper list |
| external-tool-call-bounded | Response length is bounded (resource safety) |
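
Because these are axioms rather than theorems, something outside the proof has to make them true at runtime. One way a driver could do that is to normalize and cap every tool response before handing it to the verified core. The Python sketch below is purely illustrative: `checked_tool_call` and `MAX_RESPONSE_ITEMS` are hypothetical names, and this README does not say how the actual driver enforces the bound.

```python
MAX_RESPONSE_ITEMS = 1000  # hypothetical bound, chosen only for illustration


def checked_tool_call(call_tool, request, limit=MAX_RESPONSE_ITEMS):
    """Call an external tool and force the result into a bounded list."""
    raw = call_tool(request)
    # Always hand back a proper list, whatever the tool returned.
    items = list(raw) if isinstance(raw, (list, tuple)) else [raw]
    # Enforce the length bound by truncation.
    return items[:limit]
```
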
| Theorem | What It Proves |
|---|---|
| react-step-preserves-agent-state | ReAct step returns valid agent state |
| deduct-preserves-agent-state | Budget deduction returns valid agent state |
| increment-preserves-agent-state | Step increment returns valid agent state |
Note on error handling: Tool execution errors are not internal errors. They are returned to the agent as messages so it can see and recover from them. The add-tool-result-preserves-* theorems show this is safe: recording a tool result never changes the error state, the error status, or the done flag. Only infrastructure failures (LLM unreachable, budget exhausted) halt the loop.
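
The distinction can be sketched in a few lines of Python. This is a model of the behavior described in the note, not the project's code: `LoopState`, `tool_step`, and `llm_step` are hypothetical names, and using `ConnectionError` to stand for "LLM unreachable" is an assumption.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LoopState:
    messages: tuple = ()
    done: bool = False
    error_state: str = "none"


def has_error_p(st):
    return st.error_state != "none"


def add_tool_result(st, text):
    """Append a tool message; error_state and done are left untouched."""
    return replace(st, messages=st.messages + (("tool", text),))


def tool_step(st, tool, code):
    """A failing tool becomes a message the agent can read and react to."""
    try:
        output = str(tool(code))
    except Exception as exc:
        output = f"ERROR: {exc}"
    return add_tool_result(st, output)


def llm_step(st, llm):
    """An unreachable LLM is an infrastructure failure: set the error state."""
    try:
        reply = llm(st.messages)
    except ConnectionError:
        return replace(st, error_state="llm-unreachable")
    return replace(st, messages=st.messages + (("assistant", reply),))
```

A tool that raises leaves `has_error_p` false and adds one message to the history, while an unreachable LLM flips the error state and leaves the history unchanged.
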
- VS Code with Dev Containers extension
- Docker
- LM Studio (optional, for local LLM)
- OpenAI API Key (optional, for cloud LLM)
git clone https://github.com/YOUR_USERNAME/verified-agent.git
cd verified-agent
code .
# When prompted, click "Reopen in Container"

cd src
cert.pl verified-agent.lisp

pip install mcp-proxy
mcp-proxy acl2-mcp --transport streamablehttp --port 8000 --pass-environment

Start LM Studio with a model loaded, then in ACL2:
(ld "chat-demo.lisp")
(interactive-chat-loop *agent-v1* *model-id* state)

You can use OpenAI instead of a local LLM:
(ld "chat-openai.lisp")
;; Quick start with your API key
(chat-with-openai "sk-your-api-key-here" state)
;; Or use GPT-4o for best results
(chat-with-gpt4o "sk-your-api-key-here" state)

Available models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, etc.
┌─────────────────────────────────────────────────────────────────┐
│ Verified Agent (ACL2) │
│ Proven: permission-safety, budget-bounds, termination │
├─────────────────────────────────────────────────────────────────┤
│ ReAct Loop: react-step → LLM → extract code → execute → loop │
├─────────────────────────────────────────────────────────────────┤
│ Context Manager: truncate-to-fit preserves system prompt │
├─────────────────────────────────────────────────────────────────┤
│ Decision Functions: can-invoke-tool-p, must-respond-p │
├─────────────────────────────────────────────────────────────────┤
│ FTY Types: agent-state, tool-spec, chat-message, error-kind │
└─────────────────────────────────────────────────────────────────┘
│
│ MCP / HTTP
▼
┌─────────────────────────────────────────────────────────────────┐
│ External Tools: acl2-mcp (code execution), LM Studio (LLM) │
└─────────────────────────────────────────────────────────────────┘
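
The ReAct loop row of the diagram (react-step → LLM → extract code → execute → loop) can be sketched as a short Python model. Everything here is illustrative: `extract_acl2_blocks` and `react_loop` are hypothetical names, the LLM and the executor are passed in as plain callables, and treating a reply without code as the final answer is an assumption about the stopping rule.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime to keep this example readable
ACL2_BLOCK = re.compile(FENCE + r"acl2\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_acl2_blocks(reply):
    """Return the contents of every acl2-tagged fenced block in an LLM reply."""
    return [code.strip() for code in ACL2_BLOCK.findall(reply)]


def react_loop(llm, execute, messages, max_steps=5):
    """Alternate between asking the LLM and executing the code it writes."""
    history = list(messages)
    for _ in range(max_steps):          # hard bound guarantees termination
        reply = llm(history)
        history.append(("assistant", reply))
        blocks = extract_acl2_blocks(reply)
        if not blocks:
            return history              # no code: treat the reply as final
        for code in blocks:
            history.append(("tool", execute(code)))
    return history
```

With a stub LLM that first answers with an acl2 block containing `(+ 1 2 3)` and then with plain text, and a stub executor that returns `"6"`, the resulting history has the roles user, assistant, tool, assistant.
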
verified-agent/
├── .devcontainer/
│ └── devcontainer.json # Dev container config for ACL2 environment
├── .github/
│ └── copilot-instructions.md # AI assistant guidance
├── src/ # ACL2 source files
│ ├── verified-agent.lisp # Core: FTY types, decision functions, safety theorems
│ ├── context-manager.lisp # Conversation history with truncation proofs
│ ├── llm-types.lisp # FTY types for chat messages
│ ├── llm-client.lisp # HTTP client for LM Studio
│ ├── llm-client-raw.lsp # Raw Lisp JSON serialization
│ ├── http-json.lisp # HTTP POST/GET with JSON
│ ├── http-json-raw.lsp # Raw Lisp HTTP implementation
│ ├── mcp-client.lisp # MCP JSON-RPC client
│ ├── mcp-client-raw.lsp # Raw Lisp MCP serialization
│ ├── agent-runner.lisp # Runtime driver for code execution
│ ├── parinfer-fixer.lisp # Fix unbalanced parens in LLM output
│ ├── chat-demo.lisp # Interactive demo (local LLM)
│ ├── chat-openai.lisp # Interactive demo (OpenAI cloud)
│ └── Verified_Agent_Spec.md # Full specification
├── acl2-mcp/ # Python MCP server
│ ├── acl2_mcp/
│ │ ├── __init__.py
│ │ └── server.py # MCP server (15 tools for ACL2)
│ ├── pyproject.toml
│ ├── README.md
│ ├── LICENSE
│ └── SECURITY.md
├── .gitignore
├── CLAUDE.md # Quick context for AI assistants
├── LICENSE # BSD 3-Clause
├── Makefile # Build automation
└── README.md # This file
(fty::defprod agent-state
  ((step-counter natp :default 0)
   (max-steps natp :default 100)
   (token-budget natp :default 10000)
   (time-budget natp :default 3600)
   (file-access natp :default 0)  ; 0=none, 1=read, 2=write
   (execute-allowed booleanp :default nil)
   (messages chat-message-list-p :default nil)
   (satisfaction natp :default 0)
   (done booleanp :default nil)
   (error-state error-kind-p :default '(:none)))
  :layout :list)

The agent decides what to do based on pure functions:
;; Can we invoke this tool?
(can-invoke-tool-p tool st) = (tool-permitted-p tool st)
                              AND (tool-budget-sufficient-p tool st)
;; Must we stop?
(must-respond-p st) = done OR has-error OR (step-counter >= max-steps)
                      OR (token-budget = 0) OR (time-budget = 0)

The agent can execute ACL2 code through the MCP protocol:
;; LLM writes code in markdown blocks
;; ```acl2
;; (+ 1 2 3)
;; ```
;; Agent extracts and executes via MCP
(mcp-acl2-evaluate conn "(+ 1 2 3)") ; => "6"

The agent supports both local and cloud LLM providers:
| Provider | Description | Setup |
|---|---|---|
| Local (LM Studio) | Run models locally on your machine | Install LM Studio, load a model |
| OpenAI | Cloud-hosted GPT-4, GPT-3.5, etc. | Get API key from OpenAI |
| Custom | Any OpenAI-compatible API | Provide endpoint URL |
Provider Configuration:
;; Local LM Studio (default)
(make-local-provider-config "model-name")
;; OpenAI
(make-openai-provider-config "sk-..." "gpt-4o-mini")
;; Custom OpenAI-compatible endpoint
(make-custom-provider-config "https://my-api.com/v1/chat/completions"
                             "api-key"
                             "model-name")

Using a provider:
;; Single chat completion
(llm-chat-completion-with-provider config messages state)
;; Interactive chat loop
(interactive-chat-loop-with-provider agent-state config state)

LLMs often generate Lisp code with unbalanced parentheses, even when the indentation is correct. The agent uses parinfer-rust to automatically fix these errors before execution:
;; LLM output (missing closing parens):
(defun factorial (n)
  (if (zp n)
      1
      (* n (factorial (1- n)
;; After parinfer fix:
(defun factorial (n)
  (if (zp n)
      1
      (* n (factorial (1- n)))))

Install parinfer-rust:
make install-parinfer # Installs Rust + parinfer-rust from GitHub
make test-parinfer # Verify installation

# Certify all books
cd src && cert.pl verified-agent.lisp
# Run MCP server tests
cd acl2-mcp && python -m pytest

# Via mcp-proxy for HTTP transport
mcp-proxy acl2-mcp --transport streamablehttp --port 8000 --pass-environment

- Verify the decision logic, not the world: ACL2 proves properties about how the agent decides, given any external responses
- FTY over STObj: Cleaner types, auto-generated theorems, easier reasoning
- MCP for external tools: Standard protocol for tool integration
- Keep verified core simple: Complex I/O in external driver, proofs in ACL2
- Fix LLM output with parinfer: Automatically correct unbalanced parens using indentation
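
The last principle, inferring parens from indentation, can be illustrated with a deliberately simplified Python model. It is not parinfer-rust's algorithm: it ignores strings, comments, and other bracket types, leaves unmatched closers in the middle of a line alone, and the name `infer_parens` is hypothetical. The idea it shows is that a line indented at or to the left of an open form closes that form.

```python
def infer_parens(source):
    """Re-derive trailing close-parens from indentation (simplified model)."""
    out = []      # repaired lines
    stack = []    # columns of the currently open '(' characters
    last = None   # index in `out` of the most recent non-blank line
    for raw in source.splitlines():
        # Discard trailing closers; they are re-inferred from indentation.
        body = raw.rstrip(") \t")
        if not body.strip():
            out.append("")
            continue
        indent = len(body) - len(body.lstrip(" "))
        # Close every form opened at or to the right of this indentation.
        while stack and stack[-1] >= indent:
            stack.pop()
            out[last] += ")"
        for col, ch in enumerate(body):
            if ch == "(":
                stack.append(col)
            elif ch == ")" and stack:
                stack.pop()
        out.append(body)
        last = len(out) - 1
    if last is not None:
        out[last] += ")" * len(stack)   # close whatever is still open
    return "\n".join(out)
```

Applied to the correctly indented but unterminated factorial definition from the parinfer section, this model appends the five missing closers to the last line and leaves the other lines unchanged.
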
BSD 3-Clause License. See LICENSE.
- Built with ACL2
- LLM integration via LM Studio
- MCP implementation using MCP Python SDK
- Paren fixing via parinfer-rust