Wisemonkey is a simple, open, and hackable AI agent for the Linux and macOS terminal. It connects to any service providing an OpenAI, Anthropic, or Ollama-compatible endpoint. It features session management, persistent memory management, a vector store for document embedding, native and MCP tools, skills, and much more.
The sections of this document are:
- Quickstart
- Run from source
- Configuration
- Usage and commands
- Global memory
- Rolling chat memory
- Extend agent
Wisemonkey has been tested on Linux and macOS. It requires:
- Python 3.13+
- uv for dependency management
Install the agent with:
curl -fsSL https://codeberg.org/langurmonkey/wisemonkey/raw/branch/master/install.sh | bash
Run the agent with the default session:
wisemonkey
If you need an API key to access the endpoint, put it in the .env file. Wisemonkey looks for the .env file in the following locations, in order:
- Current directory, ./.env
- Config directory, $XDG_CONFIG_HOME/wisemonkey/.env
- Home directory, $HOME/.env
Create the .env file with the API key for your provider. Each command below overwrites .env, so run only the one you need:
echo "OPENAI_API_KEY=your-api-key-here" > .env
echo "ANTHROPIC_API_KEY=your-api-key-here" > .env
echo "OLLAMA_API_KEY=your-api-key-here" > .env
The agent uses python-dotenv to load .env at startup. The openai package reads OPENAI_API_KEY from the environment automatically. You can also set OPENAI_API_KEY in your shell profile. The same goes for ANTHROPIC_API_KEY and OLLAMA_API_KEY.
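The lookup order described above can be sketched in a few lines of Python. This is only an illustration, not Wisemonkey's actual code: the function name `find_env_file` is hypothetical, and the fallback to `~/.config` when `XDG_CONFIG_HOME` is unset is an assumption based on the XDG default.

```python
import os
from pathlib import Path


def find_env_file(cwd=None, environ=None):
    """Return the first existing .env file in the documented order, or None."""
    environ = os.environ if environ is None else environ
    cwd = Path.cwd() if cwd is None else Path(cwd)
    home = Path(environ.get("HOME") or Path.home())
    # Assumption: fall back to ~/.config when XDG_CONFIG_HOME is unset,
    # which is the default named by the XDG Base Directory specification.
    config_home = Path(environ.get("XDG_CONFIG_HOME") or home / ".config")

    candidates = [
        cwd / ".env",                         # 1. current directory
        config_home / "wisemonkey" / ".env",  # 2. config directory
        home / ".env",                        # 3. home directory
    ]
    for candidate in candidates:
        if candidate.is_file():
            return candidate
    return None
```

The first match wins, so a project-local .env takes precedence over one in the config or home directory.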
# Clone the repo, then build the project:
uv build
# Set API key:
export OPENAI_API_KEY=your-api-key
# Run the agent:
uv run wisemonkey
On first run, the configuration is created in $XDG_CONFIG_HOME/wisemonkey/config.yaml.
Wisemonkey works with any OpenAI-compatible endpoint, such as LM Studio, Ollama, OpenWebUI, or any other service you configure. Here are the default values:
# Wisemonkey Configuration
model:
  # openai, anthropic, ollama, lmstudio, or generic
  provider: generic
  # Model name
  name: qwen/qwen3.6-35b-a3b
  # URL of OpenAI endpoint
  base_url: http://127.0.0.1:1234/v1
  # Temperature setting for inference
  temperature: 0.8
  # The reasoning effort. 'none' to disable reasoning
  reasoning_effort: medium
  # Show the model internal thinking
  reasoning_visible: False
embedding:
  # Embedding model name
  name: qwen/qwen3-embedding-0.6b-gguf
  # URL of the OpenAI endpoint for embeddings
  base_url: http://127.0.0.1:1234/v1
agent:
  max_turns: 50
  system_prompt: You are a helpful assistant, expert in many domains of science and engineering. Respond concisely and clearly. No fluff. Ask for clarification if needed. Do not invent. On first interaction, analyze the user's message for their name, role, interests, and preferences. Record them with set_user_profile.
  # Display formatted output at the end of generation
  markdown: false
  # Length of chat history kept for context, in characters
  max_chat_history: 128000
  # Enable vi mode input
  vi_mode: false
Wisemonkey also supports MCP. Use the following commands to manage the MCP integration:
- /mcp: Show the current MCP configuration
- /mcp edit: Edit the MCP configuration file (~/.config/wisemonkey/mcp.json)
- /mcp tools: List all MCP tools available. Alias: /tools mcp
MCP servers are started when the agent boots. You need to restart the agent if you add new servers.
Run the agent, and then you can enter your prompt. You can use the following key bindings during input:
- Alt + Enter: add a new line
- Enter: submit the prompt
- Ctrl + q: quit
During inference, you can cancel the turn and return to the input prompt with Ctrl + c.
Internally, Wisemonkey uses sessions to separate different memory histories. Sessions are named by the user. By default, the agent uses the default session. You can start in a different session (either create a new one, or restore it if it exists) with the --session argument:
# Start in a specific session
wisemonkey --session my-project
The default session's name is default, so the following two commands are equivalent:
# These two commands start the default session
wisemonkey
wisemonkey --session default
You can also list the existing sessions with --ls:
# List sessions
wisemonkey --ls
Sessions:
- my-project - ~/.local/share/wisemonkey/sessions/my-project
- default - ~/.local/share/wisemonkey/sessions/default
Sessions contain:
- The input history
- Chat memory (see chat memory)
- Vector store (see document embedding)
- Notes (see session memory)
- User profile (see session memory)
For now, the configuration file is the same for all sessions.
Sessions are matched by the directory name in the sessions location (~/.local/share/wisemonkey/sessions). You can rename a session by just renaming the directory!
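Because a session is just a directory, listing and renaming sessions reduce to plain file system operations. The sketch below illustrates that idea. The helper names `list_sessions` and `rename_session` are hypothetical and not part of Wisemonkey; only the sessions location comes from the text above.

```python
from pathlib import Path

# Location taken from the session listing above.
SESSIONS_DIR = Path("~/.local/share/wisemonkey/sessions")


def list_sessions(sessions_dir=SESSIONS_DIR):
    """Return the session names: one per subdirectory of sessions_dir."""
    root = Path(sessions_dir).expanduser()
    if not root.is_dir():
        return []
    return sorted(entry.name for entry in root.iterdir() if entry.is_dir())


def rename_session(old, new, sessions_dir=SESSIONS_DIR):
    """Rename a session by renaming its directory."""
    root = Path(sessions_dir).expanduser()
    source, target = root / old, root / new
    if not source.is_dir():
        raise FileNotFoundError(f"No such session: {old}")
    if target.exists():
        raise FileExistsError(f"Session already exists: {new}")
    return source.rename(target)
```

Renaming while the agent is running in that session is probably best avoided, since the agent may still write to the old path.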
You can enable vi mode for the current session with the command /vi on, or permanently in the configuration.
External editor: in vi mode, exit INSERT mode (Esc), then press v to edit your prompt in an external editor (uses your $VISUAL or $EDITOR variable).
There are a few commands available to use in the agent loop. You can list them with /help. Also, use /[command-name] help (e.g. /config help) to show additional help for a command.
Persistent memory follows the XDG Base Directory specification and is stored in ~/.local/share/wisemonkey/session/$SESSION_NAME:
- user_profile.json: User information
- notes.json: Persistent notes (added via the save_note tool)
Lifecycle:
- Memory is loaded into the system prompt each turn
- The save_note tool adds notes during a session
- The save_memory tool explicitly persists memory to disk
- Memory is auto-saved when the agent exits (interactive mode)
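The lifecycle above (load, add notes in memory, persist on demand or at exit) could look roughly like the following. This is a sketch under assumptions: the class `SessionMemory` is hypothetical, and the JSON layouts (a flat object for the profile, a list of strings for the notes) are guesses, since the actual file formats are not described here.

```python
import json
from pathlib import Path


class SessionMemory:
    """Hypothetical in-memory view of user_profile.json and notes.json."""

    def __init__(self, session_dir):
        self.session_dir = Path(session_dir)
        self.profile = {}  # assumed layout: flat JSON object
        self.notes = []    # assumed layout: JSON list of strings

    def load(self):
        """Read both files if they exist; missing files mean empty memory."""
        profile_path = self.session_dir / "user_profile.json"
        notes_path = self.session_dir / "notes.json"
        if profile_path.is_file():
            self.profile = json.loads(profile_path.read_text(encoding="utf-8"))
        if notes_path.is_file():
            self.notes = json.loads(notes_path.read_text(encoding="utf-8"))

    def save_note(self, text):
        """Add a note in memory only; it reaches disk on the next save()."""
        self.notes.append(text)

    def save(self):
        """Persist both files, as save_memory or an exit hook might do."""
        self.session_dir.mkdir(parents=True, exist_ok=True)
        (self.session_dir / "user_profile.json").write_text(
            json.dumps(self.profile, indent=2), encoding="utf-8")
        (self.session_dir / "notes.json").write_text(
            json.dumps(self.notes, indent=2), encoding="utf-8")

    def as_prompt(self):
        """Render memory as text that could be appended to the system prompt."""
        lines = ["User profile:"]
        lines += [f"- {key}: {value}" for key, value in self.profile.items()]
        lines.append("Notes:")
        lines += [f"- {note}" for note in self.notes]
        return "\n".join(lines)
```

An exit hook such as `atexit.register(memory.save)` would be one way to get the auto-save behavior in interactive mode.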
Wisemonkey can embed documents into a per-session vector store, allowing the agent to search and reference their contents during conversation. Use /embed to add a document:
/embed ~/documents/research_paper.pdf
/embed ./notes.md
The agent uses the search_knowledge tool to query embedded documents when answering questions about previously indexed files. Supported formats include PDF, Markdown, and plain text. Embeddings are powered by the configured embedding model and stored in the session directory under vectordb/.
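To make the embed-then-search idea concrete, here is a toy vector store in pure Python. It is not how Wisemonkey stores or queries embeddings: `ToyVectorStore` is hypothetical, the bag-of-words `embed` function merely stands in for the configured embedding model, and the fixed-size character chunking is an assumption.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    def __init__(self):
        self.entries = []  # (source, chunk, vector) triples

    def add_document(self, source, text, chunk_size=200):
        """Split text into fixed-size chunks and embed each one."""
        for start in range(0, len(text), chunk_size):
            chunk = text[start:start + chunk_size]
            self.entries.append((source, chunk, embed(chunk)))

    def search(self, query, top_k=3):
        """Return the top_k (score, source, chunk) triples for the query."""
        query_vector = embed(query)
        scored = [(cosine(query_vector, vector), source, chunk)
                  for source, chunk, vector in self.entries]
        scored.sort(key=lambda item: item[0], reverse=True)
        return scored[:top_k]
```

A tool like search_knowledge would then return the best-matching chunks to the model, which can quote or summarize them in its answer.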
In addition to persistent memory, the agent maintains a chat history of recent user input and assistant output pairs. This provides context that survives beyond the LLM's context window. Here is how it works:
- Each user message and assistant response is stored in memory
- Reasoning is omitted from chat memory
- Automatically compacted when exceeding the configured character limit
- The user can trigger the compaction any time with /memory compact
- Chat memory is attached to the system prompt on each turn
- The agent displays the last 10 exchanges, with long messages truncated
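The character-limit behavior described above can be illustrated with a short sketch. Note the assumption: this version compacts by dropping the oldest exchanges, which is only one possible strategy. Wisemonkey's actual compaction may work differently (for example, by summarizing older exchanges), and the function names here are hypothetical.

```python
def history_size(history):
    """Total characters across all stored user and assistant messages."""
    return sum(len(user) + len(assistant) for user, assistant in history)


def compact(history, max_chars):
    """Drop the oldest exchanges until the history fits in max_chars.

    Assumption: oldest-first truncation. The most recent exchange is
    always kept, even if it alone exceeds the limit.
    """
    compacted = list(history)
    while len(compacted) > 1 and history_size(compacted) > max_chars:
        compacted.pop(0)
    return compacted


def add_exchange(history, user, assistant, max_chars=128000):
    """Append a (user, assistant) pair, then compact if over the limit.

    Only the final assistant text is stored; reasoning is left out.
    """
    return compact(history + [(user, assistant)], max_chars)
```

The default of 128000 mirrors the max_chat_history value from the configuration.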
Persistence:
- Chat history is persisted to ~/.local/share/wisemonkey/session/$SESSION_NAME/chat_history.json
- Automatically loaded on startup
- Saved after every exchange (user input or assistant response)
- Compacted history is also persisted to disk
Configuration:
agent:
  max_chat_history: 128000 # Maximum history characters to keep for context
Wisemonkey is built to be modular and hackable. Here is an overview of the main parts and their mapping to the file system.
wisemonkey/
├── agent/ # Core agent code.
│ ├── agent.py # Main agent loop, prompt handling, key bindings.
│ ├── commands.py # Slash commands (e.g. /embed, /quit).
│ ├── config.py # Configuration loading and handling.
│ ├── console.py # Rich console output with themed formatting.
│ ├── core.py # Core agent functions, like API connection and tool calls.
│ ├── mcp.py # MCP server support.
│ ├── memory.py # Session memory, paste file creation.
│ ├── router.py # API router implementation for OpenAI, Ollama, and Anthropic.
│ ├── skills.py # Skill loading and management.
│ ├── tools.py # Tool definitions.
│ ├── utils.py # Utility functions.
│ └── vectorstore.py # Vector store wrapper.
├── tools/ # Tool implementations available to the model.
│ ├── basic.py # Basic and example tools.
│ ├── files.py # File read/write tools.
│ ├── memory.py # search_knowledge tool.
│ ├── network.py # URL fetching.
│ ├── terminal.py # Shell command execution.
│ └── vectorstore.py # Vector store tool handler.
├── skills/ # Skill definitions. Add new skills here.
│ ├── example.md
│ └── rolldice.md
├── config.yaml # Default config file.
├── README.md
├── pyproject.toml
├── install.sh # Installer script.
└── .env.example
This agent is simple enough that it can be easily customized and extended by adding new tools, commands, and skills.
If you create a cool new tool, skill, or slash command, consider contributing it via a merge request!
Create a file in tools/ or use one of the existing ones. To create a tool, write a function and decorate it with @tool(name, description, parameters):
from agent.tools import tool
@tool(
    name="my_tool",
    description="Does something useful. Be exhaustive here, as it is what the LLM will read to know about your tool.",
    parameters={
        "type": "object",
        "properties": {
            "input": {
                "type": "string",
                "description": "The input parameter.",
            }
        },
        "required": ["input"],
    },
)
def my_handler(args):
    # args is a dict keyed by parameter name; fall back to a default if missing.
    value = args.get("input", "no input provided")
    return {"result": f"{value}"}
Tools are auto-discovered on startup.
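For readers curious how a decorator like this can register tools, here is a self-contained sketch of the general pattern. It is not the code in agent/tools.py: the `TOOLS` registry, the `call_tool` dispatcher, and the `shout` example tool are all hypothetical, and only the decorator's keyword arguments mirror the example above.

```python
TOOLS = {}  # hypothetical registry: tool name -> (schema, handler)


def tool(name, description, parameters):
    """Register the decorated function as a tool under the given name."""
    def decorator(handler):
        schema = {"name": name, "description": description, "parameters": parameters}
        TOOLS[name] = (schema, handler)
        return handler
    return decorator


def call_tool(name, args):
    """Dispatch a model-issued tool call to its handler."""
    if name not in TOOLS:
        return {"error": f"Unknown tool: {name}"}
    schema, handler = TOOLS[name]
    # Check the arguments listed as required in the JSON schema.
    required = schema["parameters"].get("required", [])
    missing = [key for key in required if key not in args]
    if missing:
        return {"error": f"Missing required arguments: {', '.join(missing)}"}
    return handler(args)


@tool(
    name="shout",
    description="Upper-cases the input text.",
    parameters={
        "type": "object",
        "properties": {"text": {"type": "string", "description": "Text to upper-case."}},
        "required": ["text"],
    },
)
def shout(args):
    return {"result": args["text"].upper()}
```

With this pattern, registration happens as a side effect of importing the module, so auto-discovery can amount to importing every module found in tools/.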
The process is very similar to tools. Create your function, preferably in agent/commands.py, and decorate it with @cmd(name, description, aliases, examples, can_complete).
A slash command must return, in this order, ok: bool, msg: str, content: str, markdown: str:
- ok: a bool indicating whether the command succeeded or failed.
- msg: an optional short status message, printed with OK or ERROR.
- content: an optional str with Python Rich-formatted content, printed to the output.
- markdown: an optional str formatted in Markdown, printed to the output.
@cmd(
    "/my-command",
    "This is the description",
    aliases=["/mycmd"],
)
def _cmd_my_command(agent, params) -> tuple[bool, str, str | None, str | None]:
    """This command returns a message but no content."""
    return True, "This is awesome!", None, None
Decorated commands are automatically registered and auto-completed in the input prompt.
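The sketch below shows one way the four-part return value and the aliases could be consumed by a dispatcher. Everything in it is an assumption for illustration: the `COMMANDS` registry, the `run_command` helper, the default values in the `cmd` signature, and the plain-text rendering are hypothetical, and the real agent prints through Rich instead of returning a string.

```python
COMMANDS = {}  # hypothetical registry: command name or alias -> handler


def cmd(name, description, aliases=(), examples=(), can_complete=True):
    """Register the decorated handler under its name and every alias."""
    def decorator(handler):
        handler.description = description
        for key in (name, *aliases):
            COMMANDS[key] = handler
        return handler
    return decorator


def run_command(agent, line):
    """Look up the command in line, call it, and format its four-part result."""
    name, _, params = line.strip().partition(" ")
    handler = COMMANDS.get(name)
    if handler is None:
        return f"ERROR Unknown command: {name}"
    ok, msg, content, markdown = handler(agent, params)
    parts = []
    if msg:
        parts.append(f"{'OK' if ok else 'ERROR'} {msg}")
    # content and markdown are optional; skip them when they are None or empty.
    parts += [text for text in (content, markdown) if text]
    return "\n".join(parts)


@cmd("/my-command", "This is the description", aliases=["/mycmd"])
def _cmd_my_command(agent, params):
    return True, "This is awesome!", None, None
```

Calling `run_command(None, "/mycmd")` resolves the alias to the same handler as `/my-command`.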
Add a .md file in skills/ with YAML front matter, following the agentskills.io standard:
---
name: my-skill
description: What this skill does
---
# My skill
## When to use
...
## Steps
1. ...
The front matter name and description are parsed and shown in the skills list. The body is injected into the system prompt.
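Splitting a skill file into its front matter and body can be sketched as follows. This is a simplified illustration, not the parser in agent/skills.py: `parse_skill` is a hypothetical name, and it handles only flat `key: value` lines, which covers name and description. A full YAML parser would be needed for anything richer.

```python
def parse_skill(text):
    """Split a skill file into front matter fields and the Markdown body."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("Skill file must start with a '---' front matter block")
    try:
        # Index of the line that closes the front matter block.
        end = next(i for i, line in enumerate(lines[1:], start=1)
                   if line.strip() == "---")
    except StopIteration:
        raise ValueError("Front matter block is not closed") from None

    meta = {}
    for line in lines[1:end]:
        if ":" in line:
            key, _, value = line.partition(":")
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

The returned `meta` would feed the skills list, while `body` is the part that ends up in the system prompt.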