scarolan · Jun 14, 2026
diff --git a/.env.example b/.env.example
@@ -3,22 +3,30 @@ SLACK_BOT_TOKEN=
 SLACK_APP_TOKEN=
 SLACK_BOT_USER_NAME=
 
-# OpenAI API key - required unless using a local LLM (LLM_API_BASE_URL)
-OPENAI_API_KEY=
+# Required for /image image generation (Gemini "Nano Banana 2") and for the
+# Gemini chat backend if you flip to it below.
+GEMINI_API_KEY=
 
-# Optional - Local LLM configuration (OpenAI-compatible API)
-# Set these to use a local model instead of OpenAI
-# LLM_API_BASE_URL=http://kepler.local:11434/v1
-# LLM_MODEL=gemma4:latest
+# --- Chat backend --------------------------------------------------------
+# Which LLM powers Data's conversation. "ollama" (default) talks to a local
+# Ollama instance; "gemini" uses the same GEMINI_API_KEY as image generation.
+# CHAT_BACKEND=ollama
+
+# Ollama settings (used when CHAT_BACKEND=ollama)
+# OLLAMA_HOST=http://localhost:11434
+# OLLAMA_MODEL=llama3.1
+# Surface model thinking traces — requires a reasoning model (qwq, gpt-oss,
+# deepseek-r1, etc.). Accepts true|low|medium|high. When set, Data's reply
+# is rendered with an italicized context block showing the trace.
+# OLLAMA_THINK=true
+
+# Gemini settings (used when CHAT_BACKEND=gemini)
+# GEMINI_CHAT_MODEL=gemini-3-flash-latest
 
-# Required for /image image generation (Gemini "Nano Banana 2")
-GEMINI_API_KEY=
 # Optional - override the default Gemini image model
 # GEMINI_IMAGE_MODEL=gemini-3.1-flash-image
 
-# Optional
+# --- Other ---------------------------------------------------------------
 BOT_PERSONALITY=
-THINKING_MESSAGE=
 REDIS_URL=redis://localhost:6379
 MEMORY_TTL_HOURS=24
-MEMORY_MAX_KEYS=10000
diff --git a/ARCHITECTURE.md b/ARCHITECTURE.md
@@ -5,43 +5,51 @@ This document explains the high-level architecture of the `data` Slack chatbot:
 ## Goals
 - Keep the bot simple and easy to understand for new developers.
 - Provide a single source of truth for message flows, storage, and third-party integrations.
-- Document where to change behavior safely (handlers, image generation, persistence).
+- Document where to change behavior safely (handlers, image generation, persistence, chat backends).
 
 ## Components
 
 - **`app.js`** — Bolt handler wiring + `start()`. Imports pure helpers from `lib/`; only boots the bot when run as the main module so the test suite can import it safely.
 - **`lib/responses.js`** — pure matchers and content builders for canned trigger words (love-you, pod-bay-door, danceparty, tiktok, rickroll, help text, Asimov rules, dad-joke fetch).
-- **`lib/chat.js`** — `handleMessage()` and `cleanLocalLlmResponse()`. Takes the ChatGPT client and parent-id Map via `deps`.
+- **`lib/chat.js`** — `handleMessage()`. Backend-agnostic. Reads/writes per-user message history in `convoStore`, hands the chat adapter a list of messages, returns the text reply.
+- **`lib/chat-backends.js`** — `makeOllamaChat()` and `makeGeminiChat()` factories. Each returns `{ async chat({ messages }) → { text } }`. Translation between the canonical message shape and each provider's native wire format lives here.
 - **`lib/image.js`** — `generateImage()`. Takes the Gemini client and model name via `deps`.
-- **`lib/deps.js`** — `buildDeps()` factory + `validateRequiredEnv()`. Constructs the Slack `App`, Keyv/Redis store, ChatGPTAPI client, and `GoogleGenAI` client; tests can override any of them.
+- **`lib/deps.js`** — `buildDeps()` factory + `validateRequiredEnv()`. Constructs the Slack `App`, the Keyv/Redis `convoStore`, the `GoogleGenAI` client, and the selected chat adapter; tests can override any of them.
 - **Slack (Bolt JS)** — receives events in Socket Mode and dispatches them to the handlers registered by `registerHandlers(deps)`.
-- **ChatGPT client (`chatgpt` ChatGPTAPI)** — conversational responses with a persistent message store. Transparently supports any OpenAI-compatible endpoint when `LLM_API_BASE_URL` is set (e.g. Ollama at `http://kepler.local:11434/v1`).
-- **Gemini (`@google/genai`)** — image generation, model `gemini-3.1-flash-image` ("Nano Banana 2") by default; overridable via `GEMINI_IMAGE_MODEL`.
-- **Persistence (Keyv + KeyvRedis)** — stores conversation parent-message-ids keyed by user, backed by Redis (`REDIS_URL`).
-- **Tests** — 28 tests under `test/` run with `node --test`; see `test/chat.test.js` and `test/image.test.js` for the deps-injection pattern.
+- **Ollama (`ollama` npm package)** — default chat backend. Native SDK; no OpenAI compat shim. Talks to `OLLAMA_HOST` (default `http://localhost:11434`). Room to extend with `think`, `tools`, `format`, vision.
+- **Gemini (`@google/genai`)** — image generation (`gemini-3.1-flash-image`, "Nano Banana 2") and optional chat backend (`gemini-3-flash-latest`). One client serves both.
+- **Persistence (Keyv + KeyvRedis)** — stores per-user conversation history (`{role, content}` arrays) keyed by `convo:<userId>`, backed by Redis (`REDIS_URL`). TTL via `MEMORY_TTL_HOURS`. History survives process restarts.
+- **Tests** — the suite under `test/` runs with `node --test`; see `test/chat.test.js` and `test/chat-backends.test.js` for the adapter + convoStore mock patterns.
 - **CI** — GitHub Actions matrix on Node 18/20/22: install, lint, tests, syntax-check of `app.js` + every `lib/*.js`, and a non-blocking `npm audit` at high severity.
 
 ## Message flows
 
 1. Incoming message arrives via Bolt Socket Mode.
 2. The general message handler (`app.message(...)`) checks pure matchers from `lib/responses.js` in order (love-you → pod-bay → danceparty → tiktok → rickroll). On a match it calls `say()` with the helper output and returns.
-3. If no canned match and the channel is a DM or MPIM, the handler posts a "thinking" indicator, calls `handleMessage(msg, { chat, parentIds, isLocalLlm })`, deletes the indicator, and replies with the result.
+3. If no canned match and the channel is a DM or MPIM, the handler posts a "thinking" indicator, calls `handleMessage(msg, { chat, convoStore })`, deletes the indicator, and replies with the result.
 4. Direct-mention handler (`app.message(directMention(), ...)`) checks for `help`, `the rules`, `dad joke`, and image-request guidance before falling through to `handleMessage()` with the same thinking UX.
 5. Slash command `/image`:
    - `ack()` immediately to avoid Slack timeouts.
    - Respond with an ephemeral progress message.
    - Schedule the heavy work via `queueMicrotask`: call `generateImage(prompt, { client: geminiClient, model })`, then upload the returned `Buffer` to the channel with `files.uploadV2()`.
 
+## Chat backend selection
+
+Set `CHAT_BACKEND=ollama` (default) or `CHAT_BACKEND=gemini`. `buildDeps()` constructs the matching adapter and binds the system message + model at construction time, so `handleMessage` stays backend-agnostic. Adding a third backend (e.g. Anthropic) takes a new factory in `lib/chat-backends.js` plus a branch in `buildDeps()`.
+
 ## Concurrency & UX
 - Image generation is deferred via `queueMicrotask` so the slash command's `ack()` returns immediately while the upload happens in the background.
 - A small thinking helper (`postThinking` / `clearThinking` in `app.js`) centralizes posting and deleting progress messages.
 
 ## Persistence & Conversation Context
-- The bot stores parent message ids per user in `Keyv` (backed by `KeyvRedis` when `REDIS_URL` is set) so follow-up messages stay in conversation context.
-- TTL and advisory `max` keys are configurable via `MEMORY_TTL_HOURS` and `MEMORY_MAX_KEYS`.
+- Per-user message history is stored in `Keyv` (backed by `KeyvRedis` when `REDIS_URL` is set) under `convo:<userId>` as a `[{role, content}, ...]` array.
+- Each turn appends a `user` message, then the `assistant` reply, and trims the array to `historyLimit` messages (default 20 messages = 10 turns).
+- TTL is configurable via `MEMORY_TTL_HOURS` (default 24).
+- Conversation state is no longer held in process memory, so restarts preserve every user's conversation.
 
 ## Error handling & observability
 - The app logs lifecycle events and errors with `console`. Consider a structured logger (pino/winston) for production.
+- `handleMessage` catches backend errors and distinguishes content/policy/safety/moderation errors (which return a "please rephrase" apology) from all other errors (which return a generic apology). Failed turns are not persisted into history.
 - `validateRequiredEnv()` runs at `start()` (not at module load), so missing env fails fast on boot without breaking `import` for tests.
 - Graceful shutdown handlers call `app.stop()` on `SIGINT`/`SIGTERM`/`uncaughtException`.
 
@@ -52,15 +60,16 @@ This document explains the high-level architecture of the `data` Slack chatbot:
 
 ## Extension points
 
-- **Conversation logic** — `lib/chat.js::handleMessage()`. Change preprocessing, response cleaning, or how the chat client is called.
+- **Conversation logic** — `lib/chat.js::handleMessage()`. Trim, preprocess, or change how history is built.
+- **New chat backend** — add a `make<Backend>Chat()` factory in `lib/chat-backends.js` returning `{ chat({ messages }) → { text } }`. Wire selection in `buildDeps()`. Add tests in `test/chat-backends.test.js`.
 - **New canned response** — add a matcher + content to `lib/responses.js`, write a unit test in `test/responses.test.js`, then add a call site inside `registerHandlers()` in `app.js`.
 - **New Slack slash command** — register inside `registerHandlers()` (`app.command('/your-command', ...)`) and add it to `manifest.yaml` (then re-sync the manifest to the Slack app and reinstall).
 - **Image processing** — `lib/image.js::generateImage()` encapsulates the Gemini call and is the single place to swap providers or add post-processing.
 - **New external client** — extend `buildDeps()` in `lib/deps.js` to construct (or accept an override for) the client, then pass it through `deps` to whichever helper needs it. Never instantiate clients at module scope — tests can't override module-level singletons.
 
 ## Local development & testing
 - Use `.env.example` to create a local `.env`.
-- `npm install` then `npm start` to run the bot locally (needs valid Slack and OpenAI/Gemini credentials).
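+- `npm install` then `npm start` to run the bot locally (needs valid Slack credentials, a reachable Ollama instance for the default chat backend, and a Gemini API key for `/image` or the Gemini chat backend).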
 - `npm test` runs the suite without requiring any env vars or network access.
 - `npm run lint` / `npm run format` for style.
 
@@ -72,3 +81,4 @@ This document explains the high-level architecture of the `data` Slack chatbot:
 - Replace `console` logging with a structured logger and optional remote export.
 - Consider an explicit Redis client passed into KeyvRedis to control connection lifecycle.
 - Cover the Bolt handlers themselves with integration tests (currently the canned-response *content* is well-tested but the registration glue in `registerHandlers()` is not).
+- Surface Ollama's native features (`think`, `tools`, `format`, vision) through the adapter interface as the need arises.
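
The adapter contract and per-user history handling described in the `ARCHITECTURE.md` changes above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in `lib/chat.js` or `lib/chat-backends.js`: `makeEchoChat`, `appendTurn`, and `handleMessageSketch` are hypothetical names, the apology string is invented, and a plain `Map` stands in for the Keyv-backed `convoStore`. The sketch assumes trimming keeps the newest messages, which the diff does not state explicitly.

```javascript
// Hypothetical sketch: shapes follow the ARCHITECTURE.md text, bodies are
// illustrative assumptions rather than the repository's implementation.

const HISTORY_LIMIT = 20; // documented default: 20 messages = 10 turns

// Stand-in adapter with the documented shape: chat({ messages }) -> { text }.
// A real factory would call Ollama or Gemini here instead of echoing.
function makeEchoChat() {
  return {
    async chat({ messages }) {
      const last = messages[messages.length - 1];
      return { text: `echo: ${last.content}` };
    },
  };
}

// Append one user/assistant turn, then keep only the newest `limit` messages
// (assumption: the oldest messages are the ones dropped).
function appendTurn(history, userText, assistantText, limit = HISTORY_LIMIT) {
  return [
    ...history,
    { role: 'user', content: userText },
    { role: 'assistant', content: assistantText },
  ].slice(-limit);
}

// Load history, call the adapter, and persist only when the call succeeds,
// so a failed turn never lands in stored history.
async function handleMessageSketch(userId, text, deps) {
  const { chat, convoStore, historyLimit = HISTORY_LIMIT } = deps;
  const key = `convo:${userId}`;
  const history = (await convoStore.get(key)) ?? [];
  try {
    const messages = [...history, { role: 'user', content: text }];
    const reply = await chat.chat({ messages });
    await convoStore.set(key, appendTurn(history, text, reply.text, historyLimit));
    return reply.text;
  } catch {
    // Placeholder apology; the real handler distinguishes error kinds.
    return 'Sorry, something went wrong.';
  }
}
```

Because the adapter is injected through `deps`, a test can pass a stub such as `makeEchoChat()` and a `Map` without touching the network or Redis, which matches the deps-injection pattern the document describes.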