Glean is a local-first knowledge engine with a Rust core. This repository is still early-stage; the current milestone focuses on a working Cargo workspace and an MCP-compatible stdio server (glean mcp).
| Crate | Role |
|---|---|
| packages/core | Indexing engine: LanceDB, SQLite shadow, GleanEngine, pipeline |
| packages/host | Host runtime: daemon loop, MCP router, config editor, status |
| apps/cli | Single glean binary: Clap, stdio MCP transport, glean daemon, glean index |
| apps/desktop | Tauri 2 + Vite/React desktop UI; read-only search via glean-host, indexing via sidecar glean daemon |
Prerequisites: Rust stable 1.91+ (rustup), cargo, and protoc (Protocol Buffers compiler - required by LanceDB's Rust dependency chain). On macOS: brew install protobuf; on Debian/Ubuntu: sudo apt-get install protobuf-compiler.
Full workspace builds (cargo test --workspace, cargo clippy --workspace) also compile glean-desktop (Tauri). On Linux, install WebKitGTK / related dev packages so pkg-config finds glib-2.0 (see the apt-get install list in .github/workflows/rust.yml).
cargo build -p glean-cli --release

The binary is emitted as target/release/glean.
See apps/desktop/README.md. Releases: run pnpm tag on main, then git push origin main and git push origin vX.Y.Z — CI builds macOS Apple Silicon (.dmg), Windows NSIS (.exe), and standalone glean-* CLI binaries. See ops notes (local).
Quick start:
cargo build -p glean-cli # sidecar binary at target/debug/glean
pnpm install
pnpm --filter @glean/desktop tauri dev

Run the MCP server on stdin/stdout:
glean mcp

The server reads newline-delimited JSON: each line must be a full JSON-RPC 2.0 object (jsonrpc, method, and for requests an id). Typing plain text such as initialize is not valid JSON, so you will see Parse error (-32700) on stdout.
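As a minimal sketch of the framing described above, the helper below builds one request per line. The name `request_line` is hypothetical and not part of Glean; it assumes the method name needs no JSON escaping and that the params string is already valid JSON.

```rust
/// Build one newline-terminated JSON-RPC 2.0 request line.
/// Assumption: `method` contains no quotes or backslashes, and
/// `params_json` (when given) is already valid JSON such as "{}".
fn request_line(id: u64, method: &str, params_json: Option<&str>) -> String {
    let mut line = format!(r#"{{"jsonrpc":"2.0","id":{},"method":"{}""#, id, method);
    if let Some(params) = params_json {
        line.push_str(r#","params":"#);
        line.push_str(params);
    }
    // Close the object and terminate the line: one object per line.
    line.push_str("}\n");
    line
}
```

Calling `request_line(1, "initialize", Some("{}"))` yields a line suitable for piping into `glean mcp`.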
INFO lines from lance::dataset_events on stderr while the process starts are normal: the engine opens the Lance dataset before the read loop; they are not protocol traffic.
One-shot check (one request, then EOF):
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' | ./target/release/glean mcp

Expect a single JSON line on stdout with a result payload (capabilities + serverInfo). Two requests on two lines:
printf '%s\n%s\n' \
'{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}' \
'{"jsonrpc":"2.0","id":2,"method":"tools/list"}' \
| ./target/release/glean mcp

MCP behaviour is awkward to poke interactively because stdin needs JSON lines. Use the bundled tests instead:
- Fast in-process tests (handle_json_line):
  cargo test -p glean-host mcp::router
- Real subprocess + temp storage (CARGO_BIN_EXE_glean, stdio framing):
  cargo test -p glean-cli --test mcp_subprocess
The router tests cover initialize, invalid JSON / plaintext lines, unknown methods, tools/list, and tools/call/search_semantic/search_files round-trips with indexed content.
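For scripted checks outside the test suite, the one-shot pipe shown earlier can be reproduced from Rust. This is a sketch using only the standard library, not the code used by the bundled subprocess test; the name `one_shot` is hypothetical. It discards stderr, where the Lance INFO lines appear, and returns the first stdout line.

```rust
use std::io::Write;
use std::process::{Command, Stdio};

/// Send a single request line to a child process, close stdin (EOF),
/// and return the first line the child prints on stdout.
fn one_shot(bin: &str, args: &[&str], request: &str) -> std::io::Result<String> {
    let mut child = Command::new(bin)
        .args(args)
        .stdin(Stdio::piped())
        .stdout(Stdio::piped())
        .stderr(Stdio::null()) // startup logs are not protocol traffic
        .spawn()?;
    {
        let mut stdin = child.stdin.take().expect("stdin was piped");
        stdin.write_all(request.as_bytes())?;
        stdin.write_all(b"\n")?;
    } // stdin is dropped here, so the child sees EOF
    let output = child.wait_with_output()?;
    let stdout = String::from_utf8_lossy(&output.stdout);
    Ok(stdout.lines().next().unwrap_or("").to_string())
}
```

A call such as `one_shot("./target/release/glean", &["mcp"], r#"{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}"#)` would then return the response line for inspection.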
Use command + args form when your client splits arguments:
- Command: /absolute/path/to/target/release/glean
- Args: mcp
Some UIs accept a single command string (/path/to/glean mcp); use whichever your editor supports.
This server speaks newline-delimited JSON-RPC 2.0 with MCP-shaped methods:
- initialize
- tools/list: search_semantic, search_files, read_file_context, get_recent_changes, get_graph_topology
- tools/call: search_semantic (hybrid + optional rerank; needs content_state=ready), search_files (SQLite fs_entries; partial results during initial walk; no embedder), read_file_context, get_recent_changes (discovered files by mtime_ns, not vector-gated), get_graph_topology (Lite Graph after content ingest).

MCP does not run the indexer; use glean daemon.
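A tools/call request can be framed the same way as any other line. The sketch below assumes Glean accepts the standard MCP envelope (`params.name` plus `params.arguments`); the helper name `tools_call_line` and the `query` argument are hypothetical, since the tool argument schemas are not listed here.

```rust
/// Build a newline-terminated tools/call request line.
/// Assumption: `arguments_json` is already valid JSON and `tool`
/// needs no JSON escaping.
fn tools_call_line(id: u64, tool: &str, arguments_json: &str) -> String {
    format!(
        r#"{{"jsonrpc":"2.0","id":{},"method":"tools/call","params":{{"name":"{}","arguments":{}}}}}"#,
        id, tool, arguments_json
    ) + "\n"
}
```

For example, `tools_call_line(3, "search_files", r#"{"query":"README"}"#)` produces one line addressed to search_files; check the tools/list output for the real argument names.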
Environment:
- GLEAN_STORAGE_ROOT: global home, holding config.toml, cache/embedding/ (FastEmbed model.onnx + tokenizer downloads), cache/reranker/, and logs/ (defaults to ~/.glean).
- GLEAN_WORKSPACE_ROOT: workspace root for MCP / daemon (optional; defaults to cwd). Per-project index lives at <workspace>/.glean/ (metadata/index.db, vectors/ only; no config file there).
- GLEAN_LOG: optional filter for tracing on stderr and rolling files (same syntax as tracing_subscriber::EnvFilter, e.g. info,glean_core=debug). If unset: MCP / glean status use info on both stderr and rolling files; glean daemon uses info for rolling files and warn on stderr. The glean binary does not read RUST_LOG.
- Runtime TOML (optional): merged from $GLEAN_STORAGE_ROOT/config.toml only (defaults + global). [indexing].watch_interval is in seconds; 0 = daemon initial sync only. Nested Git repositories under a workspace are discovered but not auto-indexed (use Desktop Start indexing or append to metadata/index_requests.jsonl). See .docs/04-Ops-Security/local-storage-model.md for multi-workspace index layout. glean config list/init/set operate on global config; glean models pull rerank pre-downloads the BGE cache under global storage.
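The defaults listed above can be summarised in a short sketch. It is not Glean's actual resolution code; the function names are hypothetical and only the paths and levels stated in the list are encoded.

```rust
use std::path::{Path, PathBuf};

/// GLEAN_STORAGE_ROOT if set, otherwise ~/.glean.
fn storage_root(env_value: Option<&str>, home: &Path) -> PathBuf {
    match env_value {
        Some(v) => PathBuf::from(v),
        None => home.join(".glean"),
    }
}

/// GLEAN_WORKSPACE_ROOT if set, otherwise the current directory.
fn workspace_root(env_value: Option<&str>, cwd: &Path) -> PathBuf {
    match env_value {
        Some(v) => PathBuf::from(v),
        None => cwd.to_path_buf(),
    }
}

/// Per-project SQLite index under <workspace>/.glean/.
fn index_db(workspace: &Path) -> PathBuf {
    workspace.join(".glean").join("metadata").join("index.db")
}

/// Default (stderr, rolling file) levels when GLEAN_LOG is unset.
/// Commands not covered by the list above return None.
fn log_defaults(command: &str) -> Option<(&'static str, &'static str)> {
    match command {
        "mcp" | "status" => Some(("info", "info")),
        "daemon" => Some(("warn", "info")),
        _ => None,
    }
}
```

In a real program the `env_value` arguments would come from `std::env::var("GLEAN_STORAGE_ROOT").ok()` and its workspace counterpart.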
Rolling logs also land under {GLEAN_STORAGE_ROOT}/logs/ (cli.yyyy-mm-dd / daemon.yyyy-mm-dd). Do not print diagnostics to stdout while running glean mcp. For quick inspection from a terminal, run glean logs (-n line count, --source cli|daemon|all); it does not install the tracing subscriber.
Chunks are embedded with FastEmbed using AllMiniLM-L6-v2 (384-dimensional Float32 vectors) and stored in LanceDB document_chunks (see .docs/02-Developer-Guide/lancedb-schema.md). The ONNX artifacts download on first use into $GLEAN_STORAGE_ROOT/cache/embedding/ (not the process working directory — required for packaged desktop apps).
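For a rough sense of storage cost, the arithmetic for 384-dimensional Float32 vectors can be sketched as below. The helper names are hypothetical, and the figure covers raw vector payload only; it ignores LanceDB's other columns, indexes, and file overhead.

```rust
/// Embedding width stated above for AllMiniLM-L6-v2.
const EMBEDDING_DIM: usize = 384;

/// Raw bytes for one Float32 embedding: 384 * 4 = 1536.
fn vector_bytes() -> usize {
    EMBEDDING_DIM * std::mem::size_of::<f32>()
}

/// Raw vector payload for a given number of chunks.
fn raw_vector_bytes(chunks: usize) -> usize {
    chunks * vector_bytes()
}
```

By this estimate, 100,000 chunks carry about 153.6 MB of raw vector data.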
If you upgrade Glean and see LanceDB schema mismatch, stop running processes, delete <workspace>/.glean/vectors (or the whole .glean folder), then run glean daemon again to reindex that workspace.
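The recovery step above can also be scripted. The following is a sketch rather than a Glean command: `reset_vectors` is a hypothetical helper that removes only `<workspace>/.glean/vectors` and leaves the rest of `.glean` in place. Stop any running glean processes before calling it.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Delete <workspace>/.glean/vectors if present.
/// Returns Ok(true) when a directory was removed, Ok(false) when
/// there was nothing to remove.
fn reset_vectors(workspace: &Path) -> io::Result<bool> {
    let vectors = workspace.join(".glean").join("vectors");
    if vectors.is_dir() {
        fs::remove_dir_all(&vectors)?;
        Ok(true)
    } else {
        Ok(false)
    }
}
```

After it returns, running glean daemon again reindexes that workspace as described above.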
Manual index: glean index enqueues a walk when the daemon is already running, or runs a one-shot drain when it is not (--wait polls until idle).
chmod +x scripts/verify_rust.sh
./scripts/verify_rust.sh

This repo may ignore .cursor/ for open-source hygiene. If you want automatic verification after an Agent completes a turn, configure a user-level Cursor hook (for example on stop) to run scripts/verify_rust.sh.
Treat .github/workflows/rust.yml as the shared PR / main CI gate (fmt, clippy, tests, excludes glean-desktop on Linux). Desktop releases: local pnpm tag then push tag → release-desktop.yml; see .docs/04-Ops-Security/desktop-release.md.
- Optional Cursor Hooks (e.g. on stop) pointing at scripts/verify_rust.sh are a local productivity aid. They are not required for correctness.
- rust.yml: PR + non-release pushes to main; --exclude glean-desktop (faster than full workspace).
- scripts/verify_rust.sh: local full workspace including desktop/Tauri sidecar prep.
- Releases: download .dmg, Windows NSIS setup .exe, and glean-* CLI binaries from GitHub Releases, rather than only the auto-generated Source code zip.
Licensed under the Apache License, Version 2.0. See LICENSE.