Skip to content

feat: semantic search — pgvector vector(1024) + HNSW, Ollama default#3

Merged
jp-cruz merged 1 commit into
mainfrom
feat/pgvector-semantic-search
Jul 2, 2026
Merged

feat: semantic search — pgvector vector(1024) + HNSW, Ollama default#3
jp-cruz merged 1 commit into
mainfrom
feat/pgvector-semantic-search

Conversation

@jp-cruz

@jp-cruz jp-cruz commented Jul 2, 2026

Copy link
Copy Markdown
Contributor

Implements the 1024 index standard decided today (arctic-embed2 native / Qwen3 MRL ceiling / OpenAI-truncatable; model churn = re-embed job, never a schema migration).

  • alembic 004: embedding JSONB → vector(1024) + HNSW cosine index (empty-table guarded)
  • search_memory mode semantic (now the tool-schema default): cosine ranking, per-hit distance
  • Ollama provider implemented (/api/embed) and promoted to default — local-first; default model snowflake-arctic-embed2; dims map + explicit override for unknown models
  • OpenAI truncates to 1024 via the dimensions param (matryoshka) — no more 1536 anywhere
  • Write-path dimension guard — the index can never silently mix dimensions (nomic/768 now refused with a pointed error)

Live-verified against pgvector/pg17: all 4 migrations, semantic ranking correctness (deterministic stub vectors: exact match at distance ≈0, ordered), full prior integration suite. 131 unit + 5 live integration tests.

Deploy note: after merge, ollama pull snowflake-arctic-embed2 on the Mini (tags list is currently empty) and run alembic upgrade head before first write.

🤖 Generated with Claude Code

…provider

Index standard is 1024 by design: arctic-embed2 native, Qwen3-Embedding
MRL ceiling, OpenAI truncatable via the dimensions param. Model swaps
are re-embedding jobs (provenance columns exist for that), never schema
migrations. 1536 is deliberately rejected — it is OpenAI-shaped, no
local model emits it.

- alembic 004: embedding JSONB -> vector(1024) NOT NULL + HNSW cosine
  index (guarded: refuses on non-empty table)
- search_memory mode=semantic: query embedding + '<=>' cosine ranking,
  per-hit distance in results; fts mode unchanged
- OllamaProvider implemented (/api/embed) and made the DEFAULT provider
  (local-first); default model snowflake-arctic-embed2; built-in dims
  map + SCOPED_MCP_EMBEDDING_DIMENSIONS override for unknown models
- OpenAIProvider truncates to 1024 (matryoshka) instead of native 1536
- write-path guard: non-1024 embeddings refused — the index can never
  silently mix dimensions
- live-verified on pgvector/pg17: migrations 001->004, semantic ranking
  (distance ~0 for exact-match stub vector, ordered correctly), plus
  the full prior integration suite. 131 unit + 5 live tests.

Co-Authored-By: Claude Fable 5 <[email protected]>
@jp-cruz jp-cruz merged commit 8ee4460 into main Jul 2, 2026
11 checks passed
@jp-cruz jp-cruz deleted the feat/pgvector-semantic-search branch July 2, 2026 22:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant