Chunk long notes before embedding #2

## Problem
`sync_notes()` and the ingest pipeline add each document to ChromaDB whole
(`collection.add(documents=[content], ...)`). A long note is therefore embedded as a
single vector that has to represent the entire document, which dilutes semantic
precision and hurts retrieval.

## Proposal
Split note content into overlapping chunks before embedding, so retrieval returns
focused passages instead of whole files.
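
A minimal sketch of such a chunker, not the project's implementation. The function name `chunk_text` and its parameters are hypothetical, token counts are approximated by whitespace-separated words, and the defaults (600 words, 80 words of overlap) are only illustrative values in the range suggested under Tasks. It splits on blank lines and Markdown headings first, then packs the pieces into chunks that share a short tail with the previous chunk.

```python
import re


def chunk_text(text: str, max_tokens: int = 600, overlap: int = 80) -> list[str]:
    """Split text into overlapping chunks, preferring paragraph/heading boundaries.

    Assumption: one whitespace-separated word counts as one token. A real
    tokenizer would give different counts.
    """
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be >= 0 and smaller than max_tokens")

    # Break at blank lines, or right before a Markdown heading line.
    blocks = [b.split() for b in re.split(r"\n\s*\n|\n(?=#{1,6} )", text)]
    step = max_tokens - overlap  # largest piece that still fits after an overlap tail

    chunks: list[str] = []
    current: list[str] = []  # words of the chunk being built
    for words in blocks:
        # A block longer than `step` is cut into pieces so every chunk fits.
        for start in range(0, len(words), step):
            piece = words[start:start + step]
            if current and len(current) + len(piece) > max_tokens:
                chunks.append(" ".join(current))
                # Carry the tail of the finished chunk into the next one.
                current = current[-overlap:] if overlap else []
            current.extend(piece)
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Note that this sketch normalizes whitespace (line breaks inside a chunk become single spaces). If the original formatting matters for display, the chunker would need to track character offsets instead of word lists.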

## Tasks
- [ ] Add a chunker (e.g. ~500–800 tokens, ~10–15% overlap; split on headings/paragraphs first)
- [ ] Apply it in `src/database/vector_db.py` (sync) and `src/pipeline/orchestrator.py` (ingest)
- [ ] Store `source` + chunk index in metadata so sources still map back to the origin note
- [ ] Update `sync_notes()` delete/update logic to handle multiple chunk IDs per file
- [ ] Make chunk size / overlap configurable via `.env`
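
One possible shape for the metadata, re-sync, and configuration tasks, sketched under several assumptions: the helper names (`load_chunk_settings`, `build_chunk_records`, `reindex_note`), the `chunk_index` metadata key, the `::chunk-N` ID scheme, and the `CHUNK_SIZE` / `CHUNK_OVERLAP` variable names are all hypothetical, and the `.env` file is assumed to be loaded into the process environment elsewhere. The only ChromaDB calls used are `collection.add(ids=..., documents=..., metadatas=...)` and `collection.delete(where=...)`.

```python
import os
from collections.abc import Mapping


def load_chunk_settings(env: Mapping[str, str] = os.environ) -> tuple[int, int]:
    """Read chunk size and overlap from the environment.

    CHUNK_SIZE and CHUNK_OVERLAP are hypothetical variable names; the
    fallback values are illustrative only.
    """
    size = int(env.get("CHUNK_SIZE", "600"))
    overlap = int(env.get("CHUNK_OVERLAP", "80"))
    if not 0 <= overlap < size:
        raise ValueError("CHUNK_OVERLAP must be >= 0 and smaller than CHUNK_SIZE")
    return size, overlap


def build_chunk_records(source: str, chunks: list[str]) -> dict:
    """Build the ids/documents/metadatas for one note's chunks.

    Each chunk keeps `source` so results still map back to the origin note,
    plus its position within that note.
    """
    indices = range(len(chunks))
    return {
        "ids": [f"{source}::chunk-{i}" for i in indices],
        "documents": list(chunks),
        "metadatas": [{"source": source, "chunk_index": i} for i in indices],
    }


def reindex_note(collection, source: str, chunks: list[str]) -> None:
    """Replace all stored chunks of one note with a fresh set."""
    # Delete by metadata filter, so stale chunks disappear even when the
    # note shrank and now produces fewer chunks than before.
    collection.delete(where={"source": source})
    if chunks:  # skip the add call for an empty note
        collection.add(**build_chunk_records(source, chunks))
```

Deleting by `source` rather than by a list of IDs means `sync_notes()` would not need to remember how many chunks a file had on the previous run. Whether that fits the existing delete/update logic depends on how it tracks files today.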

## Acceptance
A long note is indexed as multiple chunks; a query returns the relevant passage, and sources still resolve to the correct file.

