Corrective Adaptive Retrieval Architecture (CARA) is a self-correcting RAG system that implements the CRAG and Self-RAG patterns. It grades retrieved documents, rewrites failed queries, detects hallucinations, and streams the full reasoning trace to a real-time audit UI. Built with LangGraph, FastAPI, pgvector, and Next.js.

    apps/
      api/            FastAPI backend (Python)
        graph/        LangGraph pipeline nodes + routing
        ingestion/    PDF/txt/md → chunk → embed → pgvector
        database/     SQLAlchemy models + Alembic migrations
      web/            Next.js 16 frontend
        app/          Chat UI + analytics dashboard
        components/   AuditTrail, ChatMessage, UploadButton
        hooks/        useRAGStream (SSE state management)

question
→ retrieve (pgvector cosine search → Cohere rerank)
→ grade documents (Gemini: relevant / irrelevant per doc)
→ [no relevant docs?] rewrite query → retrieve again (max 2x)
→ generate answer (Gemini, relevant docs only)
→ hallucination check (Gemini auditor)
→ [hallucination?] regenerate (max 2x)
→ stream final answer via SSE
Every step is logged to audit_logs for full traceability.
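
The control flow above can be sketched in plain Python. This is a minimal illustration, not the project's LangGraph code: `run_pipeline`, `PipelineResult`, and the callable parameters are hypothetical names, and returning the last answer once the regeneration budget is exhausted is an assumption about the fallback behavior.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

MAX_RETRIES = 2        # max query rewrites (default from the config table)
MAX_REGENERATIONS = 2  # max regenerations after a failed hallucination check


@dataclass
class PipelineResult:
    answer: Optional[str]
    rewrites: int = 0
    regenerations: int = 0
    trace: List[str] = field(default_factory=list)


def run_pipeline(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, str], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
) -> PipelineResult:
    """Hypothetical stand-in for the graph: all steps are injected callables."""
    result = PipelineResult(answer=None)
    query = question

    # Retrieval loop: retrieve and grade; rewrite the query while nothing is relevant.
    while True:
        docs = retrieve(query)
        result.trace.append("retrieve")
        relevant = [d for d in docs if grade(question, d)]
        result.trace.append("grade")
        if relevant or result.rewrites >= MAX_RETRIES:
            break
        query = rewrite(query)
        result.rewrites += 1
        result.trace.append("rewrite")

    if not relevant:
        return result  # nothing to ground an answer on; the caller picks a fallback

    # Generation loop: regenerate while the auditor flags the answer.
    while True:
        answer = generate(question, relevant)
        result.trace.append("generate")
        grounded = is_grounded(answer, relevant)
        result.trace.append("hallucination_check")
        if grounded or result.regenerations >= MAX_REGENERATIONS:
            break
        result.regenerations += 1

    result.answer = answer  # assumption: the last attempt is returned even if flagged
    return result
```

With `MAX_RETRIES = 2`, a question can trigger at most three retrievals (the original query plus two rewrites) before the loop gives up.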
- Python 3.12+
- Node.js 20+
- PostgreSQL 16 with the pgvector extension
- Google AI API key (Gemini 2.5 Flash)
- Cohere API key (reranker)
docker compose up -d postgres

This runs pgvector/pgvector:pg16 on port 5433 (avoids conflicts with a local Postgres).

cd apps/api
cp .env.example .env # if it exists; otherwise create .env with:

DATABASE_URL=postgresql://postgres:postgres@localhost:5433/cara
GOOGLE_API_KEY=your_key_here
COHERE_API_KEY=your_key_here

cd apps/api
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
pip install -r requirements.txt

cd apps/api
venv/bin/alembic upgrade head

# Ingest the sample papers included in the repo
venv/bin/python scripts/ingest_docs.py --dir ../../sample_docs
# Or ingest your own files
venv/bin/python scripts/ingest_docs.py --file paper.pdf
venv/bin/python scripts/ingest_docs.py --dir /path/to/docs

Supported formats: .pdf, .txt, .md. The first run downloads the embedding model (~420 MB); it is cached afterwards.

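
The chunking step of ingestion might look roughly like the sketch below. The extension filter comes from the supported formats listed above; everything else is an assumption for illustration: `is_supported` and `chunk_text` are hypothetical names, and the character-based window with `size=500` and `overlap=50` is a placeholder strategy, not the repo's actual chunker or settings.

```python
from pathlib import Path
from typing import List

SUPPORTED_SUFFIXES = {".pdf", ".txt", ".md"}  # formats listed in this README


def is_supported(path: str) -> bool:
    """Return True if the file extension is one the ingester accepts."""
    return Path(path).suffix.lower() in SUPPORTED_SUFFIXES


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into fixed-size character windows that overlap by `overlap`.

    Hypothetical strategy: the real pipeline may split on tokens or sentences.
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Overlap keeps a sentence that straddles a chunk boundary retrievable from at least one chunk, at the cost of storing some text twice.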
cd apps/api
venv/bin/uvicorn main:app --reload --port 8000

Verify: curl http://localhost:8000/health

# From repo root
npm install
npm run dev:web

Open http://localhost:3000.

- Type a question and press Enter to send
- Left panel shows the live reasoning trace (retrieve → grade → generate → hallucination check)
- Each assistant message shows confidence score, source files, and a "View reasoning" link
- Use Upload doc in the header to add new documents without restarting the server
- Summary stats: total queries, avg confidence, hallucination rate, avg rewrites
- Query volume chart (last 30 days)
- Confidence score distribution (calibration view)
- Rewrite distribution pie chart (retrieval quality)
- Top source documents by retrieval count
- Recent query log with per-row confidence badges
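
The summary stats in the first bullet could be computed along these lines. This is a sketch under assumptions: `QueryLog` and its fields are hypothetical (the actual audit_logs schema is not shown here), and "hallucination rate" is taken to mean the fraction of queries where the auditor flagged at least one answer.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class QueryLog:
    # Hypothetical per-query record; field names are illustrative only.
    confidence: float
    rewrites: int
    hallucination_detected: bool


def summarize(logs: List[QueryLog]) -> Dict[str, float]:
    """Aggregate per-query records into the four dashboard summary numbers."""
    if not logs:
        # Avoid dividing by zero on an empty database.
        return {
            "total_queries": 0,
            "avg_confidence": 0.0,
            "hallucination_rate": 0.0,
            "avg_rewrites": 0.0,
        }
    return {
        "total_queries": len(logs),
        "avg_confidence": mean(log.confidence for log in logs),
        "hallucination_rate": sum(log.hallucination_detected for log in logs) / len(logs),
        "avg_rewrites": mean(log.rewrites for log in logs),
    }
```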
cd apps/api
venv/bin/python -m pytest tests/test_graph.py -v

All LLM calls and the database are mocked, so no API keys or running database are needed.

# From repo root
npm run dev:web # start Next.js dev server
npm run dev:api # start FastAPI with reload

| Variable | Default | Effect |
|---|---|---|
| TOP_K_RETRIEVE | 10 | Chunks fetched from pgvector before reranking |
| TOP_K_RERANK | 4 | Chunks kept after Cohere rerank |
| MAX_RETRIES | 2 | Max query rewrites if no relevant docs found |
| MAX_REGENERATIONS | 2 | Max regenerations if hallucination detected |
| MIN_RELEVANT_DOCS | 1 | Minimum relevant docs required to attempt generation |
| EMBEDDING_MODEL | multi-qa-mpnet-base-dot-v1 | Must match Vector(768) in schema |

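
As an illustration of how these variables and defaults fit together, the sketch below loads them into a typed settings object. It assumes the variables are read from the process environment; `Settings` and `load_settings` are hypothetical names, and the check that TOP_K_RERANK does not exceed TOP_K_RETRIEVE is an added sanity check, not documented project behavior.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the table above.
    top_k_retrieve: int = 10
    top_k_rerank: int = 4
    max_retries: int = 2
    max_regenerations: int = 2
    min_relevant_docs: int = 1
    embedding_model: str = "multi-qa-mpnet-base-dot-v1"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Build Settings from an environment-like mapping, falling back to defaults."""
    d = Settings()
    settings = Settings(
        top_k_retrieve=int(env.get("TOP_K_RETRIEVE", d.top_k_retrieve)),
        top_k_rerank=int(env.get("TOP_K_RERANK", d.top_k_rerank)),
        max_retries=int(env.get("MAX_RETRIES", d.max_retries)),
        max_regenerations=int(env.get("MAX_REGENERATIONS", d.max_regenerations)),
        min_relevant_docs=int(env.get("MIN_RELEVANT_DOCS", d.min_relevant_docs)),
        embedding_model=env.get("EMBEDDING_MODEL", d.embedding_model),
    )
    # Assumed sanity check: the reranker cannot keep more chunks than were fetched.
    if settings.top_k_rerank > settings.top_k_retrieve:
        raise ValueError("TOP_K_RERANK must not exceed TOP_K_RETRIEVE")
    return settings
```

Changing EMBEDDING_MODEL to a model with a different output dimension would also require a schema migration, since the column is declared as Vector(768).
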
| Method | Path | Description |
|---|---|---|
| GET | /health | Liveness check |
| POST | /ask | SSE streaming RAG query |
| POST | /upload | Upload and ingest a document |
| GET | /analytics | Aggregated dashboard data |
| GET | /sessions/{id}/queries | Query history for a session |
| GET | /queries/{id}/audit | Full audit trail for a query |

start → session_id, query_id assigned
retrieve → N documents fetched
grade → relevance grades per document
rewrite → query rewritten (only if no relevant docs)
generate → answer produced
hallucination_check → grounding verified
final → { answer, confidence, sources, ... }
error → { message }
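
A client consuming the /ask stream needs to turn raw SSE lines into the events listed above. The sketch below is one way to do that in Python. It assumes the server uses the standard `event:` and `data:` fields with a JSON payload per event; the actual wire format is not specified here, and `parse_sse` is a hypothetical helper.

```python
import json
from typing import Dict, Iterable, Iterator, List, Tuple


def _field_value(line: str, name: str) -> str:
    """Return the value after 'name:', dropping one leading space per the SSE format."""
    value = line[len(name) + 1:]
    return value[1:] if value.startswith(" ") else value


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, payload) pairs from an iterable of SSE text lines.

    Assumes each event's data is a JSON object. An event that is not
    terminated by a blank line is discarded, as in the SSE format.
    """
    event = "message"  # default event name when no 'event:' field is sent
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the accumulated event.
            if data:
                yield event, json.loads("\n".join(data))
            event, data = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event = _field_value(line, "event").strip()
        elif line.startswith("data:"):
            data.append(_field_value(line, "data"))
```

A caller would typically stop reading once it sees a `final` or `error` event, since those end the trace for a query.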