What feature do you want to see added?
Today, the backend initializes heavy AI components (the embedding model and the LLM provider) at import/startup time. This makes startup slow and fragile, especially in local development, CI, and environments where model files or optional dependencies are missing.
I’d like to add lazy, fault-tolerant model initialization so the API can start reliably and only initialize heavy models when they are actually needed.
Why this helps
- Prevents startup failures caused by missing model artifacts/dependencies
- Improves dev-lite and test experience (fewer import-time side effects)
- Reduces cold-start latency and makes startup behavior more predictable
- Supports graceful degraded mode when LLM/retrieval is unavailable
Proposed behavior
- Replace eager globals (e.g., embedding/LLM singletons created at import time) with cached lazy getters
- Initialize models on first use with thread-safe guards
- Return graceful fallback responses when model initialization fails
- Expand health/readiness info to show model availability state explicitly
- Add tests for lazy init, failure paths, and “no heavy load at import time”
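To make the proposal concrete, here is a minimal sketch of a cached lazy getter with a thread-safe guard and a graceful fallback. It assumes a Python backend; `LazyModel`, `_load_llm`, and `chat` are hypothetical names for illustration, not existing code in this repository. The error policy shown (cache the failure and do not retry) is only one option and is open for discussion.

```python
import threading
from typing import Any, Callable, Optional


class LazyModel:
    """Build an expensive object on first use, at most once, behind a lock."""

    def __init__(self, name: str, loader: Callable[[], Any]) -> None:
        self.name = name
        self._loader = loader          # called lazily, never at import time
        self._lock = threading.Lock()
        self._instance: Optional[Any] = None
        self._error: Optional[BaseException] = None
        self._attempted = False

    def get(self) -> Optional[Any]:
        """Return the model, or None if loading failed."""
        if not self._attempted:
            with self._lock:
                # Re-check inside the lock so only one thread runs the loader.
                if not self._attempted:
                    try:
                        self._instance = self._loader()
                    except Exception as exc:
                        # Assumed policy: remember the failure, do not retry.
                        self._error = exc
                    self._attempted = True
        return self._instance

    def status(self) -> str:
        if not self._attempted:
            return "not_loaded"
        return "ready" if self._error is None else "unavailable"


def _load_llm() -> Any:
    # Hypothetical loader that simulates a missing model artifact.
    raise FileNotFoundError("model file missing")


llm = LazyModel("llm", _load_llm)  # creating the wrapper loads nothing


def chat(prompt: str) -> dict:
    """Degrade gracefully instead of raising when the LLM is unavailable."""
    model = llm.get()
    if model is None:
        return {"ok": False, "reply": "The assistant is temporarily unavailable."}
    return {"ok": True, "reply": model(prompt)}
```

With this shape, importing the module only constructs the lightweight wrapper; the loader runs on the first `get()` call, and a failed load turns into a fallback response rather than a crash.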
Acceptance criteria
- Importing service modules does not trigger heavy model loading/downloading
- API starts even if model files are missing
- Chat endpoints degrade gracefully when LLM is unavailable
- Retrieval errors are handled without crashing the process
- The health endpoint clearly reflects the readiness of major model components
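As a starting point for the health criterion, the payload could report one state per model component. The sketch below is an assumption about the schema, not the current endpoint: `build_health` and the state names `ready`, `not_loaded`, and `unavailable` are hypothetical and can be changed to match maintainer preferences.

```python
from typing import Dict


def build_health(components: Dict[str, str]) -> dict:
    """Summarize component states into a health payload.

    `components` maps a component name to one of the assumed states:
    "ready", "not_loaded" (lazy, not yet requested), or "unavailable".
    """
    # A component that has simply not been requested yet is not a failure.
    degraded = any(state == "unavailable" for state in components.values())
    return {
        "status": "degraded" if degraded else "ok",
        "components": dict(components),
    }
```

For example, `build_health({"embedding": "ready", "llm": "unavailable"})` reports an overall `degraded` status while still listing each component, so the API can stay up and callers can see which part is missing.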
Upstream changes
No known upstream dependency yet.
If there are related Jenkins plugin or API reliability discussions/PRs, I can link and align this issue to avoid duplicate work.
Are you interested in contributing this feature?
Yes — I’m interested in contributing this.
I can start with:
- a small design proposal (lazy getter + cache + error policy),
- a focused PR for backend wiring changes, and
- follow-up tests for startup/import behavior and degraded-mode responses.
If maintainers have preferences regarding module structure or the health endpoint schema, I’m happy to align on them before opening the PR.