[Backend] Make model initialization lazy and fault-tolerant to improve API startup reliability #302

### What feature do you want to see added?

Today, the backend initializes heavy AI components (the embedding model and the LLM provider) at import/startup time. This makes startup slow and fragile, especially in local development, CI, and environments where model files or optional dependencies are missing.
I’d like to add lazy, fault-tolerant model initialization so that the API starts reliably and loads heavy models only when they are actually needed.

Why this helps

- Prevents startup failures caused by missing model artifacts/dependencies
- Improves dev-lite and test experience (fewer import-time side effects)
- Shortens process startup by deferring model loading to first use, which makes startup behavior more predictable
- Supports graceful degraded mode when LLM/retrieval is unavailable

Proposed behavior

- Replace eager globals (e.g., embedding/LLM singletons created at import time) with cached lazy getters
- Initialize models on first use with thread-safe guards
- Return graceful fallback responses when model initialization fails
- Expand health/readiness info to show model availability state explicitly
- Add tests for lazy init, failure paths, and “no heavy load at import time”
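To make the proposal concrete, here is a minimal sketch of a cached lazy getter with a thread-safe guard and a graceful fallback. It uses only the standard library. All names in it (`LazyResource`, `_load_llm`, `llm`, `chat`, the response shape) are hypothetical and do not come from the existing codebase. The loader is a stand-in that simulates a missing model file.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyResource(Generic[T]):
    """Builds a value on first use, at most once, and remembers failures."""

    def __init__(self, name: str, factory: Callable[[], T]) -> None:
        self.name = name
        self._factory = factory
        self._lock = threading.Lock()
        self._done = False
        self._value: Optional[T] = None
        self.error: Optional[BaseException] = None

    def get(self) -> Optional[T]:
        """Return the resource, or None if initialization failed."""
        if not self._done:  # fast path: skip the lock once resolved
            with self._lock:
                if not self._done:  # re-check: another thread may have finished
                    try:
                        self._value = self._factory()
                    except Exception as exc:
                        # Assumed error policy: cache the failure, do not retry.
                        self.error = exc
                    self._done = True
        return self._value

    @property
    def state(self) -> str:
        if not self._done:
            return "not_loaded"
        return "failed" if self.error is not None else "ready"


def _load_llm() -> Callable[[str], str]:
    # Hypothetical stand-in for the real loader; simulates a missing model file.
    raise FileNotFoundError("model file not found")


# Constructing the wrapper does no heavy work, so it is safe at import time.
llm = LazyResource("llm", _load_llm)


def chat(prompt: str) -> dict:
    """Answer with the model if available, otherwise with a fallback message."""
    model = llm.get()
    if model is None:
        return {"ok": False, "message": "The assistant is temporarily unavailable."}
    return {"ok": True, "message": model(prompt)}
```

Caching the failure keeps a broken loader from being retried on every request. Whether to retry after a cooldown instead is an open design question for the error policy.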

Acceptance criteria

- Importing service modules does not trigger heavy model loading/downloading
- API starts even if model files are missing
- Chat endpoints degrade gracefully when LLM is unavailable
- Retrieval errors are handled without crashing the process
- The health endpoint clearly reflects the readiness of major model components
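As a starting point for the health endpoint discussion, here is one possible shape for a readiness payload. The state names, the `build_health` helper, and the response schema are all assumptions for illustration, not an existing API. In this sketch a component that has not been loaded yet does not count as a failure, because with lazy initialization that is the normal state right after startup.

```python
from enum import Enum
from typing import Dict


class ComponentState(str, Enum):
    NOT_LOADED = "not_loaded"  # lazy init has not run yet
    READY = "ready"
    FAILED = "failed"


def build_health(components: Dict[str, ComponentState]) -> dict:
    """Summarize per-component model state into one health payload."""
    failed = sorted(
        name for name, state in components.items()
        if state is ComponentState.FAILED
    )
    return {
        # The API process itself is up; "degraded" flags failed components.
        "status": "degraded" if failed else "ok",
        "components": {name: state.value for name, state in components.items()},
        "failed": failed,
    }
```

For example, `build_health({"embedding_model": ComponentState.READY, "llm": ComponentState.FAILED})` reports `"status": "degraded"` and lists `llm` under `"failed"`, while still returning a well-formed response.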

### Upstream changes

No known upstream dependency yet.
If there are related Jenkins plugin or API reliability discussions/PRs, I can link and align this issue to avoid duplicate work.

### Are you interested in contributing this feature?

Yes — I’m interested in contributing this.

I can start with:

1. a small design proposal (lazy getter + cache + error policy),
2. a focused PR for backend wiring changes, and
3. follow-up tests for startup/import behavior and degraded-mode responses.

If maintainers have preferences regarding module structure or the health endpoint schema, I’m happy to align on them before opening the PR.
