[Backend] Make model initialization lazy and fault-tolerant to improve API startup reliability #302

@j4b3-21

What feature do you want to see added?

Today, the backend initializes heavy AI components (the embedding model and the LLM provider) at import/startup time. This makes startup slow and fragile, especially in local development, CI, and environments where model files or optional dependencies are missing.
I’d like to add lazy, fault-tolerant model initialization so the API can start reliably and only initialize heavy models when they are actually needed.

Why this helps

  • Prevents startup failures caused by missing model artifacts/dependencies
  • Improves dev-lite and test experience (fewer import-time side effects)
  • Reduces cold-start pain and makes behavior more predictable
  • Supports graceful degraded mode when LLM/retrieval is unavailable

Proposed behavior

  • Replace eager globals (e.g., embedding/LLM singletons created at import time) with cached lazy getters
  • Initialize models on first use with thread-safe guards
  • Return graceful fallback responses when model initialization fails
  • Expand health/readiness info to show model availability state explicitly
  • Add tests for lazy init, failure paths, and “no heavy load at import time”
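
To make the proposal concrete, here is a minimal sketch of a cached, thread-safe lazy getter with a graceful fallback. It is illustrative only: `LazyModel`, `_load_llm`, `chat`, and the state names are hypothetical and do not refer to existing code in the backend, and the stand-in loader simply simulates a missing model file.

```python
import threading
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyModel(Generic[T]):
    """Hypothetical helper: builds a heavy object on first use, not at import."""

    def __init__(self, name: str, factory: Callable[[], T]) -> None:
        self.name = name
        self._factory = factory
        self._lock = threading.Lock()
        self._instance: Optional[T] = None
        self._error: Optional[Exception] = None

    def get(self) -> Optional[T]:
        """Return the model, or None if initialization failed."""
        if self._instance is not None:
            return self._instance
        with self._lock:
            # Re-check inside the lock so only one thread runs the factory.
            if self._instance is None and self._error is None:
                try:
                    self._instance = self._factory()
                except Exception as exc:  # missing files, optional deps, ...
                    self._error = exc  # remembered, so we do not retry
        return self._instance

    @property
    def state(self) -> str:
        if self._instance is not None:
            return "ready"
        if self._error is not None:
            return "failed"
        return "not_loaded"


def _load_llm() -> object:
    # Stand-in for the real loader; simulates a missing model artifact.
    raise FileNotFoundError("model file not found")


llm = LazyModel("llm", _load_llm)  # cheap: nothing is loaded here


def chat(prompt: str) -> dict:
    """Degrade gracefully instead of raising when the LLM is unavailable."""
    model = llm.get()
    if model is None:
        return {"ok": False, "reply": "The assistant is temporarily unavailable."}
    return {"ok": True, "reply": f"(model reply to {prompt!r})"}
```

In this sketch a failed initialization is cached and never retried. Whether to retry (for example after a cooldown) is part of the error policy I would like to agree on in the design proposal.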

Acceptance criteria

  • Importing service modules does not trigger heavy model loading/downloading
  • API starts even if model files are missing
  • Chat endpoints degrade gracefully when LLM is unavailable
  • Retrieval errors are handled without crashing the process
  • The health endpoint clearly reflects the readiness of major model components
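
As a rough idea of what the health criterion could mean in practice, here is a sketch of a readiness payload built from per-component states. The function name, the state strings, and the response shape are all assumptions on my side, not the current health endpoint schema.

```python
from typing import Dict

# Hypothetical component states; in the real backend these would be reported
# by whatever lazy loaders the service modules end up exposing.
READY, NOT_LOADED, FAILED = "ready", "not_loaded", "failed"


def build_health(components: Dict[str, str]) -> dict:
    """Summarize per-component model state for a health/readiness response."""
    # "not_loaded" is not an error: with lazy init, a model that has not been
    # requested yet is expected to be absent.
    failed = any(state == FAILED for state in components.values())
    return {
        "status": "degraded" if failed else "ok",
        "components": dict(components),
    }
```

With this shape, `build_health({"embedding": "ready", "llm": "failed"})` reports `"status": "degraded"` while the process itself stays up, and the caller can see which component is the problem.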

Upstream changes

No known upstream dependency yet.
If there are related Jenkins plugin or API reliability discussions/PRs, I can link and align this issue to avoid duplicate work.

Are you interested in contributing this feature?

Yes — I’m interested in contributing this.

I can start with:

  1. a small design proposal (lazy getter + cache + error policy),
  2. a focused PR for backend wiring changes, and
  3. follow-up tests for startup/import behavior and degraded-mode responses.

If maintainers have preferences regarding module structure or the health endpoint schema, I’m happy to align on them before opening the PR.
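
For the startup/import tests in step 3, one possible approach is to import a service module in a fresh interpreter and check which heavy libraries ended up in `sys.modules`. The helper below is a sketch under that assumption; its name is hypothetical, and the module names in the usage note are placeholders rather than real paths in this repository.

```python
import subprocess
import sys
from typing import Iterable, Set


def heavy_modules_loaded_by(module: str, heavy: Iterable[str]) -> Set[str]:
    """Import `module` in a fresh interpreter; return the `heavy` modules it pulled in."""
    probe = (
        "import importlib, sys\n"
        f"importlib.import_module({module!r})\n"
        f"print(','.join(m for m in {sorted(heavy)!r} if m in sys.modules))\n"
    )
    result = subprocess.run(
        [sys.executable, "-c", probe],
        capture_output=True,
        text=True,
        check=True,
    )
    # Only the last line is ours; the imported module may print on import.
    lines = result.stdout.splitlines()
    last = lines[-1] if lines else ""
    return set(last.split(",")) if last else set()
```

A test could then assert something like `not heavy_modules_loaded_by("app.services.chat", ["torch"])`, where both names are placeholders for the actual service module and the actual heavy dependency.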
