feat: auto-provision personal LiteLLM key via Askii on first SSO login #27

Open: hunzlahmalik wants to merge 5 commits into foss-sandbox from feat/askii-auto-provision-litellm-key

Conversation

@hunzlahmalik hunzlahmalik commented May 22, 2026

Ticket: FOSSSMBBUN-98

Summary

When a Moneta user logs into the FOSS SurfSense instance for the first time via mPass, the backend now calls api.askii.ai to issue a personal LiteLLM key titled "FOSS Server" and persists four config rows on the user's default search space — agent (NewLLMConfig), document-summary (NewLLMConfig), image-gen (ImageGenerationConfig), and vision (VisionLLMConfig). All four SearchSpace FKs are wired up in the same transaction, so chat, summaries, image generation, and vision-aided document analysis all work without manual key configuration.

Implements the user story from the ticket: "As a SurfSense user setting up My Space, I want my personal LLM API key to be provisioned automatically so I can start using AI features without manually configuring API credentials."

Behavior

  • Gated by AUTH_TYPE=SSO AND AUTO_PROVISION_LITELLM_KEY=true. Both must be set; flag is off by default, so this PR is inert until an operator explicitly enables it (see the foss-server-bundle PR for env wiring).
  • Models come from env vars: ASKII_AGENT_MODEL, ASKII_DOCUMENT_SUMMARY_MODEL (blank ⇒ inherits agent), ASKII_IMAGE_GEN_MODEL, ASKII_VISION_MODEL. The three required vars must be non-empty for the gate to open.
  • Idempotent at two layers: a DB existence check on the "FOSS Server - Agent" marker row (AC3 — no duplicate provisioning on return logins), and the lazy-retry guard on GET /searchspaces/{id} (AC4 — "one more attempt when the user lands on My Space").
  • Best-effort on errors: any AskiiAuthError / AskiiServerError / AskiiTransportError is caught and logged, so user registration and search-space creation always succeed even when provisioning fails.
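
The gate and the doc-summary fallback described above can be sketched as follows. This is a minimal illustration, not the code in app/services/litellm_provisioning.py: the function signatures, the `resolve_models` helper, the case-insensitive comparisons, and the model names in the usage note are all assumptions.

```python
from typing import Mapping

# The three model env vars that must be non-empty for the gate to open.
REQUIRED_MODEL_VARS = ("ASKII_AGENT_MODEL", "ASKII_IMAGE_GEN_MODEL", "ASKII_VISION_MODEL")


def should_auto_provision(env: Mapping[str, str]) -> bool:
    """Return True only when SSO auth, the feature flag, and all required models are set."""
    if env.get("AUTH_TYPE", "").strip().upper() != "SSO":
        return False
    # The flag defaults to off, so an unconfigured deployment stays inert.
    if env.get("AUTO_PROVISION_LITELLM_KEY", "false").strip().lower() != "true":
        return False
    return all(env.get(name, "").strip() for name in REQUIRED_MODEL_VARS)


def resolve_models(env: Mapping[str, str]) -> dict[str, str]:
    """Map each config role to its model; a blank doc-summary model inherits the agent model."""
    agent = env["ASKII_AGENT_MODEL"].strip()
    summary = env.get("ASKII_DOCUMENT_SUMMARY_MODEL", "").strip() or agent
    return {
        "agent": agent,
        "document_summary": summary,
        "image_gen": env["ASKII_IMAGE_GEN_MODEL"].strip(),
        "vision": env["ASKII_VISION_MODEL"].strip(),
    }
```

With `AUTH_TYPE=SSO`, the flag set to `true`, and the three required models filled in (for example with placeholder names such as `m-agent`), the gate opens and `resolve_models` returns the agent model for the document-summary role whenever `ASKII_DOCUMENT_SUMMARY_MODEL` is blank.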

Files

| Layer | File | Change |
| --- | --- | --- |
| Dep | pyproject.toml, uv.lock | Pin askii @ git+https://github.com/Pressingly/askii-python@v0.1.0 |
| Config | app/config/__init__.py | 8 new env vars (AUTO_PROVISION_LITELLM_KEY, ASKII_*) |
| Service | app/services/litellm_provisioning.py | NEW: should_auto_provision, ensure_personal_litellm_keys, _provision_via_askii + error classification |
| Hook (login) | app/users.py | Best-effort call in UserManager.on_after_register |
| Hook (lazy retry) | app/routes/search_spaces_routes.py | Owner-only guard in read_search_space |
| Docs | .env.example | New "Auto-provision a personal LiteLLM key" block (opt-in) |
| Tests | tests/unit/test_litellm_provisioning.py | NEW: 19 cases |

Test plan

  • 24 unit tests pass (19 new + 5 pre-existing related): gate variants, missing token header, marker-row idempotency, 4-row happy path with FK assertions, doc-summary fallback, LITELLM_BASE_URL fallback, SDK error matrix (401 / 422 / 500).
  • Sandbox validation of the underlying askii-python v0.1.0 SDK against api.sandbox.askii.ai (separate report in the askii-python release notes).
  • ruff check + ruff format --check clean across all touched files.
  • Manual E2E in devstack (gated on the foss-server-bundle env-wiring PR landing).

Related

  • Depends on askii-python v0.1.0.
  • Companion env wiring in foss-server-bundle: opens immediately after this lands.

🤖 Generated with Claude Code

…st SSO login

Gated by AUTH_TYPE=SSO and AUTO_PROVISION_LITELLM_KEY=true. On first
login, calls Askii's platform/provision-key endpoint with the user's
mPass JWT scoped to the agent / doc-summary / image-gen / vision models,
then inserts the matching NewLLMConfig / ImageGenerationConfig /
VisionLLMConfig rows and wires up all four SearchSpace FKs so chat,
summaries, image gen and vision work out of the box.

If provisioning fails at login (network blip, transient 5xx), the GET
/searchspaces/{id} handler retries once — the AC4 "one more attempt
when the user lands on My Space" guarantee — and remains a single
SELECT on the steady-state path once the marker row exists.
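
A rough sketch of the lazy-retry guard described above, assuming a hypothetical `lazy_retry_guard` helper: the real handler is read_search_space and its exact shape is not shown in this PR page, so the parameters here are illustrative only.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("litellm_provisioning_sketch")


async def lazy_retry_guard(
    *,
    is_owner: bool,
    gate_open: bool,
    ensure_keys: Callable[[], Awaitable[bool]],
) -> bool:
    """Best-effort retry: never raises, so the GET handler can always return the search space."""
    if not (is_owner and gate_open):
        return False  # non-owners and disabled deployments skip provisioning entirely
    try:
        return await ensure_keys()
    except Exception:
        # Provisioning must not break the page load; log and carry on.
        logger.exception("lazy LiteLLM key provisioning failed")
        return False
```

The caller ignores the return value for control flow; it only matters for logging and tests, since the search space is returned either way.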

Pins the askii v0.1.0 SDK from github.com/Pressingly/askii-python and
adds 8 env vars to the SurfSense config in app/config/__init__.py
(AUTO_PROVISION_LITELLM_KEY, ASKII_BASE_URL, ASKII_LITELLM_BASE_URL,
ASKII_AGENT_MODEL, ASKII_DOCUMENT_SUMMARY_MODEL, ASKII_IMAGE_GEN_MODEL,
ASKII_VISION_MODEL, ASKII_LITELLM_KEY_DURATION_DAYS).

19 new unit tests cover: gate variants (AUTH_TYPE / flag / each model
env), missing access-token header, marker-row idempotency, the 4-row
happy path with FK assertions, doc-summary fallback to agent model,
LITELLM_BASE_URL fallback to ASKII_BASE_URL, and the SDK error matrix
(401 / 422 / 500). All 24 related tests pass; ruff + format clean.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Copilot AI left a comment

Pull request overview

Adds an opt-in backend flow to auto-provision a per-user LiteLLM API key via Askii on first SSO login (with a lazy retry on “My Space” load), and persists four related config rows on the user’s default search space so AI features work without manual key setup.

Changes:

  • Introduces app/services/litellm_provisioning.py to gate provisioning, call Askii, insert 4 config rows, and wire SearchSpace FK fields.
  • Hooks provisioning into UserManager.on_after_register and adds an owner-only lazy retry in GET /searchspaces/{id}.
  • Adds Askii dependency + new env/config vars and unit test coverage for gating, idempotency, and error handling.

Reviewed changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| surfsense_backend/app/services/litellm_provisioning.py | New provisioning service (gate + Askii call + DB inserts/FK wiring). |
| surfsense_backend/app/users.py | Best-effort provisioning attempt after default search space creation on register. |
| surfsense_backend/app/routes/search_spaces_routes.py | Owner-only lazy retry on reading a search space when auto-provisioning is enabled. |
| surfsense_backend/app/config/__init__.py | Adds env-backed config knobs for auto-provisioning and Askii model/base URL settings. |
| surfsense_backend/.env.example | Documents the new opt-in env vars for provisioning. |
| surfsense_backend/pyproject.toml | Adds askii dependency (git-pinned). |
| surfsense_backend/tests/unit/test_litellm_provisioning.py | Unit tests covering gate behavior, idempotency, happy path, and error matrix. |

Comment thread surfsense_backend/app/services/litellm_provisioning.py
Comment thread surfsense_backend/app/services/litellm_provisioning.py Outdated
Comment thread surfsense_backend/app/services/litellm_provisioning.py Outdated
Comment thread surfsense_backend/app/config/__init__.py
Comment thread surfsense_backend/tests/unit/test_litellm_provisioning.py Outdated
Comment thread surfsense_backend/pyproject.toml Outdated
Five Copilot comments triaged, four fixed in this commit; the fifth was
a docstring-only "fail closed" wording inconsistency on ASKII_BASE_URL,
also addressed here.

1. _provision_via_askii now constructs an explicit
   AskiiConfig.from_env(base_url=config.ASKII_BASE_URL) and passes it
   to AsyncAskii. Production previously worked by coincidence (the SDK
   also reads ASKII_BASE_URL from os.environ via from_env), but the
   coupling was implicit; this makes the SurfSense config the single
   source of truth for the outbound Askii endpoint and survives any
   future refactor that moves SurfSense's config off env vars.

2. The four-row DB-write block (session.add x4 / flush / FK mutation /
   commit) is now wrapped in try/except SQLAlchemyError → rollback +
   log + return False, matching the documented best-effort contract.
   Previously a flush/commit failure would propagate up and leave the
   request's session in a failed-transaction state. Added a regression
   test that monkeypatches session.flush to raise SQLAlchemyError.

3. ensure_personal_litellm_keys now takes a SELECT … FOR UPDATE on the
   SearchSpace row before the existence check, so two concurrent
   provisioning attempts (e.g. the same user opening two tabs during
   their first My Space load) serialize. The second waiter sees the
   marker row committed by the first and short-circuits via the
   existing idempotency check, avoiding duplicate upstream Askii keys
   and duplicate config rows.

4. Module + should_auto_provision docstrings rewritten to drop the
   "ASKII_BASE_URL must be set so a half-configured deploy fails closed"
   framing — the base-URL clause is effectively always satisfied since
   the default is non-empty (prod). The real fail-closed gate is the
   feature flag (default FALSE) plus the three required model env vars
   (default empty); the wording now reflects that.

5. test_returns_false_when_auth_type_not_sso no longer leaks its
   httpx.AsyncClient — extracted to a `client = …; try: …; finally:
   await client.aclose()` block matching the surrounding pattern.
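
Item 2 above (roll back, log, return False on a failed write) can be illustrated with a stdlib-only sketch. Everything here is a stand-in: `DatabaseError` substitutes for sqlalchemy.exc.SQLAlchemyError, `FakeSession` is a hypothetical synchronous recorder rather than the real session, and `write_config_rows` is an invented name.

```python
import logging

logger = logging.getLogger("litellm_provisioning_sketch")


class DatabaseError(Exception):
    """Stand-in for sqlalchemy.exc.SQLAlchemyError so the sketch runs on the stdlib alone."""


class FakeSession:
    """Hypothetical minimal session that records calls; flush can be made to fail."""

    def __init__(self, fail_on_flush: bool = False) -> None:
        self.fail_on_flush = fail_on_flush
        self.pending: list[object] = []
        self.committed = False
        self.rolled_back = False

    def add(self, row: object) -> None:
        self.pending.append(row)

    def flush(self) -> None:
        if self.fail_on_flush:
            raise DatabaseError("flush failed")

    def commit(self) -> None:
        self.committed = True

    def rollback(self) -> None:
        self.pending.clear()
        self.rolled_back = True


def write_config_rows(session: FakeSession, rows: list[object]) -> bool:
    """Add rows, flush, commit; on a database error roll back and report False."""
    try:
        for row in rows:
            session.add(row)
        session.flush()  # in the described flow, FK wiring follows the flush
        session.commit()
        return True
    except DatabaseError:
        session.rollback()  # leave the session usable for the rest of the request
        logger.exception("could not persist LiteLLM config rows")
        return False
```

Driving `FakeSession(fail_on_flush=True)` through `write_config_rows` mirrors the regression test mentioned above: the call returns False, the rollback is recorded, and nothing is committed.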

Deliberately out of scope (fast-follow if needed):
- Alembic migration adding UNIQUE (user_id, search_space_id, name) on
  the three config tables. The row lock closes the same-host race
  cleanly; the constraint is the stronger long-term backstop and is
  worth adding only if duplicates surface in prod despite the lock.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 2 comments.

Comment thread surfsense_backend/app/services/litellm_provisioning.py Outdated
Comment thread surfsense_backend/pyproject.toml Outdated
Two follow-up Copilot comments on the prior fix commit:

1. ensure_personal_litellm_keys took a SELECT … FOR UPDATE on the
   SearchSpace row unconditionally — but the lazy guard runs on every
   owner GET /searchspaces/{id}, so the steady state (already
   provisioned) is the hot path and would acquire/release the lock for
   nothing on every page load. Restructured to double-checked locking:
   cheap SELECT first, return True immediately if the marker row
   exists; only when missing do we take the row lock and re-SELECT
   inside the lock to catch the race window. Provisioned users now
   incur zero lock contention.

2. The askii dependency in pyproject.toml was pinned to git tag
   `v0.1.0`. Git tags can be force-moved, so a uv lock re-resolve
   after a tag rewrite would silently change the installed code.
   Pinned to the underlying commit SHA
   eb7591d558d309f7d53161187d69df718b584751 (immutable). Tag name
   preserved in an adjacent comment for readers.

uv.lock refreshed to reflect the new rev string for askii. The
unrelated marker simplifications in the lock are uv resolver
normalization (equivalent dependency graph, simpler boolean conditions
on cuda-bindings / nvidia-* / contourpy markers).
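
The double-checked locking shape from item 1 can be sketched in memory. This is an analogy under stated assumptions: `MarkerStore` is hypothetical, a `threading.Lock` stands in for the SELECT … FOR UPDATE row lock, and a set stands in for the marker row.

```python
import threading


class MarkerStore:
    """Hypothetical in-memory stand-in for the marker row plus its row lock."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._provisioned: set[int] = set()
        self.lock_acquisitions = 0
        self.provision_calls = 0

    def ensure_provisioned(self, space_id: int) -> bool:
        # First check: no lock, so already-provisioned spaces never contend.
        if space_id in self._provisioned:
            return True
        with self._lock:
            self.lock_acquisitions += 1
            # Second check: another caller may have finished while we waited.
            if space_id in self._provisioned:
                return True
            self.provision_calls += 1
            self._provisioned.add(space_id)
            return True
```

Once a space is marked, later calls return from the first check and never touch the lock, which is the steady-state behavior the commit message aims for.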

Co-Authored-By: Claude Opus 4.7 <[email protected]>

Copilot AI left a comment

Pull request overview

Copilot reviewed 7 out of 8 changed files in this pull request and generated 3 comments.

Comment thread surfsense_backend/app/services/litellm_provisioning.py Outdated
Comment thread surfsense_backend/app/services/litellm_provisioning.py Outdated
Comment thread surfsense_backend/.env.example Outdated
hunzlahmalik and others added 2 commits May 22, 2026 18:04
Co-authored-by: Copilot Autofix powered by AI <[email protected]>
…und 3)

Two functional changes and one new test class, addressing the three
new Copilot comments on the prior round-2 commit:

A. ensure_personal_litellm_keys now wraps its full body in a top-level
   try/except Exception. The cheap pre-lock SELECT, the SELECT ... FOR
   UPDATE itself, AskiiConfig.from_env(), and any future call that
   leaks an unclassified exception are all caught here; we rollback to
   release any lock, log the exception, and return False — restoring
   the "never raise" docstring guarantee. Both callers (users.py /
   search_spaces_routes.py) already wrap us with try/except Exception,
   but the function should hold up its own contract.

B. The row-level lock no longer wraps the entire flow. The previous
   shape took SELECT ... FOR UPDATE on the SearchSpace row before the
   token check and the outbound Askii call, holding the lock across a
   network request that can run 5–30s on a slow upstream and stalling
   every concurrent tab on the same row. The lock now wraps only the
   DB-write window:

       cheap marker SELECT          (no lock, hot path)
       token check                  (no lock)
       Askii provisioning call      (no lock)
       build the 4 row objects
       SELECT … FOR UPDATE
       re-SELECT marker inside lock
         race-loss → rollback + return True (Askii key orphaned)
         else → write rows → flush → set FKs → commit

   The trade-off is that on a rare race-loss (two tabs hit My Space at
   the exact same instant on first login), both workers call Askii
   and one upstream key ends up orphaned — auto-expires at the
   configured TTL (default 90 days). Lock contention exposure shrinks
   from "up to the Askii timeout" to "sub-ms DB write window".

C. (no change) surfsense_backend/.env.example was already cleaned up
   in an earlier commit — comments are on dedicated lines, no inline
   `# …` after `=`. The bot's anchor was against the original PR diff.
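
The narrowed lock window from B, together with the top-level catch from A, can be sketched with asyncio. The `Provisioner` class is hypothetical: an `asyncio.Lock` stands in for the row lock, a short sleep stands in for the Askii call, and counters replace real rows and keys.

```python
import asyncio


class Provisioner:
    """Hypothetical stand-in: an asyncio.Lock plays the role of the SearchSpace row lock."""

    def __init__(self) -> None:
        self.row_lock = asyncio.Lock()
        self.marker_exists = False
        self.upstream_keys_issued = 0
        self.rows_written = 0

    async def _call_upstream(self) -> str:
        await asyncio.sleep(0.01)  # simulates the slow network call; no lock is held here
        self.upstream_keys_issued += 1
        return f"key-{self.upstream_keys_issued}"

    async def ensure(self) -> bool:
        try:
            if self.marker_exists:  # cheap check on the hot path
                return True
            await self._call_upstream()
            async with self.row_lock:  # lock covers only the write window
                if self.marker_exists:
                    return True  # race lost: our upstream key is orphaned
                self.rows_written += 4
                self.marker_exists = True
                return True
        except Exception:
            return False  # uphold the "never raise" contract
```

Running two `ensure()` calls concurrently reproduces the trade-off described in B: both succeed, only one set of four rows is written, and two upstream keys are issued, one of which is never referenced.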

Tests: 22/22 green. Two new regression tests:

- test_race_loss_inside_lock_returns_true_without_writes — drives the
  cheap SELECT to None and the re-check inside the lock to a found
  marker, asserts True is returned, no rows added, rollback called.
- test_unexpected_exception_caught_and_rolled_back — drives the cheap
  SELECT to raise RuntimeError, asserts outer except catches it,
  rollback called, returns False.

Co-Authored-By: Claude Opus 4.7 <[email protected]>

3 participants