feat(language-model): add OAuthLanguageModel for claude/codex/gemini OAuth subprocess auth#52
lelouar wants to merge 2 commits into
|
OK for the concept. I agree it would be a nice addition, but the implementation breaks plenty of style rules. First, the model parameter should be "provider/model". Second, why use an additional method to connect? Can't that be done at class construction? Remove get_oauth_cli_language_model, rename the class to OAuthLanguageModel, and keep the parameters consistent with LanguageModel so it's an in-place replacement. |
a6a8698 to 12db86b
|
Thanks @lelouar, fully agree on all four points. Force-pushed an updated commit.
Changes vs the original commit
Other adjustments
Diffstat shrunk to +738 / 0 across 6 files. Let me know if there's anything else to tighten. |
Self-review: two regressions to fix before merge
While re-reading the PR against the previous internal version of the same
1. gemini hangs on stdin-piped prompt for
|
Two regressions identified during self-review of PR SynaLinks#52, both restored from the previous internal version of this adapter:
1. `gemini-2.5-flash` hangs on stdin-piped prompts. Route the prompt through `-p <prompt>` (was `-p ""`) and close stdin with `DEVNULL` for the gemini branch only; codex/claude still read from stdin.
2. Gemini CLI v0.37+ refuses to run in an "untrusted" workspace without an interactive trust prompt. Subprocess invocations are non-interactive by definition, so set `GEMINI_CLI_TRUST_WORKSPACE=true` explicitly.
Adds a regression test asserting the gemini command builder emits `-p <prompt>` (not `-p ""`) when a prompt is provided. codex/claude paths are untouched.
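A minimal sketch of the two fixes, for illustration only. The helper names `build_gemini_command` and `gemini_subprocess_kwargs` are hypothetical (the PR's actual function names are not shown on this page); the `-p` and `-m` flags and the `GEMINI_CLI_TRUST_WORKSPACE` variable are the ones named in the commit message and PR description.

```python
import os
import subprocess


def build_gemini_command(prompt: str, model: str) -> list[str]:
    """Hypothetical builder: pass the prompt as the -p argument, not via stdin."""
    if not prompt:
        raise ValueError("prompt must be non-empty")
    return ["gemini", "-m", model, "-p", prompt]


def gemini_subprocess_kwargs(base_env=None) -> dict:
    """Keyword arguments for a non-interactive gemini subprocess call."""
    env = dict(os.environ if base_env is None else base_env)
    # A subprocess cannot answer the interactive trust prompt, so opt in explicitly.
    env["GEMINI_CLI_TRUST_WORKSPACE"] = "true"
    return {
        "stdin": subprocess.DEVNULL,  # nothing is piped to gemini
        "stdout": subprocess.PIPE,
        "stderr": subprocess.PIPE,
        "env": env,
    }


cmd = build_gemini_command("Say hello", "gemini-2.5-flash")
kwargs = gemini_subprocess_kwargs({"PATH": "/usr/bin"})
```

The same keyword arguments would also be accepted by `asyncio.create_subprocess_exec(*cmd, **kwargs)`, since `asyncio.subprocess.DEVNULL` and `PIPE` are the same constants as in `subprocess`.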
|
Pushed 1. gemini stdin hang — 2. workspace trust — Test coverage — added |
d97bc9d to c90349f
|
Rebased onto latest
Changes during rebase:
All 7 tests pass. Mergeable ✅ |
|
How is structured output handled when using the OAuthLanguageModel? |
|
Structured output follows a per-provider strategy because only codex exposes a server-side schema endpoint over OAuth:
codex: server-side enforcement via
claude / gemini: prompt-based extraction
The raw stdout is then passed through
When no schema is provided, all three providers return a plain-text
The tradeoff is intentional: codex is the only OAuth-accessible model that supports structured output reliably server-side. For claude/gemini the prompt-injection approach is best-effort: it works well for well-prompted schemas but has no API-level guarantee. An explicit note could be added to the docstring if you'd like. |
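For the claude/gemini path, prompt-based extraction could look roughly like the sketch below. It is an illustration, not the PR's `_extract_json`: the name `extract_json` and the order of attempts (whole text, then fenced block, then first balanced brace span) are assumptions.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to keep this listing readable
_FENCE_RE = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL)


def _try_load(candidate: str):
    """Return the parsed dict, or None if candidate is not a JSON object."""
    try:
        value = json.loads(candidate)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None


def _balanced_span(text: str):
    """Return the first balanced {...} span, ignoring braces inside strings."""
    start = text.find("{")
    if start == -1:
        return None
    depth, in_string, escaped = 0, False, False
    for i in range(start, len(text)):
        ch = text[i]
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
        elif ch == '"':
            in_string = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                return text[start : i + 1]
    return None


def extract_json(text: str):
    """Best-effort extraction of a JSON object from CLI stdout."""
    parsed = _try_load(text.strip())
    if parsed is not None:
        return parsed
    for block in _FENCE_RE.findall(text):
        parsed = _try_load(block.strip())
        if parsed is not None:
            return parsed
    span = _balanced_span(text)
    return _try_load(span) if span is not None else None
```

Returning `None` rather than raising lets the caller decide whether to retry; as noted above, this path carries no API-level guarantee.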
…OAuth subprocess auth
In-place subclass of `LanguageModel` that bridges to the locally-installed
`claude`, `codex`, and `gemini` CLIs as a subprocess. Unlocks an auth
modality the framework cannot reach today: OAuth-only subscriptions
(Claude Max, ChatGPT Plus/Pro, Google account) where the user has no
usable API key.
Constructor signature is identical to `LanguageModel` (model, api_base,
timeout, retry, fallback, caching). Provider is encoded in the `model`
string as `provider/model`, e.g.
synalinks.OAuthLanguageModel(model="codex/gpt-5.2")
synalinks.OAuthLanguageModel(model="claude/claude-sonnet-4-6")
synalinks.OAuthLanguageModel(model="gemini/gemini-2.0-flash")
Only `claude`, `codex`, `gemini` are accepted as providers (any other
prefix raises ValueError). `reasoning_effort` is consumed at call time
as the same kwarg the parent already accepts, mapped to codex
`model_reasoning_effort` and claude `--effort`. `streaming=True` raises.
Latency-critical decisions (retained from the prior implementation):
* codex passes the JSON schema via `--output-schema` so the Responses
API enforces strict mode server-side (zero parse retries).
* claude does NOT use `--bare` (requires API key) nor `--json-schema`
(measurably slower); fall back to prompt-based JSON extraction.
* gemini runs against an isolated minimal HOME at
`~/.synalinks_gemini_home` to bypass user hooks/skills/agents
(~63s -> 8-16s on cold start). Override via `SYNALINKS_GEMINI_HOME`.
Cost reporting is `0.0` (subscriptions are flat-rate; no fake numbers).
Additive only: no changes to base `LanguageModel`, no new dependencies
(stdlib only), 11 tests pass without claude/codex/gemini binaries
installed.
Files:
* new: synalinks/src/language_models/oauth_language_model.py
* new: synalinks/src/language_models/oauth_language_model_test.py
* mod: synalinks/src/language_models/__init__.py (registers in
ALL_OBJECTS for serialize/deserialize round-trip)
* mod: synalinks/api/__init__.py + synalinks/api/language_models/__init__.py
+ synalinks/__init__.py (regenerated by shell/api_gen.sh)
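The `provider/model` convention described in the commit message above can be illustrated with a small parser. This is a sketch under assumptions: `split_provider_model` is a hypothetical helper name, and the real class may validate the string differently.

```python
ALLOWED_PROVIDERS = ("claude", "codex", "gemini")


def split_provider_model(model: str) -> tuple[str, str]:
    """Split 'provider/model' and reject providers outside the allowed set."""
    provider, sep, name = model.partition("/")
    if not sep or provider not in ALLOWED_PROVIDERS:
        raise ValueError(
            f"model must look like 'provider/model' with provider in "
            f"{ALLOWED_PROVIDERS}, got {model!r}"
        )
    return provider, name


print(split_provider_model("codex/gpt-5.2"))  # ('codex', 'gpt-5.2')
```

Using `str.partition` splits on the first slash only, so a model name that itself contains a slash stays intact.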
c90349f to 7296041
TL;DR
Adds `synalinks.OAuthCLILanguageModel`, a drop-in `LanguageModel` subclass that bridges to the locally installed `claude`, `codex`, and `gemini` interactive CLIs as a subprocess. This unlocks an auth modality the framework cannot currently reach: OAuth-only subscriptions (Claude Max, ChatGPT Plus/Pro, Google account) where the user has no usable API key and therefore no path to litellm.
* `language_models/__init__.py` to register the class for serialization
The base `LanguageModel` is not modified.

Motivation: why this belongs in synalinks, not in user code
Today, every flagship hosted model in 2026 ships through OAuth subscription as the dominant or sole distribution channel for individual developers:
A user on a flat-rate subscription cannot today plug their preferred model into a synalinks `Generator`, `ChainOfThought`, or `FunctionCallingAgent`, because litellm requires an API key. They have to either:
The interactive CLIs (`claude`, `codex`, `gemini`) already authenticate locally via OAuth and already accept structured input/output flags. Wrapping them as a `LanguageModel` is a generic, framework-level concern: the subclass is the same regardless of who uses it, and the non-obvious bits (gemini HOME isolation, codex `--output-schema` strict mode) are the kind of hard-won knowledge that belongs in the framework rather than being rediscovered downstream.

What this adds
Public surface:
* `synalinks.OAuthCLILanguageModel(provider, model, timeout=180, retry=2, reasoning="medium", effort="low", caching=True, fallback=None)`
* `synalinks.language_models.get_oauth_cli_language_model()`: env-driven factory
* `SYNALINKS_CLI_{PROVIDER,MODEL,TIMEOUT,REASONING,EFFORT}`, `SYNALINKS_GEMINI_HOME`
Internal helpers (kept private, exported only for testing):
* `ask_llm_via_cli(provider, prompt, …)`: the low-level async call
* `_make_strict_schema(schema)`: codex strict-mode schema rewrite
* `_extract_json(text)`: JSON extraction fallback for claude/gemini
* `ensure_minimal_gemini_home()`: idempotent isolated HOME setup

Architecture
The class is a thin subclass of `LanguageModel`:
* `__init__` calls `super().__init__(model=f"oauth/{provider}/{cli_model or 'default'}", …)` so the canonical `model` string round-trips through the existing serialization machinery and the `oauth/` prefix is unambiguous.
* `__call__` is overridden and does not go through litellm. It formats the messages into a single prompt, dispatches to the right per-provider command builder, awaits `asyncio.subprocess`, and returns either the parsed JSON dict (when `schema` is given) or the standard `{role, content, tool_call_id, tool_calls, created_at, usage}` shape produced by the parent class.
* `streaming=True` raises explicitly: subprocess stdout buffering would make token-by-token streaming brittle and inconsistent with the litellm path.
* `get_config` / `from_config` round-trip every field including `fallback`.
The subprocess environment is built by stripping the API-key vars (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GOOGLE_API_KEY`, `GEMINI_API_KEY`, `XAI_API_KEY`) before exec. Without this, the CLIs auto-detect them and silently bypass OAuth, which on a subscription-only account either fails loudly (wrong auth) or silently routes to the wrong workspace.

Per-provider design notes (latency-critical)
These flag combinations were measured empirically. Each one is in there for a reason; please don't simplify them away without benchmarking.
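The environment scrubbing described under Architecture could be sketched as follows. `build_oauth_env` is a hypothetical name and the `overrides` parameter is an assumption added for illustration; the variable list is the one given above.

```python
import os

API_KEY_VARS = (
    "ANTHROPIC_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "GEMINI_API_KEY",
    "XAI_API_KEY",
)


def build_oauth_env(base_env=None, overrides=None) -> dict:
    """Copy the environment, drop API-key variables, then apply overrides."""
    source = os.environ if base_env is None else base_env
    env = {k: v for k, v in source.items() if k not in API_KEY_VARS}
    env.update(overrides or {})
    return env


env = build_oauth_env({"PATH": "/usr/bin", "OPENAI_API_KEY": "sk-test"})
print(env)  # {'PATH': '/usr/bin'}
```

The resulting dict would be passed as `env=` to the subprocess call, so the parent process's own environment is never modified.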
codex (`gpt-5.2` / `gpt-5.3-codex` / `gpt-5.4`)
* `--ephemeral` skips conversation persistence.
* `--skip-git-repo-check` saves a parent-dir scan we never need.
* `personality="none"` disables system-prompt injection.
* `--output-schema FILE` is the big one: when a JSON schema is provided, codex forwards it to the OpenAI Responses API with `strict=true`, so the server enforces the schema and we never have to retry on parse failures. Empirically this is faster than asking for JSON in the prompt.
* `reasoning=low` is the floor: `minimal` is rejected by the Responses API whenever `web_search` is attached, which it always is for ChatGPT accounts.
The strict-mode requirement of the Responses API is non-trivial: every `object` node must declare `additionalProperties: false` and every property key must appear in `required`. Synalinks-emitted schemas frequently omit optional fields from `required` and sometimes set `additionalProperties: true` for open param bags. `_make_strict_schema()` rewrites the schema recursively (including inside `$defs`) without mutating the input, so codex strict mode accepts it. There's a unit test for this.

claude (`claude-sonnet-4-6`, etc.)
* `--bare` is not used because it requires `ANTHROPIC_API_KEY`, which is exactly what an OAuth-subscription user does not have.
* `--tools ""` and `--disable-slash-commands` cut tool-discovery overhead.
* `--effort low` is exposed as a constructor knob; a synalinks Trainer/Optimizer can dial it up per call.
* `--json-schema` is not used: contrary to codex, claude's structured-output mode adds latency on this path. We fall back to prompt-based JSON extraction (instructive system prompt + balanced-brace + fenced-code parser).

gemini
with `HOME` rewritten to a minimal isolated directory:
* `~/.gemini/` may load hooks, skills, agents, and `enableAgents=true`, adding tens of seconds of cold-start overhead per invocation. A baseline gemini call against a real user HOME measured ~63s; with the isolated HOME it drops to 8–16s.
* `ensure_minimal_gemini_home()` creates `~/.synalinks_gemini_home/.gemini/` (or `$SYNALINKS_GEMINI_HOME` if set), symlinks just the OAuth credential files (`oauth_creds.json`, `google_accounts.json`, `installation_id`), and writes a minimal `settings.json` with `enableAgents=false` and `ide.enabled=false`. Idempotent and side-effect-free across calls.
* `-m` is mandatory: without it, gemini falls back to a default that triples latency.
* `-e ""` disables extensions; `--approval-mode yolo` bypasses the interactive approval prompt.

Why this shape, not a single "CLI runner"
The three CLIs differ enough that a unified abstraction would either lose the optimizations (slow) or expose ten knobs that only apply to one provider. Keeping three small command builders + one async dispatcher is clearer and lets reviewers verify each branch in isolation.
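As an illustration of the codex strict-mode rewrite described in the per-provider notes above, here is a minimal sketch. It is not the PR's `_make_strict_schema`: the name `make_strict_schema` and the set of JSON Schema keywords it walks are assumptions.

```python
import copy

MAP_KEYS = ("properties", "$defs", "definitions")  # keyword -> {name: subschema}
LIST_KEYS = ("anyOf", "oneOf", "allOf")  # keyword -> [subschema, ...]


def _visit(node) -> None:
    """Close every object node in place and mark all its properties required."""
    if not isinstance(node, dict):
        return
    props = node.get("properties")
    if node.get("type") == "object" or isinstance(props, dict):
        node["additionalProperties"] = False
        node["required"] = list(props) if isinstance(props, dict) else []
    for key in MAP_KEYS:
        children = node.get(key)
        if isinstance(children, dict):
            for child in children.values():
                _visit(child)
    for key in LIST_KEYS:
        children = node.get(key)
        if isinstance(children, list):
            for child in children:
                _visit(child)
    _visit(node.get("items"))


def make_strict_schema(schema: dict) -> dict:
    """Return a strict copy of schema; the caller's dict is left untouched."""
    strict = copy.deepcopy(schema)
    _visit(strict)
    return strict


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "opts": {"$ref": "#/$defs/Opts"}},
    "required": ["name"],
    "additionalProperties": True,
    "$defs": {"Opts": {"type": "object", "properties": {"x": {"type": "integer"}}}},
}
strict = make_strict_schema(schema)
```

Working on a deep copy is what keeps the input unmutated, which matters when the same schema object is reused across calls or providers.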
Trade-offs / honest caveats
These are documented in the module docstring; flagging them here too:
* `LanguageModel` (HTTP keep-alive) is preferable. This adapter is positioned as the OAuth fallback path, not the primary one.
* `streaming=True` raises explicitly. If you need streaming on OAuth, the right answer is to fix it inside the CLI vendors, not here.
* Cost reporting is `0.0`. OAuth subscriptions are flat-rate; per-call cost is genuinely unknown. We do not fake a number: `last_call_cost` and `cumulated_cost` stay at zero, so `Trainer` / `Optimizer` paths that sum cost will see zero. Documented in the docstring.
* `<cli> auth refresh` proactively; we removed that, since it added 1 file read + JSON parse to every call to save 1 retry per ~hour when a token expires. The CLIs handle their own refresh internally; net latency win is on the side of removal.
* `PATH` should manage that themselves.

Pros / Cons / Value
Value (why this PR specifically)
* `LanguageModel` cannot reach today.
* `Generator`, `ChainOfThought`, `FunctionCallingAgent` accepts it without modification; we exercised this downstream before opening the PR.

Pros
* `LanguageModel`. No behavior change for current users.
* `asyncio`, `subprocess`, `tempfile`, `pathlib`, `re`, `json`, `time`, `logging`).
* `LanguageModel` API surface: same `__call__(messages, schema, streaming)`, same `get_config` / `from_config`, same `register_synalinks_serializable` decorator.

Cons / honest caveats
(See Trade-offs above.)
Alternatives considered
* `LanguageModel` as the documented extension point (see `ALL_OBJECTS` in `language_models/__init__.py`).
* `LanguageModel`. Rejected as out of scope for this PR. If maintainers want, the `_extract_json` / `_make_strict_schema` helpers are lift-and-shift candidates for a follow-up refactor.

Testing
Colocated test file:
`synalinks/src/language_models/oauth_cli_language_model_test.py`
* `test_unknown_provider_returns_error_string`: `ask_llm_via_cli(provider="bogus")` returns the error sentinel, no subprocess spawned.
* `test_make_strict_schema_recursive`: feeds a nested object schema with `additionalProperties=True` and verifies the output sets it to `False` everywhere (including inside `$defs`) and adds every property to `required`. Also verifies the input is not mutated.
* `test_extract_json_balanced_and_fenced`: parses three text variants (raw JSON, ```json fence, leading prose + balanced braces) all to the same dict; verifies the no-JSON case returns `None`.
* `test_serialization_roundtrip`: instantiates with all knobs, calls `get_config()` then `from_config(config)`, and verifies every field round-trips. Also asserts the canonical `model` string is `oauth/{provider}/{model}`.
* `test_default_model_string_when_empty`: empty model → `oauth/{provider}/default`.
The full `synalinks/src/language_models/` suite (10 tests, including the 5 pre-existing `LanguageModel` tests) passes locally:
`uvx ruff check` and `uvx ruff format --check` clean.

Compatibility
* `LanguageModel` is untouched.
* `OAuthCLILanguageModel`; they are not imported.
* `synalinks.LanguageModel` is unchanged; the new class is additive at `synalinks.OAuthCLILanguageModel`.

Files
* `synalinks/src/language_models/oauth_cli_language_model.py` (new, ~660 lines)
* `synalinks/src/language_models/oauth_cli_language_model_test.py` (new, ~95 lines)
* `synalinks/src/language_models/__init__.py` (+3 lines: import + add to `ALL_OBJECTS`)
No regenerated `synalinks/api/` stubs in this commit; happy to add them in a follow-up commit if maintainers want them landed in the same PR. The `@synalinks_export` decorators dispatch at import time so the public path works without the regenerated stubs.

Naming / API bikeshed
Open to feedback on:
* `OAuthCLILanguageModel` vs `CLILanguageModel` vs `SubprocessLanguageModel`. Picked the OAuth-explicit name because that's the actual differentiator vs litellm.
* `SYNALINKS_CLI_*` to stay namespaced and avoid clashes. Open to `SYNALINKS_OAUTH_CLI_*` if maintainers prefer.
* `synalinks.language_models.get_oauth_cli_language_model`. Could go on the class as `OAuthCLILanguageModel.from_env()` instead.
Happy to iterate on any of these.
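To make the `from_env()` option concrete, here is one possible shape, sketched on a standalone settings holder rather than on the real class. `OAuthCLISettings` is hypothetical, and treating `PROVIDER` as required is an assumption; the variable names and default values are the ones listed under "What this adds".

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class OAuthCLISettings:
    """Hypothetical settings holder; defaults mirror the documented constructor."""

    provider: str
    model: str = ""
    timeout: int = 180
    reasoning: str = "medium"
    effort: str = "low"

    @classmethod
    def from_env(cls, environ=None, prefix="SYNALINKS_CLI_"):
        """Build settings from SYNALINKS_CLI_* variables, falling back to defaults."""
        env = os.environ if environ is None else environ
        provider = env.get(prefix + "PROVIDER")
        if not provider:
            raise ValueError(f"{prefix}PROVIDER is not set")
        return cls(
            provider=provider,
            model=env.get(prefix + "MODEL", cls.model),
            timeout=int(env.get(prefix + "TIMEOUT", cls.timeout)),
            reasoning=env.get(prefix + "REASONING", cls.reasoning),
            effort=env.get(prefix + "EFFORT", cls.effort),
        )


settings = OAuthCLISettings.from_env(
    {"SYNALINKS_CLI_PROVIDER": "codex", "SYNALINKS_CLI_TIMEOUT": "60"}
)
```

A classmethod keeps the env-reading logic discoverable next to the constructor and avoids a separate module-level factory, which is the trade-off raised in the last bullet.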