Commit fc5786c (parent fb808d8)

review: drip-78 batch 1 — sst/opencode async command_async + orchestrate subagent, litellm chatgpt-responses SSE non-stream output synthesis

- anomalyco/opencode#24459 merge-after-nits: 204-fast async command endpoint, drive-by import migration to split out
- anomalyco/opencode#24450 needs-discussion: orchestrate subagent semantics overlap with direct Task fanout
- BerriAI/litellm#26549 request-changes: 80-line transformation.py fix bundled with 31 _experimental/out HTML files + next bump

3 files changed: 110 additions & 0 deletions
# BerriAI/litellm PR #26549 — Fix/chatgpt gpt5.4 nonstream output

- **PR:** https://github.com/BerriAI/litellm/pull/26549
- **Head SHA:** `c5234f3db5f59bc10ab1e8525f9eff9be851f612`
- **Files:** 39 (+145 / -53)
- **Verdict:** `request-changes`

## What it does (substantive part)

In `litellm/llms/chatgpt/responses/transformation.py`, `transform_response_api_response` now (a) accumulates `response.output_text.delta` chunk text into `output_text_parts` and (b) when the final `response.completed` event arrives with an empty `output: []` field but `output_text_parts` is non-empty, synthesizes a single `{"type":"message","role":"assistant","content":[{"type":"output_text","text":"..."}]}` entry on the `completed_response_payload`. Test `test_chatgpt_non_stream_sse_response_parsing_empty_completed_output` at `tests/test_litellm/llms/chatgpt/responses/test_chatgpt_responses_transformation.py:201-235` covers the deltas-then-empty-completed shape.
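The accumulate-then-synthesize flow described above can be sketched as follows. This is a minimal illustration, not the real `transform_response_api_response` internals: the function name `synthesize_output`, the flat event list, and the event dict shapes are all assumptions for the sketch.

```python
def synthesize_output(events: list[dict]) -> dict:
    """Sketch of the delta-accumulation fallback the PR adds.

    `events` is a hypothetical list of already-parsed SSE events; the real
    code lives in litellm/llms/chatgpt/responses/transformation.py.
    """
    output_text_parts: list[str] = []
    completed_response_payload: dict | None = None

    for event in events:
        if event.get("type") == "response.output_text.delta":
            content_part = event.get("delta")
            # Falsy/None deltas are skipped, as in the PR.
            if isinstance(content_part, str) and content_part:
                output_text_parts.append(content_part)
            continue
        if event.get("type") == "response.completed":
            # Shallow copy so later mutation doesn't leak into the original.
            completed_response_payload = dict(event.get("response", {}))

    if completed_response_payload is None:
        raise ValueError("no response.completed event seen")

    # Fallback fires only when upstream left `output` empty AND text accumulated.
    if not completed_response_payload.get("output") and output_text_parts:
        completed_response_payload["output"] = [{
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text",
                         "text": "".join(output_text_parts)}],
        }]
    return completed_response_payload
```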

## Specific reads

- `transformation.py:135-137` — adds `completed_response_payload = None` and `output_text_parts = []` accumulators. Clean.
- `transformation.py:155-159` — `OUTPUT_TEXT_DELTA` branch does `if isinstance(content_part, str) and content_part: output_text_parts.append(content_part)`; falsy/None deltas are correctly skipped. The `continue` is correct.
- `transformation.py:165` — `completed_response_payload = response_payload` is captured *after* the `dict(response_payload)` shallow copy. Good — later mutation won't leak into the original.
- `transformation.py:189-218` — fallback synthesis only fires when `not completed_response_payload.get("output") and len(output_text_parts) > 0`. Both gates are correct: skip fallback if upstream filled `output`, skip if no text accumulated.
- `transformation.py:209` — re-assigns `created_at` via `_safe_convert_created_field`, but the earlier branch at `transformation.py:172` already converted it on the same payload object, so it runs twice. `_safe_convert_created_field` is presumably idempotent on already-converted values, but worth confirming.
- `transformation.py:215-218` — `try: ResponsesAPIResponse(**payload) except: model_construct(...)`. Bare `except` swallows `KeyboardInterrupt`/`SystemExit` — should be `except Exception`.
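The bare-except tightening in the last bullet looks like this in isolation. `ResponsesAPIResponse` here is a stand-in class mimicking the Pydantic validate-vs-construct split, not the real litellm import:

```python
# Stand-in for the real Pydantic model: __init__ validates, model_construct skips.
class ResponsesAPIResponse:
    def __init__(self, **payload):
        if "id" not in payload:  # stand-in for real field validation
            raise ValueError("missing id")
        self.__dict__.update(payload)

    @classmethod
    def model_construct(cls, **payload):
        obj = cls.__new__(cls)  # bypass validation, like Pydantic's model_construct
        obj.__dict__.update(payload)
        return obj


def build_response(payload: dict) -> ResponsesAPIResponse:
    try:
        return ResponsesAPIResponse(**payload)
    # `except Exception`, not bare `except:` -- genuine interrupts like
    # KeyboardInterrupt/SystemExit still propagate.
    except Exception:
        return ResponsesAPIResponse.model_construct(**payload)
```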

## Why request-changes

The transformation fix is correct and the new test is well-shaped. The blocking issue is the **PR scope**:

1. **37 unrelated files** dragged in with the fix:
   - 31× `litellm/proxy/_experimental/out/**/index.html` — static export bundle of the dashboard (`next export` output). These get regenerated on every dashboard build and should never live in a behavior-fix PR; they make the diff unreviewable and create noisy merge conflicts on every dashboard PR.
   - `ui/litellm-dashboard/package-lock.json` + `package.json` — `next` bumped from `16.1.7` → `^16.2.4`. A Next.js point-release bump and a server-side SSE parser fix have no business in the same PR. Either is fine; together they're impossible to bisect.
   - The 31 HTML files include compiled JS/CSS hashes — any reviewer trying to assess "did the dashboard change behaviorally" has to manually diff hashed bundles.
2. **Split required**:
   - **Commit A (this PR's value)**: the 80-line `transformation.py` change + the new test. Should land as one focused PR titled e.g. `fix(chatgpt-responses): synthesize output from output_text.delta when completed.output is empty`.
   - **Commit B**: the `next` minor bump alone, with a one-line note about why.
   - **Commit C**: the `_experimental/out/**` bundle regeneration. Honestly this should be a CI-managed artifact and not committed at all, but if it must be committed, it should be a maintainer's mechanical regeneration commit, not bundled with a fix.
3. **Bare `except`** at `transformation.py:215`: tighten to `except Exception` so genuine interrupts propagate.
4. **`_safe_convert_created_field` double-call** at `transformation.py:172` and `:209`: confirm idempotence in a comment, or guard with an `_already_converted` sentinel.
5. **Test gap**: add a "deltas accumulated but `completed.output` is non-empty" case asserting the upstream output is preserved verbatim and the delta buffer is *not* substituted in. The current code is correct on that branch, but a regression there would silently double-emit assistant text.
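The missing test in item 5 could be shaped like this. `apply_fallback` is a stand-in for the gate in `transformation.py`, not the real function name; the real test would exercise the full transformer:

```python
def apply_fallback(completed_payload: dict, output_text_parts: list[str]) -> dict:
    """Stand-in for the synthesis gate under test."""
    payload = dict(completed_payload)
    if not payload.get("output") and output_text_parts:
        payload["output"] = [{
            "type": "message", "role": "assistant",
            "content": [{"type": "output_text",
                         "text": "".join(output_text_parts)}],
        }]
    return payload


def test_deltas_accumulated_but_completed_output_nonempty():
    upstream_output = [{"type": "message", "role": "assistant",
                        "content": [{"type": "output_text", "text": "upstream"}]}]
    result = apply_fallback({"output": upstream_output}, ["buffered ", "deltas"])
    # Upstream output must be preserved verbatim; the delta buffer must NOT
    # be substituted in, or assistant text would be double-emitted.
    assert result["output"] == upstream_output
```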
44+
45+
The fix itself, isolated, is `merge-as-is`. As submitted, the 37-file unrelated payload makes me block.
# sst/opencode PR #24450 — feat(agent): add orchestrate subagent for task decomposition

- **PR:** https://github.com/sst/opencode/pull/24450
- **Head SHA:** `b05be71dfe2fa948f8d74c922e0fb963c182e483`
- **Files:** 3 (+86 / -0)
- **Verdict:** `needs-discussion`

## What it does

Registers a new built-in `orchestrate` subagent in `packages/opencode/src/agent/agent.ts:189-205` with a 66-line prompt at `packages/opencode/src/agent/prompt/orchestrate.txt`. The agent is granted `question:allow`, `task:allow`, `todowrite:allow` via `Permission.fromConfig({...})`, and defaults to `mode: "subagent"` with `native: true` and empty `options: {}`. The Task tool blurb at `packages/opencode/src/tool/task.txt:7,27` gets two new lines pointing toward `orchestrate` for parallelizable work.

## Specific reads

- `agent.ts:189-205` — registration block follows the `general`/`build`/`plan` template exactly; permission merge order `(defaults, fromConfig({...}), user)` is identical to siblings, so user `agent.json` overrides still work as expected.
- `orchestrate.txt:42-45` — Agent Types Reference table lists `explore`, `general`, `build`, `plan`. This is hardcoded into the prompt — if a user adds a custom subagent or ships an `arena`/`compaction` agent, the orchestrator will pick from a stale list.
- `orchestrate.txt:54-58` — example decomposition for "Migrate the authentication system to OAuth2" routes design to `plan` then implementation to two `general` calls. There's no guidance on how the orchestrator decides parallel-vs-sequential beyond a one-line "if Task B needs output from Task A, execute sequentially" — leaving the actual scheduler logic implicit.
- `task.txt:27` — the new example agent description is inserted *above* `code-reviewer` and `greeting-responder` in the `<example_agent_descriptions>` block. That block is explicitly labeled "fictional examples for illustration only" — putting a real registered agent in the fictional-examples block is confusing.

## Why needs-discussion

1. **Task-tool semantics overlap**: `task.txt` already documents how the calling agent should invoke `Task(...)` directly. An "orchestrator" that itself fans out `Task(...)` calls adds a layer of indirection where a primary agent issues `Task(orchestrate)` which then issues N `Task(general)` calls. That's three context windows deep. What's the maintainer-intended boundary between "primary agent should just call Task() N times" vs "primary agent should delegate the planning to orchestrate"? The PR doesn't answer this and the prompt copy doesn't either.
2. **Token cost**: each fanout multiplies token spend; `orchestrate` itself is a full LLM call to produce a 2-5 item plan. For a 2-subtask job this is strictly more expensive than the primary agent calling Task twice directly. Where's the break-even, and should `orchestrate` be gated to N≥3 subtasks?
3. **Hardcoded agent table goes stale**: as noted, the prompt enumerates agent types. Better to inject the live agent list into the prompt at runtime (the way other meta-agents see the available subagents) so user-added agents are reachable.
4. **`question:allow` for a subagent**: the orchestrate prompt never tells the agent to ask the user questions — it tells it to decompose and dispatch. Granting `question` to a non-interactive subagent invites mid-orchestration prompts that break the parent flow. Consider `question: "deny"` unless there's a deliberate reason.
5. **No tests / no eval**: an orchestration meta-agent is exactly the kind of feature that needs a deterministic eval (input task → expected number+shape of subtasks) to prevent silent prompt regressions. None added.
6. **Drive-by `task.txt` example placement**: as noted, listing a real agent in the "fictional examples" block is wrong. Move it to a real Available Agents section.
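The runtime-injection idea in item 3 amounts to rendering the agent table from the live registry at prompt-build time. The opencode codebase is TypeScript; this Python sketch uses hypothetical names (`render_orchestrate_prompt`, the `{{AGENT_TABLE}}` placeholder) purely to illustrate the shape:

```python
def render_orchestrate_prompt(template: str, agents: dict[str, str]) -> str:
    """Fill a hypothetical {{AGENT_TABLE}} placeholder from the live registry.

    `agents` maps agent name -> one-line description, so custom or newly
    shipped subagents appear without editing orchestrate.txt.
    """
    table = "\n".join(f"- {name}: {desc}" for name, desc in sorted(agents.items()))
    return template.replace("{{AGENT_TABLE}}", table)
```

With this shape, adding an `arena` agent to the registry makes it reachable by the orchestrator with no prompt-file change.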
# sst/opencode PR #24459 — feat(opencode): add async command endpoint

- **PR:** https://github.com/sst/opencode/pull/24459
- **Head SHA:** `4dcbef32a1ce49cb808ffd37814821b352c7d336`
- **Files:** 4 (+154 / -4)
- **Verdict:** `merge-after-nits`

## What it does

Adds `POST /session/:sessionID/command_async` mirroring the existing async session routes. Handler at `packages/opencode/src/server/routes/instance/session.ts:965-1003` delegates to `SessionPrompt.Service.command(...)` via `runRequest(...)`, returns `204 No Content` immediately, and on background failure publishes `Session.Event.Error` with a `NamedError.Unknown`. Compression middleware at `packages/opencode/src/server/middleware.ts:90` is updated to also exclude the new route from gzip (correct — async endpoints have empty bodies, gzip is a small CPU hit for nothing). SDK `gen/sdk.gen.ts` adds a `commandAsync(...)` method and `gen/types.gen.ts` adds the matching `SessionCommandAsync*` type quartet.
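The handler shape described above (ack immediately, run in the background, publish an error event on failure) is TypeScript/Hono in the PR; the same pattern can be sketched in Python with asyncio. `run_command` and `bus_publish` are hypothetical stand-ins, not opencode APIs:

```python
import asyncio

# Keep strong references so fire-and-forget tasks aren't GC'd mid-flight.
_background_tasks: set = set()


async def command_async_handler(session_id: str, body: dict,
                                run_command, bus_publish) -> int:
    async def background() -> None:
        try:
            await run_command(session_id, body)
        except Exception as err:
            # Mirrors publishing Session.Event.Error with the stringified error.
            await bus_publish("session.error",
                             {"sessionID": session_id, "error": str(err)})

    task = asyncio.create_task(background())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return 204  # No Content: the ack precedes completion
```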

## Specific reads

- `middleware.ts:90` — regex now `^/session/[^/]+/(message|prompt_async|command_async)$`. Pattern is consistent with the `prompt_async` precedent.
- `session.ts:984-997` — `void runRequest(...).catch(err => log.error + Bus.publish(Session.Event.Error, ...))`. The `void` keyword is intentional and necessary so the Hono handler isn't accidentally `await`ed. Good.
- `session.ts:990` — error is wrapped with `err instanceof Error ? err.message : String(err)`. Pragmatic.
- The two `NamedError` import paths (`session.ts:28`, `middleware.ts:2`) silently switch from `@opencode-ai/core/util/error` → `@opencode-ai/shared/util/error`. That's a drive-by import-path migration that should ideally land in a separate refactor commit, not folded into a feature PR — easy to miss in review and harder to revert.

## Nits before merge

1. **Drive-by import migration**: split the two `NamedError` import-path swaps and the `Flag` import swap (`middleware.ts:11` → `@/flag/flag`) out into their own chore commit so the feature is reviewable in isolation.
2. **No test added**: the existing `prompt_async` route presumably has a test that asserts (a) returns `204` quickly, (b) error from background work surfaces via `Session.Event.Error`. Mirror it for `command_async` — both branches are critical and currently untested.
3. **`runRequest`'s second arg is `c`**: the Hono `Context` is captured into a fire-and-forget background task. Confirm `c` doesn't hold request-scoped state (e.g. an aborted `AbortSignal`) that becomes invalid the moment the handler returns `204`. If `runRequest` reads anything off `c.req` after the handler exits, the request body may already be GC'd.
4. **OpenAPI `204` vs error events**: clients that get `204` then receive a delayed `Session.Event.Error` need a way to correlate the two. Document in the route description that callers should subscribe to `Session.Event.Error` filtered by `sessionID` *before* posting, or switch to `202 Accepted` with a `messageID` in the body (`204` cannot carry one; a header-only ack via `Location:` is another pattern worth considering).
5. **`compression` carve-out comment**: at `middleware.ts:90` add an inline comment that the route returns `204` with no body so gzip is wasted work, mirroring whatever's documented for `prompt_async`. Otherwise the regex grows silently.
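The fix for nit 3 is to snapshot whatever the background task needs out of the request context before the handler returns, so the task never closes over the context itself. A language-neutral illustration (the real code is TypeScript; `ctx` and `schedule` here are hypothetical stand-ins for the Hono Context and the fire-and-forget scheduler):

```python
def command_async(ctx: dict, schedule) -> int:
    """Snapshot request-scoped state before returning the 204 ack."""
    # Copied out while the request is still live; after the handler
    # returns, ctx may be torn down or its body stream consumed.
    session_id = ctx["params"]["sessionID"]
    body = dict(ctx["body"])  # defensive copy, not a live reference

    def background():
        # Only the snapshot is visible here, never ctx itself.
        return session_id, body

    schedule(background)
    return 204
```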
