Commit fc5786c (parent fb808d8)

review: drip-78 batch 1 — sst/opencode async command_async + orchestrate subagent, litellm chatgpt-responses SSE non-stream output synthesis

- anomalyco/opencode#24459 merge-after-nits: 204-fast async command endpoint, drive-by import migration to split out
- anomalyco/opencode#24450 needs-discussion: orchestrate subagent semantics overlap with direct Task fanout
- BerriAI/litellm#26549 request-changes: 80-line transformation.py fix bundled with 31 _experimental/out HTML files + next bump

3 files changed: 110 additions & 0 deletions
# BerriAI/litellm PR #26549 — Fix/chatgpt gpt5.4 nonstream output

- **PR:** https://github.com/BerriAI/litellm/pull/26549
- **Head SHA:** `c5234f3db5f59bc10ab1e8525f9eff9be851f612`
- **Files:** 39 (+145 / -53)
- **Verdict:** `request-changes`

## What it does (substantive part)

In `litellm/llms/chatgpt/responses/transformation.py`, `transform_response_api_response` now (a) accumulates `response.output_text.delta` chunk text into `output_text_parts` and (b) when the final `response.completed` event arrives with an empty `output: []` field but `output_text_parts` is non-empty, synthesizes a single `{"type":"message","role":"assistant","content":[{"type":"output_text","text":"..."}]}` entry on the `completed_response_payload`. Test `test_chatgpt_non_stream_sse_response_parsing_empty_completed_output` at `tests/test_litellm/llms/chatgpt/responses/test_chatgpt_responses_transformation.py:201-235` covers the deltas-then-empty-completed shape.
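The accumulate-then-synthesize flow described above can be sketched as follows. This is a minimal illustration, not the real `transform_response_api_response` internals: the function name `synthesize_output`, the flat event list, and the event dict shapes are all assumptions for the sketch.

```python
def synthesize_output(events: list[dict]) -> dict:
    """Sketch of the delta-accumulation fallback the PR adds.

    `events` is a hypothetical list of already-parsed SSE events; the real
    code lives in litellm/llms/chatgpt/responses/transformation.py.
    """
    output_text_parts: list[str] = []
    completed_response_payload: dict | None = None

    for event in events:
        if event.get("type") == "response.output_text.delta":
            content_part = event.get("delta")
            # Falsy/None deltas are skipped, as in the PR.
            if isinstance(content_part, str) and content_part:
                output_text_parts.append(content_part)
            continue
        if event.get("type") == "response.completed":
            # Shallow copy so later mutation doesn't leak into the original.
            completed_response_payload = dict(event.get("response", {}))

    if completed_response_payload is None:
        raise ValueError("no response.completed event seen")

    # Fallback fires only when upstream left `output` empty AND text accumulated.
    if not completed_response_payload.get("output") and output_text_parts:
        completed_response_payload["output"] = [{
            "type": "message",
            "role": "assistant",
            "content": [{"type": "output_text",
                         "text": "".join(output_text_parts)}],
        }]
    return completed_response_payload
```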

## Specific reads

- `transformation.py:135-137` — adds `completed_response_payload = None` and `output_text_parts = []` accumulators. Clean.
- `transformation.py:155-159` — `OUTPUT_TEXT_DELTA` branch does `if isinstance(content_part, str) and content_part: output_text_parts.append(content_part)`; falsy/None deltas are correctly skipped. The `continue` is correct.
- `transformation.py:165` — `completed_response_payload = response_payload` is captured *after* the `dict(response_payload)` shallow copy. Good — later mutation won't leak into the original.
- `transformation.py:189-218` — fallback synthesis only fires when `not completed_response_payload.get("output") and len(output_text_parts) > 0`. Both gates are correct: skip fallback if upstream filled `output`, skip if no text accumulated.
- `transformation.py:209` — re-assigns `created_at` via `_safe_convert_created_field`, but the earlier branch at `transformation.py:172` already converted it on the same payload object, so it runs twice. `_safe_convert_created_field` is presumably idempotent on already-converted values, but worth confirming.
- `transformation.py:215-218` — `try: ResponsesAPIResponse(**payload) except: model_construct(...)`. Bare `except` swallows `KeyboardInterrupt`/`SystemExit` — should be `except Exception`.
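The bare-except tightening in the last bullet looks like this in isolation. `ResponsesAPIResponse` here is a stand-in class mimicking the Pydantic validate-vs-construct split, not the real litellm import:

```python
# Stand-in for the real Pydantic model: __init__ validates, model_construct skips.
class ResponsesAPIResponse:
    def __init__(self, **payload):
        if "id" not in payload:  # stand-in for real field validation
            raise ValueError("missing id")
        self.__dict__.update(payload)

    @classmethod
    def model_construct(cls, **payload):
        obj = cls.__new__(cls)  # bypass validation, like Pydantic's model_construct
        obj.__dict__.update(payload)
        return obj


def build_response(payload: dict) -> ResponsesAPIResponse:
    try:
        return ResponsesAPIResponse(**payload)
    # `except Exception`, not bare `except:` -- genuine interrupts like
    # KeyboardInterrupt/SystemExit still propagate.
    except Exception:
        return ResponsesAPIResponse.model_construct(**payload)
```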

## Why request-changes

The transformation fix is correct and the new test is well-shaped. The blocking issue is the **PR scope**:

1. **37 unrelated files** dragged in with the fix:
   - 31× `litellm/proxy/_experimental/out/**/index.html` — static export bundle of the dashboard (`next export` output). These get regenerated on every dashboard build and should never live in a behavior-fix PR; they make the diff unreviewable and create noisy merge conflicts on every dashboard PR.
   - `ui/litellm-dashboard/package-lock.json` + `package.json` — `next` bumped from `16.1.7` → `^16.2.4`. A Next.js point-release bump and a server-side SSE parser fix have no business in the same PR. Either is fine; together they're impossible to bisect.
   - The 31 HTML files include compiled JS/CSS hashes — any reviewer trying to assess "did the dashboard change behaviorally" has to manually diff hashed bundles.
2. **Split required**:
   - **Commit A (this PR's value)**: the 80-line `transformation.py` change + the new test. Should land as one focused PR titled e.g. `fix(chatgpt-responses): synthesize output from output_text.delta when completed.output is empty`.
   - **Commit B**: the `next` minor bump alone, with a one-line note about why.
   - **Commit C**: the `_experimental/out/**` bundle regeneration. Honestly this should be a CI-managed artifact and not committed at all, but if it must be committed, it should be a maintainer's mechanical regeneration commit, not bundled with a fix.
3. **Bare `except`** at `transformation.py:215`: tighten to `except Exception` so genuine interrupts propagate.
4. **`_safe_convert_created_field` double-call** at `transformation.py:172` and `:209`: confirm idempotence in a comment, or guard with an `_already_converted` sentinel.
5. **Test gap**: add a "deltas accumulated but `completed.output` is non-empty" case asserting the upstream output is preserved verbatim and the delta buffer is *not* substituted in. The current code is correct on that branch, but a regression there would silently double-emit assistant text.
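The missing test in item 5 could be shaped like this. `apply_fallback` is a stand-in for the gate in `transformation.py`, not the real function name; the real test would exercise the full transformer:

```python
def apply_fallback(completed_payload: dict, output_text_parts: list[str]) -> dict:
    """Stand-in for the synthesis gate under test."""
    payload = dict(completed_payload)
    if not payload.get("output") and output_text_parts:
        payload["output"] = [{
            "type": "message", "role": "assistant",
            "content": [{"type": "output_text",
                         "text": "".join(output_text_parts)}],
        }]
    return payload


def test_deltas_accumulated_but_completed_output_nonempty():
    upstream_output = [{"type": "message", "role": "assistant",
                        "content": [{"type": "output_text", "text": "upstream"}]}]
    result = apply_fallback({"output": upstream_output}, ["buffered ", "deltas"])
    # Upstream output must be preserved verbatim; the delta buffer must NOT
    # be substituted in, or assistant text would be double-emitted.
    assert result["output"] == upstream_output
```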
44+
45+
The fix itself, isolated, is `merge-as-is`. As submitted, the 37-file unrelated payload makes me block.
# sst/opencode PR #24450 — feat(agent): add orchestrate subagent for task decomposition

- **PR:** https://github.com/sst/opencode/pull/24450
- **Head SHA:** `b05be71dfe2fa948f8d74c922e0fb963c182e483`
- **Files:** 3 (+86 / -0)
- **Verdict:** `needs-discussion`

## What it does

Registers a new built-in `orchestrate` subagent in `packages/opencode/src/agent/agent.ts:189-205` with a 66-line prompt at `packages/opencode/src/agent/prompt/orchestrate.txt`. The agent is granted `question:allow`, `task:allow`, `todowrite:allow` via `Permission.fromConfig({...})`, and defaults to `mode: "subagent"` with `native: true` and empty `options: {}`. The Task tool blurb at `packages/opencode/src/tool/task.txt:7,27` gets two new lines pointing toward `orchestrate` for parallelizable work.

## Specific reads

- `agent.ts:189-205` — registration block follows the `general`/`build`/`plan` template exactly; permission merge order `(defaults, fromConfig({...}), user)` is identical to siblings, so user `agent.json` overrides still work as expected.
- `orchestrate.txt:42-45` — Agent Types Reference table lists `explore`, `general`, `build`, `plan`. This is hardcoded into the prompt — if a user adds a custom subagent or ships an `arena`/`compaction` agent, the orchestrator will pick from a stale list.
- `orchestrate.txt:54-58` — example decomposition for "Migrate the authentication system to OAuth2" routes design to `plan` then implementation to two `general` calls. There's no guidance on how the orchestrator decides parallel-vs-sequential beyond a one-line "if Task B needs output from Task A, execute sequentially" — leaving the actual scheduler logic implicit.
- `task.txt:27` — the new example agent description is inserted *above* `code-reviewer` and `greeting-responder` in the `<example_agent_descriptions>` block. That block is explicitly labeled "fictional examples for illustration only" — putting a real registered agent in the fictional-examples block is confusing.

## Why needs-discussion

1. **Task-tool semantics overlap**: `task.txt` already documents how the calling agent should invoke `Task(...)` directly. An "orchestrator" that itself fans out `Task(...)` calls adds a layer of indirection where a primary agent issues `Task(orchestrate)` which then issues N `Task(general)` calls. That's three context windows deep. What's the maintainer-intended boundary between "primary agent should just call Task() N times" vs "primary agent should delegate the planning to orchestrate"? The PR doesn't answer this and the prompt copy doesn't either.
2. **Token cost**: each fanout multiplies token spend; `orchestrate` itself is a full LLM call to produce a 2-5 item plan. For a 2-subtask job this is strictly more expensive than the primary agent calling Task twice directly. Where's the break-even, and should `orchestrate` be gated to N≥3 subtasks?
3. **Hardcoded agent table goes stale**: as noted, the prompt enumerates agent types. Better to inject the live agent list into the prompt at runtime (the way other meta-agents see the available subagents) so user-added agents are reachable.
4. **`question:allow` for a subagent**: the orchestrate prompt never tells the agent to ask the user questions — it tells it to decompose and dispatch. Granting `question` to a non-interactive subagent invites mid-orchestration prompts that break the parent flow. Consider `question: "deny"` unless there's a deliberate reason.
5. **No tests / no eval**: an orchestration meta-agent is exactly the kind of feature that needs a deterministic eval (input task → expected number+shape of subtasks) to prevent silent prompt regressions. None added.
6. **Drive-by `task.txt` example placement**: as noted, listing a real agent in the "fictional examples" block is wrong. Move it to a real Available Agents section.
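The runtime-injection idea in item 3 amounts to rendering the agent table from the live registry at prompt-build time. The opencode codebase is TypeScript; this Python sketch uses hypothetical names (`render_orchestrate_prompt`, the `{{AGENT_TABLE}}` placeholder) purely to illustrate the shape:

```python
def render_orchestrate_prompt(template: str, agents: dict[str, str]) -> str:
    """Fill a hypothetical {{AGENT_TABLE}} placeholder from the live registry.

    `agents` maps agent name -> one-line description, so custom or newly
    shipped subagents appear without editing orchestrate.txt.
    """
    table = "\n".join(f"- {name}: {desc}" for name, desc in sorted(agents.items()))
    return template.replace("{{AGENT_TABLE}}", table)
```

With this shape, adding an `arena` agent to the registry makes it reachable by the orchestrator with no prompt-file change.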
# sst/opencode PR #24459 — feat(opencode): add async command endpoint

- **PR:** https://github.com/sst/opencode/pull/24459
- **Head SHA:** `4dcbef32a1ce49cb808ffd37814821b352c7d336`
- **Files:** 4 (+154 / -4)
- **Verdict:** `merge-after-nits`

## What it does

Adds `POST /session/:sessionID/command_async` mirroring the existing async session routes. Handler at `packages/opencode/src/server/routes/instance/session.ts:965-1003` delegates to `SessionPrompt.Service.command(...)` via `runRequest(...)`, returns `204 No Content` immediately, and on background failure publishes `Session.Event.Error` with a `NamedError.Unknown`. Compression middleware at `packages/opencode/src/server/middleware.ts:90` is updated to also exclude the new route from gzip (correct — async endpoints have empty bodies, gzip is a small CPU hit for nothing). SDK `gen/sdk.gen.ts` adds a `commandAsync(...)` method and `gen/types.gen.ts` adds the matching `SessionCommandAsync*` type quartet.
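The handler shape described above (ack immediately, run in the background, publish an error event on failure) is TypeScript/Hono in the PR; the same pattern can be sketched in Python with asyncio. `run_command` and `bus_publish` are hypothetical stand-ins, not opencode APIs:

```python
import asyncio

# Keep strong references so fire-and-forget tasks aren't GC'd mid-flight.
_background_tasks: set = set()


async def command_async_handler(session_id: str, body: dict,
                                run_command, bus_publish) -> int:
    async def background() -> None:
        try:
            await run_command(session_id, body)
        except Exception as err:
            # Mirrors publishing Session.Event.Error with the stringified error.
            await bus_publish("session.error",
                             {"sessionID": session_id, "error": str(err)})

    task = asyncio.create_task(background())
    _background_tasks.add(task)
    task.add_done_callback(_background_tasks.discard)
    return 204  # No Content: the ack precedes completion
```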

## Specific reads

- `middleware.ts:90` — regex now `^/session/[^/]+/(message|prompt_async|command_async)$`. Pattern is consistent with the `prompt_async` precedent.
- `session.ts:984-997` — `void runRequest(...).catch(err => log.error + Bus.publish(Session.Event.Error, ...))`. The `void` keyword is intentional and necessary so the Hono handler isn't accidentally `await`ed. Good.
- `session.ts:990` — error is wrapped with `err instanceof Error ? err.message : String(err)`. Pragmatic.
- The two `NamedError` import paths (`session.ts:28`, `middleware.ts:2`) silently switch from `@opencode-ai/core/util/error` → `@opencode-ai/shared/util/error`. That's a drive-by import-path migration that should ideally land in a separate refactor commit, not folded into a feature PR — easy to miss in review and harder to revert.

## Nits before merge

1. **Drive-by import migration**: split the two `NamedError` import-path swaps and the `Flag` import swap (`middleware.ts:11` → `@/flag/flag`) out into their own chore commit so the feature is reviewable in isolation.
2. **No test added**: the existing `prompt_async` route presumably has a test that asserts (a) returns `204` quickly, (b) error from background work surfaces via `Session.Event.Error`. Mirror it for `command_async` — both branches are critical and currently untested.
3. **`runRequest`'s second arg is `c`**: the Hono `Context` is captured into a fire-and-forget background task. Confirm `c` doesn't hold request-scoped state (e.g. an aborted `AbortSignal`) that becomes invalid the moment the handler returns `204`. If `runRequest` reads anything off `c.req` after the handler exits, the request body may already be GC'd.
4. **OpenAPI `204` vs error events**: clients that get `204` then receive a delayed `Session.Event.Error` need a way to correlate the two. Document in the route description that callers should subscribe to `Session.Event.Error` filtered by `sessionID` *before* posting, or switch to `202 Accepted` with a `messageID` in the body (`204` cannot carry one; a header-only ack via `Location:` is another pattern worth considering).
5. **`compression` carve-out comment**: at `middleware.ts:90` add an inline comment that the route returns `204` with no body so gzip is wasted work, mirroring whatever's documented for `prompt_async`. Otherwise the regex grows silently.
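The fix for nit 3 is to snapshot whatever the background task needs out of the request context before the handler returns, so the task never closes over the context itself. A language-neutral illustration (the real code is TypeScript; `ctx` and `schedule` here are hypothetical stand-ins for the Hono Context and the fire-and-forget scheduler):

```python
def command_async(ctx: dict, schedule) -> int:
    """Snapshot request-scoped state before returning the 204 ack."""
    # Copied out while the request is still live; after the handler
    # returns, ctx may be torn down or its body stream consumed.
    session_id = ctx["params"]["sessionID"]
    body = dict(ctx["body"])  # defensive copy, not a live reference

    def background():
        # Only the snapshot is visible here, never ctx itself.
        return session_id, body

    schedule(background)
    return 204
```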
