fix(acp): drain message events before returning end_turn #25683
truenorth-lj wants to merge 1 commit into anomalyco:dev
`Agent.prompt()` returns `stopReason: "end_turn"` as soon as
`sdk.session.prompt()` resolves, but `message.part.delta` events for the
final assistant message text are still queued in the SDK event stream at
that moment. They get processed by `runEventSubscription` and forwarded
to ACP as `agent_message_chunk` frames AFTER the RPC reply has already
been sent — a protocol violation visible to ACP clients as text
appearing post-end_turn.
Cause: two independent async paths share the same ACP wire.
- Path A — event subscription (`runEventSubscription`): consumes
`sdk.global.event()` and forwards `message.part.delta` as
`agent_message_chunk` via `connection.sessionUpdate(...)`.
- Path B — prompt RPC: `await sdk.session.prompt(...)` resolves when
the LLM finishes, then immediately returns `end_turn`.
Path B can return before Path A drains the trailing deltas. Order on the
wire is then: ... earlier chunks ... → end_turn reply → trailing chunk.
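The ordering described above can be reproduced in miniature. The following is a simulation sketch, not opencode code: `simulateRace`, `sleep`, and the `Frame` type are hypothetical stand-ins, and the delays are arbitrary values chosen so that the reply lands while deltas are still queued.

```typescript
// Hypothetical stand-ins: none of these names come from the opencode SDK or ACP.
type Frame =
  | { kind: "chunk"; text: string }
  | { kind: "reply"; stopReason: "end_turn" };

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function simulateRace(): Promise<Frame[]> {
  const wire: Frame[] = [];
  const queuedDeltas = ["Hel", "lo", "!"];

  // Path A: the event subscription forwards queued deltas one at a time.
  const pathA = (async () => {
    for (const text of queuedDeltas) {
      await sleep(20); // simulated cost of forwarding one event
      wire.push({ kind: "chunk", text });
    }
  })();

  // Path B: the prompt RPC resolves when the model is done and replies
  // immediately, without checking whether Path A has caught up.
  const pathB = (async () => {
    await sleep(30);
    wire.push({ kind: "reply", stopReason: "end_turn" });
  })();

  await Promise.all([pathA, pathB]);
  return wire;
}
```

With these delays the reply is written after the first chunk and before the remaining two, which mirrors the pre-fix wire order.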
Fix: in `prompt()`, after `sdk.session.prompt()` resolves, await the
`message.updated` event for the response message id (i.e. `info.time.completed`
set). Because `runEventSubscription` processes events sequentially via
`for await` and awaits each `handleEvent`, the `message.updated`
(completed) event for a message is necessarily processed AFTER all
prior `message.part.delta` events for the same message — so waiting on
it guarantees every chunk has already been forwarded.
A 5s timeout fallback prevents deadlock if the upstream completion
event is never observed.
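A per-message waiter with a timeout fallback could look roughly like the sketch below. This is an illustration of the technique, not the code in the patch: `CompletionWaiters`, `wait`, and `complete` are hypothetical names, and the handling of a completion that arrives before anyone waits is an assumption of this sketch.

```typescript
// Hypothetical helper: the class and method names are illustrative,
// not the code added by the patch.
class CompletionWaiters {
  private readonly pending = new Map<string, () => void>();
  private readonly alreadyCompleted = new Set<string>();

  /** Resolves true once completion was observed, false if the timeout fired first. */
  wait(messageId: string, timeoutMs = 5000): Promise<boolean> {
    // The completion event may have been handled before the caller started waiting.
    if (this.alreadyCompleted.delete(messageId)) return Promise.resolve(true);

    return new Promise<boolean>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(messageId);
        resolve(false); // fall through instead of deadlocking the turn
      }, timeoutMs);

      this.pending.set(messageId, () => {
        clearTimeout(timer);
        this.pending.delete(messageId);
        resolve(true);
      });
    });
  }

  /** Called by the event handler when a completed message.updated event is processed. */
  complete(messageId: string): void {
    const resolve = this.pending.get(messageId);
    if (resolve) resolve();
    else this.alreadyCompleted.add(messageId);
  }
}
```

In this sketch `prompt()` would call `wait(messageId)` after the SDK call resolves, and the event handler would call `complete(messageId)`. Ids recorded in `alreadyCompleted` are never evicted here, so a real implementation would need some cleanup policy.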
Repro:
Send a streaming prompt via ACP. Inspect the wire (DevTools → WS
Messages). Observe that the final `agent_message_chunk` (the
agent's last text delta) arrives 5–50ms AFTER the RPC reply with
`stopReason: end_turn` and matching `id`.
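The same check can be done mechanically on a captured trace. The sketch below counts `agent_message_chunk` frames that appear after the `end_turn` reply for a given request id. The `WireFrame` shape is a simplified assumption about the parsed JSON frames, and `trailingChunks` is a hypothetical helper name; session ids are ignored for brevity.

```typescript
// Simplified, assumed frame shapes: real captured frames carry more fields.
type WireFrame =
  | { id: number; result: { stopReason: string } }
  | { method: string; params: { update: { sessionUpdate: string } } };

// Hypothetical helper: counts chunk frames that arrive after the end_turn reply.
function trailingChunks(frames: WireFrame[], promptId: number): number {
  const replyIndex = frames.findIndex(
    (f) => "id" in f && f.id === promptId && f.result.stopReason === "end_turn",
  );
  if (replyIndex === -1) return 0; // no reply captured, nothing to compare against

  return frames
    .slice(replyIndex + 1)
    .filter(
      (f) =>
        "method" in f &&
        f.params.update.sessionUpdate === "agent_message_chunk",
    ).length;
}
```

A non-zero result for a turn corresponds to the violation described in the repro; zero means the reply was the last relevant frame in the capture.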
Affects every ACP client that gates UI / input on end_turn (e.g.
disables streaming indicator, re-enables send button) — they snap to
"done" prematurely while text is still being appended.
@kitlangton, would you have a moment to look? This is a re-open of #25422 (auto-closed by the bot 2h after open because the description didn't follow the PR template; same branch / same commit, just retemplated). The fix is small (+53/−0, no deletions) and verified end-to-end on a real ACP client: rebuilt the binary, captured WS frames, confirmed Happy to add a regression test using the existing
anomalyco/opencode#25683 (acp drain end_turn), #25680 (hashline schema+args), openai/codex#20912 (synchronize agent control tools).
Issue for this PR
Closes #25421
Type of change
What does this PR do?
`Agent.prompt()` returns `stopReason: "end_turn"` as soon as `sdk.session.prompt()` resolves. But trailing `message.part.delta` events for the assistant's final text deltas are still in the SDK event stream at that moment. They get processed by `runEventSubscription` and forwarded as `agent_message_chunk` to the ACP connection AFTER the prompt RPC reply has been written. On the wire you see chunks landing 5–50ms after `id:N result:end_turn`.
Why the fix works:
`runEventSubscription` is a sequential `for await` that awaits each `handleEvent` call (which awaits `connection.sessionUpdate`). The `message.updated` event with `info.time.completed` set is emitted AFTER all `message.part.delta` events for the same message. So if `prompt()` waits for that completion event before returning, every prior chunk has already been flushed to the ACP connection.
The patch adds a per-messageId waiter resolved by a new `case "message.updated"` handler, and awaits it after `sdk.session.prompt(...)` resolves in both non-compact branches of `prompt()`. A 5s timeout falls through if the completion event never arrives, so a dropped event can't deadlock the turn.
The compact path doesn't carry an assistant message id, so it is left untouched.
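The ordering argument can be checked with a small model of the fixed flow. This is a sketch under assumptions, not the patched code: `SdkEvent`, `fakeEvents`, and `runFixedTurn` are hypothetical, the event shapes are simplified, and the timeout fallback is left out to keep the focus on ordering.

```typescript
// Simplified, assumed event shapes; the real SDK events carry more fields.
type SdkEvent =
  | { type: "message.part.delta"; messageId: string; text: string }
  | { type: "message.updated"; messageId: string; completed: boolean };

// Hypothetical event source: deltas first, then the completion event.
async function* fakeEvents(): AsyncGenerator<SdkEvent> {
  yield { type: "message.part.delta", messageId: "m1", text: "Hel" };
  yield { type: "message.part.delta", messageId: "m1", text: "lo" };
  yield { type: "message.updated", messageId: "m1", completed: true };
}

async function runFixedTurn(): Promise<string[]> {
  const wire: string[] = [];
  let markCompleted: () => void = () => {};
  const completed = new Promise<void>((resolve) => {
    markCompleted = resolve;
  });

  // Stand-in for connection.sessionUpdate: an awaited, slow write.
  const sessionUpdate = async (text: string): Promise<void> => {
    await new Promise<void>((resolve) => setTimeout(resolve, 5));
    wire.push(`chunk:${text}`);
  };

  // Stand-in for runEventSubscription: sequential, awaits each handler
  // before pulling the next event.
  const subscription = (async () => {
    for await (const event of fakeEvents()) {
      if (event.type === "message.part.delta") await sessionUpdate(event.text);
      else if (event.completed) markCompleted();
    }
  })();

  // Stand-in for prompt(): wait for the completion event before replying.
  await completed;
  wire.push("reply:end_turn");

  await subscription;
  return wire;
}
```

Because the loop does not pull the completion event until both chunk writes have finished, the reply is always the last entry in `wire`, regardless of how slow `sessionUpdate` is.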
How did you verify your code works?
- `bun run typecheck` clean
- `test/acp/event-subscription.test.ts` (11 tests) all pass, no regressions
- End-to-end on a real ACP client (`tn-claw` sidecar): rebuilt the binary, captured WS frames in DevTools, confirmed `end_turn` is now the last `ch:acp` frame for the turn (3s quiet window observed empty)
Pre-fix wire trace:
Post-fix: the trailing chunk above is gone; `end_turn` is the final frame.
Happy to add a unit regression test using the existing `createFakeAgent` harness if reviewers want.
Screenshots / recordings
N/A — backend protocol fix. Wire trace included above.
Checklist