Skip to content

fix(acp): drain message events before returning end_turn#25683

Open
truenorth-lj wants to merge 1 commit intoanomalyco:devfrom
truenorth-lj:fix-acp-end-turn-race
Open

fix(acp): drain message events before returning end_turn#25683
truenorth-lj wants to merge 1 commit intoanomalyco:devfrom
truenorth-lj:fix-acp-end-turn-race

Conversation

@truenorth-lj
Copy link
Copy Markdown

Issue for this PR

Closes #25421

Type of change

  • Bug fix
  • New feature
  • Refactor / code improvement
  • Documentation

What does this PR do?

Agent.prompt() returns stopReason: "end_turn" as soon as sdk.session.prompt() resolves. But trailing message.part.delta events for the assistant's final text deltas are still in the SDK event stream at that moment — they get processed by runEventSubscription and forwarded as agent_message_chunk to the ACP connection AFTER the prompt RPC reply has been written. On the wire you see chunks landing 5–50ms after id:N result:end_turn.

Why the fix works: runEventSubscription is a sequential for await that awaits each handleEvent call (which awaits connection.sessionUpdate). The message.updated event with info.time.completed set is emitted AFTER all message.part.delta events for the same message. So if prompt() waits for that completion event before returning, every prior chunk has already been flushed to the ACP connection.

The patch adds a per-messageId waiter resolved by a new case "message.updated" handler, and awaits it after sdk.session.prompt(...) resolves in both non-compact branches of prompt(). A 5s timeout falls through if the completion event never arrives, so a dropped event can't deadlock the turn.

The compact path doesn't carry an assistant message id, left untouched.

How did you verify your code works?

  • bun run typecheck clean
  • Existing test/acp/event-subscription.test.ts (11 tests) all pass — no regressions
  • End-to-end on a real ACP client (TrueNorth's tn-claw sidecar): rebuilt the binary, captured WS frames in DevTools, confirmed end_turn is now the last ch:acp frame for the turn (3s quiet window observed empty)

Pre-fix wire trace:

... agent_message_chunk frames ...
{"id":2,"result":{"stopReason":"end_turn",...}}
{"method":"session/update","params":{...,"update":{"sessionUpdate":"agent_message_chunk","content":{"text":"\n\nDone. Created `/path/to/file`"}}}}   <-- arrives 10ms after end_turn

Post-fix: the trailing chunk above is gone — end_turn is the final frame.

Happy to add a unit regression test using the existing createFakeAgent harness if reviewers want.

Screenshots / recordings

N/A — backend protocol fix. Wire trace included above.

Checklist

  • I have tested my changes locally
  • I have not included unrelated changes in this PR

`Agent.prompt()` returns `stopReason: "end_turn"` as soon as
`sdk.session.prompt()` resolves, but `message.part.delta` events for the
final assistant message text are still queued in the SDK event stream at
that moment. They get processed by `runEventSubscription` and forwarded
to ACP as `agent_message_chunk` frames AFTER the RPC reply has already
been sent — a protocol violation visible to ACP clients as text
appearing post-end_turn.

Cause: two independent async paths share the same ACP wire.

  - Path A — event subscription (`runEventSubscription`): consumes
    `sdk.global.event()` and forwards `message.part.delta` as
    `agent_message_chunk` via `connection.sessionUpdate(...)`.
  - Path B — prompt RPC: `await sdk.session.prompt(...)` resolves when
    the LLM finishes, then immediately returns `end_turn`.

Path B can return before Path A drains the trailing deltas. Order on the
wire is then: ... earlier chunks ... → end_turn reply → trailing chunk.

Fix: in `prompt()`, after `sdk.session.prompt()` resolves, await the
`message.updated` event for the response message id (i.e. `info.time.completed`
set). Because `runEventSubscription` processes events sequentially via
`for await` and awaits each `handleEvent`, the `message.updated`
(completed) event for a message is necessarily processed AFTER all
prior `message.part.delta` events for the same message — so waiting on
it guarantees every chunk has already been forwarded.

A 5s timeout fallback prevents deadlock if the upstream completion
event is never observed.

Repro:

  Send a streaming prompt via ACP. Inspect the wire (DevTools → WS
  Messages). Observe that the final `agent_message_chunk` (the
  agent's last text delta) arrives 5–50ms AFTER the RPC reply with
  `stopReason: end_turn` and matching `id`.

Affects every ACP client that gates UI / input on end_turn (e.g.
disables streaming indicator, re-enables send button) — they snap to
"done" prematurely while text is still being appended.
@truenorth-lj
Copy link
Copy Markdown
Author

@kitlangton — would you have a moment to look? This is a re-open of #25422 (auto-closed by the bot 2h after open because the description didn't follow the PR template; same branch / same commit, just retemplated).

The fix is small (+53/−0, no deletions) and verified end-to-end on a real ACP client: rebuilt the binary, captured WS frames, confirmed end_turn is now the last ch:acp frame for the turn (3s quiet window observed empty post-fix; pre-fix had agent_message_chunk arriving 5–50ms after the RPC reply).

Happy to add a regression test using the existing createFakeAgent harness in test/acp/event-subscription.test.ts if you'd like one before merge — left it out to keep the diff focused, but the harness already supports event injection so it's straightforward.

Bojun-Vvibe added a commit to Bojun-Vvibe/oss-contributions that referenced this pull request May 4, 2026
anomalyco/opencode#25683 (acp drain end_turn), #25680 (hashline schema+args), openai/codex#20912 (synchronize agent control tools).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ACP agent_message_chunk frames land after end_turn RPC reply due to event-subscription / prompt-RPC race

1 participant