
SDK sendAndWait hangs indefinitely in tool-call loops once cumulative tool-result payload crosses a threshold #2911

@affandar

Description

Summary

When we use @github/copilot-sdk to drive a sequential tool-call loop, the SDK
silently hangs after a few tool iterations once tool-result payloads grow
past a small threshold. The wedge is inside the model call:
assistant.turn_start fires, then no further events ever arrive. There is
no assistant.message, no assistant.turn_end, and no session.error.

The timeoutMs parameter to session.sendAndWait(opts, timeoutMs) is
ignored while wedged. The only recovery is killing the process or
applying an outer AbortController / Promise.race timeout.

This is reproducible with a ~120-line script using only @github/copilot-sdk:
no other dependencies, no dehydration / resume, no real tools.

Affected

  • @github/copilot-sdk (latest, as bundled with current Copilot CLI)
  • Models confirmed: claude-sonnet-4.6, claude-haiku-4.5, gpt-5.4
  • Both claude-opus-4.7 (older) and current claude-opus-4.6 series
  • Node v24.14.0 on macOS 14, also reproduces in containerized Linux

Claude is more sensitive (smaller threshold), but GPT wedges with identical
symptoms once payload size × turn count grows large enough. The failure is
not model-specific.

Threshold matrix (clean repro, single fresh session, sequential tool loop)

Model               500 B per call     2 KB per call       5 KB per call
claude-sonnet-4.6   ✅ 20/20 in 58 s   ❌ Hung at call 12   ❌ Hung at call 4
gpt-5.4             ✅ 20/20 in 81 s   ✅ 20/20 in 169 s    ❌ Hung at call 9

N=20 shards, a single tool; the model is told to call shard=1..N and then return the sum.
The tool returns in <1 ms with a synthetic JSON payload { shard, value, padding: "x".repeat(...) }.
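
Under that setup, the cumulative tool-result payload at each hang point works out to roughly per-call bytes × calls completed (our arithmetic from the matrix above, not SDK output):

```javascript
// Approximate cumulative tool-result bytes in the session at the point it wedged.
const cumulativeBytes = (perCallBytes, callsCompleted) => perCallBytes * callsCompleted;

console.log(cumulativeBytes(2000, 12)); // sonnet at 2 KB per call: 24000 bytes
console.log(cumulativeBytes(5000, 4));  // sonnet at 5 KB per call: 20000 bytes
console.log(cumulativeBytes(5000, 9));  // gpt at 5 KB per call:    45000 bytes
```

The per-model hang points land in the same few-tens-of-KB range, which is what suggests a cumulative threshold rather than a per-call one.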

When wedged, [HANG] elapsed=473916ms was observed while the request was still
hung: the SDK held the request open for roughly 8 minutes despite the 180 s
sendAndWait timeout parameter.

Symptom (event trace, anonymized)

[HH:MM:SS] >>> turn_start #4
[HH:MM:SS] msg len=0
[HH:MM:SS] tool#3 shard=3 value=52
[HH:MM:SS] <<< turn_end (chars total: 0)
[HH:MM:SS] >>> turn_start #5
                                    ← hangs here, indefinitely
                                    ← no assistant.message
                                    ← no assistant.turn_end
                                    ← no session.error
                                    ← sendAndWait timeout ignored
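
Until a fix lands, the hang can at least be detected from the outside by watching the gap between session events. A minimal sketch (the `makeWatchdog` helper and the 30 s threshold are ours, not @github/copilot-sdk API): every event resets a timer, and if no event arrives for `maxGapMs` we treat the turn as wedged.

```javascript
// Generic event-gap watchdog: invokes onStall() if no event is observed
// for maxGapMs. Illustrative helper, not part of @github/copilot-sdk.
function makeWatchdog(maxGapMs, onStall) {
  let timer = null;
  const arm = () => {
    clearTimeout(timer);
    timer = setTimeout(onStall, maxGapMs);
  };
  arm(); // start counting from creation
  return {
    tick: arm,                      // call on every session event
    stop: () => clearTimeout(timer), // call when the turn completes
  };
}

// Usage with the repro's session:
// const wd = makeWatchdog(30_000, () => console.error("[HANG] no events for 30 s"));
// session.on((ev) => { wd.tick(); /* existing event logging */ });
```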

Minimal reproduction (~50 lines, no extra deps)

import { CopilotClient, approveAll, defineTool } from "@github/copilot-sdk";

const N = 20;
const PAYLOAD = Number(process.env.PAYLOAD ?? 5000); // bytes per tool return
const MODEL = process.env.MODEL ?? "claude-sonnet-4.6";

const stepTool = defineTool("get_step_data", {
  description:
    `Fetch the next data shard. Call shard=1, then 2, ... until ${N}, ` +
    `then output ONE message with the sum. Don't summarize between calls.`,
  parameters: {
    type: "object",
    properties: { shard: { type: "integer" } },
    required: ["shard"],
  },
  handler: async ({ shard }) => JSON.stringify({
    shard,
    value: ((shard * 17) % 100) + 1,
    padding: "x".repeat(PAYLOAD - 60),
  }),
});

const client = new CopilotClient({
  autoStart: true,
  githubToken: process.env.GITHUB_TOKEN,
});
await client.start();

const session = await client.createSession({
  model: MODEL,
  tools: [stepTool],
  onPermissionRequest: approveAll,
  systemPrompt:
    "You are a sequential data fetcher. Call get_step_data with shard=1, " +
    "then shard=2, ..., then output ONE final message with the sum and stop.",
});

session.on((ev) => {
  const ts = new Date().toISOString().slice(11, 23);
  if (ev.type === "assistant.turn_start") console.log(`[${ts}] turn_start`);
  if (ev.type === "assistant.message") console.log(`[${ts}] msg ${ev.data?.content?.slice(0,60)}`);
  if (ev.type === "assistant.turn_end") console.log(`[${ts}] turn_end`);
});

// Outer timeout because sendAndWait's own timeout is ignored when wedged
const timeoutPromise = new Promise((_, reject) =>
  setTimeout(() => reject(new Error("HUNG")), 180_000));

try {
  const result = await Promise.race([
    session.sendAndWait({ prompt: `Fetch all ${N} shards and sum them.` }, 180_000),
    timeoutPromise,
  ]);
  console.log("OK:", result?.data?.content);
} catch (e) {
  console.log("HUNG:", e.message);
}

await client.stop();
process.exit(0);

Save as loop.mjs, run MODEL=claude-sonnet-4.6 GITHUB_TOKEN=ghp_... node loop.mjs.

To sweep:

for MODEL in claude-sonnet-4.6 gpt-5.4; do
  for PAYLOAD in 500 2000 5000; do
    echo "==== $MODEL payload=${PAYLOAD}B ===="
    MODEL=$MODEL PAYLOAD=$PAYLOAD node loop.mjs
  done
done

Expected behavior

  1. The model streams an assistant.message and completes the turn, or
  2. the SDK surfaces an error event and the sendAndWait promise rejects within
    the timeoutMs the caller passed.

Actual behavior

Neither happens. The SDK silently waits forever for an upstream stream that
never emits. The timeoutMs argument is not enforced when the wedge is
inside the provider stream.

What we ruled out

  • Single large tool result. A one-shot test injecting 40 KB / 100 KB /
    250 KB / 500 KB base64 tool returns into a fresh session completes
    cleanly in 13–22 s on claude-opus-4.7. The hang requires a loop of
    moderately-sized returns.
  • Conversation history size alone. Resuming a 326-event / 1.8 MB
    events.jsonl session with a tiny prompt completes in 4 s.
  • Tool handler latency. Repro tool returns in <1 ms.
  • Model family. GPT-5.4 wedges at the same symptom, just at higher
    payload thresholds than Claude.

Likely culprits (from the outside)

  • Provider streaming adapter not enforcing read deadlines on the upstream
    SSE / chunked response.
  • A specific tool-use repeat pattern that produces an "empty turn" the
    CLI keeps waiting on.
  • Cumulative tool-result payload pushing the request close to a
    provider-side limit that returns no error and no terminator.
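
For the first culprit, the guard we would expect is an inter-chunk read deadline on the upstream stream. A hedged sketch of the idea (our code, not the SDK's actual streaming adapter): race each chunk read against a timer, and fail the turn instead of waiting forever.

```javascript
// Sketch: consume an async iterable of stream chunks, but reject if any
// single read takes longer than deadlineMs. Illustrative only; the SDK's
// real provider adapter is not public.
async function* withReadDeadline(chunks, deadlineMs) {
  const it = chunks[Symbol.asyncIterator]();
  while (true) {
    let timer;
    const deadline = new Promise((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`stream read exceeded ${deadlineMs} ms`)),
        deadlineMs,
      );
    });
    try {
      const { value, done } = await Promise.race([it.next(), deadline]);
      if (done) return;
      yield value;
    } finally {
      clearTimeout(timer); // don't leak the timer when the read wins the race
    }
  }
}
```

If the SDK wrapped its provider stream this way, the wedge would surface as a session.error instead of an indefinite wait.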

What would help us

  1. An enforced internal read deadline on the upstream model stream that
    surfaces a session.error instead of waiting forever.
  2. The timeoutMs parameter to sendAndWait actually unsticking the
    request (it currently doesn't).
  3. A way to pass an AbortSignal into sendAndWait so callers can cancel.
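
The third item can be approximated from the outside today. A minimal sketch (the `waitWithSignal` helper is ours, not SDK API) that lets a caller abandon the wait via an AbortSignal; note this only unblocks the caller, it does not cancel the underlying request:

```javascript
// Race any in-flight promise against an AbortSignal so the caller can
// stop waiting. The wrapped promise keeps running; only the wait ends.
function waitWithSignal(promise, signal) {
  if (signal.aborted) return Promise.reject(new Error("aborted"));
  return new Promise((resolve, reject) => {
    const onAbort = () => reject(new Error("aborted"));
    signal.addEventListener("abort", onAbort, { once: true });
    promise.then(
      (v) => { signal.removeEventListener("abort", onAbort); resolve(v); },
      (e) => { signal.removeEventListener("abort", onAbort); reject(e); },
    );
  });
}

// Usage with the repro's session:
// const ac = new AbortController();
// setTimeout(() => ac.abort(), 180_000);
// await waitWithSignal(session.sendAndWait({ prompt }), ac.signal);
```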

Workarounds we use

  • Wrap every sendAndWait in Promise.race(sendAndWait, externalTimeout)
    so we at least know it hung.
  • Forcibly kill the worker process to release the request and let our
    durable-execution layer redeliver the turn.
  • Spill large tool returns to blob storage and pass back a small pointer,
    to keep cumulative tool-result payload below the wedge threshold.
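
The last workaround is easy to apply at the tool boundary. A minimal sketch, with an in-memory Map standing in for real blob storage (all names here are ours, not SDK API): results over a size cap are replaced by a small pointer the model can later dereference.

```javascript
// In-memory stand-in for blob storage; a real deployment would persist
// the payload and hand back a retrieval key.
const blobs = new Map();
const store = (payload) => {
  const key = `blob-${blobs.size + 1}`;
  blobs.set(key, payload);
  return key;
};

// Wrap a tool handler so oversized string results are spilled to storage
// and replaced with a compact pointer object.
function spillLargeResults(handler, maxBytes = 1024) {
  return async (args) => {
    const result = await handler(args);
    if (Buffer.byteLength(result, "utf8") <= maxBytes) return result;
    return JSON.stringify({ spilled: true, key: store(result) });
  };
}

// Usage: defineTool("get_step_data", { ..., handler: spillLargeResults(realHandler) })
```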

Metadata

Labels

  area:tools (Built-in tools: file editing, shell, search, LSP, git, and tool call behavior)
