Summary
Using @github/copilot-sdk to drive a sequential tool-call loop, the SDK
silently hangs after a few tool iterations once tool-result payloads grow
past a small threshold. The hang sits inside the model call:
assistant.turn_start fires, then no further events ever arrive —
no assistant.message, no assistant.turn_end, no session.error.
The timeoutMs parameter to session.sendAndWait(opts, timeoutMs) is
ignored while wedged. The only recovery is killing the process or
applying an outer AbortController / Promise.race timeout.
This is reproducible with a ~50-line script using only @github/copilot-sdk,
no other dependencies, no dehydration / resume, and no real tools.
Affected
- @github/copilot-sdk (latest, as bundled with current Copilot CLI)
- Models confirmed: claude-sonnet-4.6, claude-haiku-4.5, gpt-5.4
- Both claude-opus-4.7 (older) and the current claude-opus-4.6 series
- Node v24.14.0 on macOS 14; also reproduces in containerized Linux
Claude is more sensitive (smaller threshold) but GPT wedges identically
once payloads × turn count get larger. Not model-specific.
Threshold matrix (clean repro, single fresh session, sequential tool loop)
| Model | 500 B per call | 2 KB per call | 5 KB per call |
| --- | --- | --- | --- |
| claude-sonnet-4.6 | ✅ 20/20 in 58 s | ❌ Hung at call 12 | ❌ Hung at call 4 |
| gpt-5.4 | ✅ 20/20 in 81 s | ✅ 20/20 in 169 s | ❌ Hung at call 9 |
N=20 shards, single tool, model is told to call shard=1..N then return the sum.
Tool returns in <1 ms with a synthetic JSON { shard, value, padding: "x".repeat(...) }.
When wedged: [HANG] elapsed=473916ms was observed while still hung — i.e. the
SDK held the request open ~8 minutes despite a 180 s sendAndWait timeout
parameter.
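Because the per-shard values are deterministic, the answer the model should produce can be checked independently of the SDK. A minimal sketch mirroring the repro tool's `((shard * 17) % 100) + 1` formula (the helper names are ours, for illustration):

```javascript
// Mirrors the repro tool's deterministic per-shard value.
const shardValue = (shard) => ((shard * 17) % 100) + 1;

// The sum the model should report after calling shard=1..N.
const expectedSum = (n) =>
  Array.from({ length: n }, (_, i) => shardValue(i + 1))
    .reduce((a, b) => a + b, 0);

console.log(shardValue(3));   // → 52, matching the trace line "tool#3 shard=3 value=52"
console.log(expectedSum(20)); // → 890
```

This makes it easy to tell a genuine completion ("the sum is 890") from a truncated or confabulated one when a run does finish.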
Symptom (event trace, anonymized)
```
[HH:MM:SS] >>> turn_start #4
[HH:MM:SS] msg len=0
[HH:MM:SS] tool#3 shard=3 value=52
[HH:MM:SS] <<< turn_end (chars total: 0)
[HH:MM:SS] >>> turn_start #5
← hangs here, indefinitely
← no assistant.message
← no assistant.turn_end
← no session.error
← sendAndWait timeout ignored
```
Minimal reproduction (~50 lines, no extra deps)
```js
import { CopilotClient, approveAll, defineTool } from "@github/copilot-sdk";

// MODEL and PAYLOAD are read from the environment so the sweep below works.
const N = 20;
const PAYLOAD = Number(process.env.PAYLOAD ?? 5000); // bytes per tool return
const MODEL = process.env.MODEL ?? "claude-sonnet-4.6";

const stepTool = defineTool("get_step_data", {
  description:
    `Fetch the next data shard. Call shard=1, then 2, ... until ${N}, ` +
    `then output ONE message with the sum. Don't summarize between calls.`,
  parameters: {
    type: "object",
    properties: { shard: { type: "integer" } },
    required: ["shard"],
  },
  // Returns in <1 ms with a deterministic value plus ~PAYLOAD bytes of filler.
  handler: async ({ shard }) => JSON.stringify({
    shard,
    value: ((shard * 17) % 100) + 1,
    padding: "x".repeat(PAYLOAD - 60),
  }),
});

const client = new CopilotClient({
  autoStart: true,
  githubToken: process.env.GITHUB_TOKEN,
});
await client.start();

const session = await client.createSession({
  model: MODEL,
  tools: [stepTool],
  onPermissionRequest: approveAll,
  systemPrompt:
    "You are a sequential data fetcher. Call get_step_data with shard=1, " +
    "then shard=2, ..., then output ONE final message with the sum and stop.",
});

session.on((ev) => {
  const ts = new Date().toISOString().slice(11, 23);
  if (ev.type === "assistant.turn_start") console.log(`[${ts}] turn_start`);
  if (ev.type === "assistant.message") console.log(`[${ts}] msg ${ev.data?.content?.slice(0, 60)}`);
  if (ev.type === "assistant.turn_end") console.log(`[${ts}] turn_end`);
});

// Outer timeout because sendAndWait's own timeout is ignored when wedged.
const timeoutPromise = new Promise((_, reject) =>
  setTimeout(() => reject(new Error("HUNG")), 180_000));

try {
  const result = await Promise.race([
    session.sendAndWait({ prompt: `Fetch all ${N} shards and sum them.` }, 180_000),
    timeoutPromise,
  ]);
  console.log("OK:", result?.data?.content);
} catch (e) {
  console.log("HUNG:", e.message);
}

await client.stop();
process.exit(0);
```
Save as loop.mjs, run MODEL=claude-sonnet-4.6 GITHUB_TOKEN=ghp_... node loop.mjs.
To sweep:
```sh
for MODEL in claude-sonnet-4.6 gpt-5.4; do
  for PAYLOAD in 500 2000 5000; do
    echo "==== $MODEL payload=${PAYLOAD}B ===="
    MODEL=$MODEL PAYLOAD=$PAYLOAD node loop.mjs
  done
done
```
Expected behavior
- The model either streams an assistant.message / completes the turn, or
- the SDK surfaces an error event and the sendAndWait promise rejects within
  the timeoutMs parameter the caller passed.
Actual behavior
Neither happens. The SDK silently waits forever for an upstream stream that
never emits. The timeoutMs argument is not enforced when the wedge is
inside the provider stream.
What we ruled out
- Single large tool result. A one-shot test injecting 40 KB / 100 KB /
250 KB / 500 KB base64 tool returns into a fresh session completes
cleanly in 13–22 s on claude-opus-4.7. The hang requires a loop of
moderately-sized returns.
- Conversation history size alone. Resuming a 326-event / 1.8 MB
events.jsonl session with a tiny prompt completes in 4 s.
- Tool handler latency. Repro tool returns in <1 ms.
- Model family. GPT-5.4 wedges at the same symptom, just at higher
payload thresholds than Claude.
Likely culprits (from the outside)
- Provider streaming adapter not enforcing read deadlines on the upstream
SSE / chunked response.
- A specific tool-use repeat pattern that produces an "empty turn" the
CLI keeps waiting on.
- Cumulative tool-result payload pushing the request close to a
provider-side limit that returns no error and no terminator.
What would help us
- An enforced internal read deadline on the upstream model stream that
surfaces a session.error instead of waiting forever.
- The timeoutMs parameter to sendAndWait actually unsticking the request
  (it currently doesn't).
- A way to pass an AbortSignal into sendAndWait so callers can cancel.
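The read deadline we're asking for can be approximated from the outside. A hypothetical sketch (withIdleDeadline is our name, not an SDK API) that rejects whenever an async event stream goes quiet for longer than idleMs, which is the behavior we'd want the SDK to apply to the upstream provider stream and surface as a session.error:

```javascript
// Hypothetical sketch: wrap any async iterator so a gap longer than idleMs
// between events rejects instead of waiting forever.
async function* withIdleDeadline(stream, idleMs) {
  const it = stream[Symbol.asyncIterator]();
  while (true) {
    let timer;
    const deadline = new Promise((_, reject) => {
      timer = setTimeout(
        () => reject(new Error(`no event for ${idleMs}ms`)), idleMs);
    });
    try {
      const { value, done } = await Promise.race([it.next(), deadline]);
      if (done) return;
      yield value;
    } finally {
      clearTimeout(timer); // reset the deadline after every event
    }
  }
}
```

With this in place a wedged turn would fail fast with a distinguishable error instead of holding the request open indefinitely.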
Workarounds we use
- Wrap every sendAndWait in Promise.race(sendAndWait, externalTimeout)
  so we at least know it hung.
- Forcibly kill the worker process to release the request and let our
durable-execution layer redeliver the turn.
- Spill large tool returns to blob storage and pass back a small pointer,
to keep cumulative tool-result payload below the wedge threshold.
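The first workaround generalizes to a small helper; a sketch (the name and shape are ours, not SDK API):

```javascript
// Race any promise against a deadline so a wedged call at least fails fast
// on our side. Note this does NOT cancel the underlying request — the SDK
// currently offers no AbortSignal hook — so the worker may still need to
// be killed to actually release it.
function withTimeout(promise, ms, label = "HUNG") {
  let timer;
  const deadline = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(label)), ms);
  });
  return Promise.race([promise, deadline]).finally(() => clearTimeout(timer));
}

// e.g. const result = await withTimeout(session.sendAndWait(opts), 180_000);
```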