502 provider_unavailable errors from OpenRouter are not retried, causing subagent/session aborts #22448

@tim-mohrbach-ikigai

Description

When OpenRouter returns a 502 error (e.g., Network connection lost, provider_unavailable), OpenCode does not retry the request. Instead, the message is marked as errored and the entire step — including any in-progress tool calls — is aborted with Tool execution aborted / interrupted: true.

This is particularly destructive for subagent sessions: the subagent dies completely, all work in that step is lost, and the parent session may also lose its connection waiting for the task result.

What I've Noticed

The 502 error arrives as an UnknownError (not APIError) with a JSON body:

{"code":502,"message":"Network connection lost.","metadata":{"error_type":"provider_unavailable"}}

In packages/opencode/src/session/retry.ts, the retryable() function already has a check that looks intended to catch this case:

// Line ~62
if (code.includes("exhausted") || code.includes("unavailable")) {
  return "Provider is overloaded"
}

However, code is extracted as:

const code = typeof json.code === "string" ? json.code : ""

The issue: json.code is 502 (a number), not a string. So typeof json.code === "string" is false, and code becomes "". The code.includes("unavailable") check never matches.

Additionally, the string "unavailable" appears in json.metadata.error_type ("provider_unavailable"), which is never inspected.
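
To make the mismatch concrete, here is a minimal standalone sketch that runs the extraction line quoted above against the body from this issue. It is not code from retry.ts; the `observed` object is only there for illustration.

```typescript
// Body exactly as reported above.
const body =
  '{"code":502,"message":"Network connection lost.","metadata":{"error_type":"provider_unavailable"}}'
const json = JSON.parse(body)

// Same extraction as the line quoted from retry.ts:
// json.code is the number 502, so the string branch is skipped.
const code = typeof json.code === "string" ? json.code : ""

// Illustration only: collect what the check actually sees.
const observed = {
  typeOfCode: typeof json.code, // "number"
  code, // ""
  matchesUnavailable: code.includes("unavailable"), // false
  errorType: json.metadata.error_type, // "provider_unavailable", never consulted
}
console.log(observed)
```

The only place "unavailable" appears is `metadata.error_type`, so a check that looks solely at `code` cannot match this payload.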

Evidence

From my session database, I've logged 19 instances of 502 Network connection lost errors across April 9-14, affecting both anthropic/claude-sonnet-4.6 (16 times) and z-ai/glm-5.1 (3 times). None were retried.

Example error from the DB:

{
  "name": "UnknownError",
  "data": {
    "message": "{\"code\":502,\"message\":\"Network connection lost.\",\"metadata\":{\"error_type\":\"provider_unavailable\"}}"
  }
}

A second variant exists:

{
  "name": "UnknownError",
  "data": {
    "message": "{\"code\":502,\"message\":\"JSON error injected into SSE stream\",\"metadata\":{\"error_type\":\"provider_unavailable\"}}"
  }
}
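
Both variants carry the provider body as a JSON string inside data.message, so whatever inspects them has to parse that string first. A rough sketch of that step follows. The `StoredError` shape and the `parseProviderBody` helper are hypothetical, modelled on the records above rather than on OpenCode's actual types.

```typescript
// Hypothetical record shape, modelled on the examples in this issue.
interface StoredError {
  name: string
  data: { message: string }
}

// Hypothetical helper: returns the parsed provider body, or undefined when
// the message is not a JSON object (e.g. a plain-text error message).
function parseProviderBody(err: StoredError): Record<string, unknown> | undefined {
  try {
    const parsed: unknown = JSON.parse(err.data.message)
    if (typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>
    }
    return undefined
  } catch {
    return undefined
  }
}

const stored: StoredError = {
  name: "UnknownError",
  data: {
    message:
      '{"code":502,"message":"JSON error injected into SSE stream","metadata":{"error_type":"provider_unavailable"}}',
  },
}

const parsedBody = parseProviderBody(stored)
// A non-JSON message (made up for this example) falls through to undefined.
const plainText = parseProviderBody({ name: "UnknownError", data: { message: "socket hang up" } })
console.log(parsedBody?.code, plainText)
```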

Proposed Change

In packages/opencode/src/session/retry.ts, around line 62, the JSON code extraction could be updated to handle numeric codes and also check metadata.error_type:

// Current:
const code = typeof json.code === "string" ? json.code : ""

// Proposed:
const code = typeof json.code === "string"
  ? json.code
  : typeof json.code === "number"
    ? String(json.code)
    : ""
const errorType = typeof json.metadata?.error_type === "string" ? json.metadata.error_type : ""

And the retryable check could be expanded:

// Current:
if (code.includes("exhausted") || code.includes("unavailable")) {
  return "Provider is overloaded"
}

// Proposed:
if (code.includes("exhausted") || code.includes("unavailable") || errorType.includes("unavailable")) {
  return "Provider is overloaded"
}
if (["502", "503", "524"].includes(code)) {
  return "Provider temporarily unavailable"
}
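
For anyone who wants to try the two proposed changes together, here is a self-contained sketch. `classifyProviderError` is a hypothetical standalone function, not the real retryable() signature, and the second and third test payloads are made up to exercise the numeric-code path.

```typescript
// Hypothetical standalone classifier combining both proposed changes.
// Returns a retry reason, or undefined when the error should not be retried.
function classifyProviderError(body: string): string | undefined {
  let json: any
  try {
    json = JSON.parse(body)
  } catch {
    return undefined // not JSON, nothing to classify here
  }
  if (typeof json !== "object" || json === null) return undefined

  // Accept both string and numeric codes.
  const code =
    typeof json.code === "string" ? json.code : typeof json.code === "number" ? String(json.code) : ""
  // Also look at metadata.error_type, which carries "provider_unavailable".
  const errorType = typeof json.metadata?.error_type === "string" ? json.metadata.error_type : ""

  if (code.includes("exhausted") || code.includes("unavailable") || errorType.includes("unavailable")) {
    return "Provider is overloaded"
  }
  if (["502", "503", "524"].includes(code)) {
    return "Provider temporarily unavailable"
  }
  return undefined
}

// Payload from this issue.
const fromIssue = classifyProviderError(
  '{"code":502,"message":"Network connection lost.","metadata":{"error_type":"provider_unavailable"}}',
)
// Made-up payloads: a bare 502 without metadata, and a non-retryable 400.
const bare502 = classifyProviderError('{"code":502,"message":"Bad gateway"}')
const badRequest = classifyProviderError('{"code":400,"message":"Bad request"}')
console.log(fromIssue, bare502, badRequest)
```

With both checks in place, the logged payloads match on `metadata.error_type`, and a 502 that arrives without metadata is still caught by the numeric-code list.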

Impact

  • 502s are transient — OpenRouter typically recovers within seconds
  • Subagent sessions are killed entirely, wasting all accumulated context and tokens
  • The parent session may also abort, compounding the waste
  • A simple retry with backoff would recover most of these failures automatically
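
As a rough illustration of the "simple retry with backoff" in the last bullet, here is a generic sketch. It does not reflect how OpenCode's existing retry loop works. `withRetry`, `backoffDelay`, the 500 ms base, the 8 s cap, and the 4-attempt limit are all assumptions chosen for the example.

```typescript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at maxMs.
function backoffDelay(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt)
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms))

// Hypothetical wrapper: calls fn up to maxAttempts times in total, waiting
// with exponential backoff between attempts, and rethrows as soon as the
// error is not retryable or the attempts are used up.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 4,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn()
    } catch (error) {
      const lastAttempt = attempt + 1 >= maxAttempts
      if (lastAttempt || !isRetryable(error)) throw error
      await sleep(backoffDelay(attempt))
    }
  }
}
```

Since the 502s described here usually clear within seconds, even a short schedule like this (0.5 s, 1 s, 2 s) would likely cover most of them without aborting the step.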
