MCP remote client has no transport-level retry on socket/connection errors

## Problem

When a remote MCP server (type: "remote" with StreamableHTTPClientTransport) becomes temporarily unreachable (e.g. the server process restarts, the laptop suspends/resumes, or a kept-alive TCP connection goes stale), the MCP client has **no recovery mechanism**.

### What happens today

1. Server restarts → all in-memory sessions lost
2. Client sends a `tools/call` → the HTTP connection pool behind `@modelcontextprotocol/sdk`'s transport still holds a dead keep-alive connection
3. **Socket-level error** occurs (not even an HTTP 404) — the request never reaches the server
4. `client.callTool()` throws an error
5. In `packages/opencode/src/mcp/index.ts`, the `convertMcpTool` `execute` function has **no catch/retry logic** — the error propagates to `Effect.catch` which logs it and returns undefined
6. The MCP server is marked as `"failed"` and **never reconnects**

### Why server-side middleware can't fix this

- **Socket errors never reach the server.** The client's HTTP library fails before sending a request.
- Even if the request does reach the server (e.g. HTTP 404 for stale session), the server can't tell the client's internal SDK state about a new session ID — that state lives inside `@modelcontextprotocol/sdk`'s transport layer.

### Expected behavior

When a tool call fails due to a transport error, the MCP client should:
1. Detect that the connection is dead (socket error, ECONNRESET, etc.)
2. Close the old transport/client
3. Create a new transport and reconnect (re-initialize)
4. Retry the original tool call with the new session
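
The four steps above can be sketched generically as a reconnect-once wrapper. This is a minimal illustration, not OpenCode's or the SDK's API: the `Connection` interface, `callWithReconnect`, and the `reconnect` and `isTransportError` callbacks are all hypothetical names introduced here.

```typescript
// Hypothetical minimal shape of a connection that can be called and closed.
interface Connection<T> {
  call(): Promise<T>
  close(): Promise<void>
}

// Try the call once; on a transport-level failure, close the old connection,
// build a new one, and retry exactly once.
async function callWithReconnect<T>(
  current: Connection<T>,
  reconnect: () => Promise<Connection<T>>,
  isTransportError: (e: unknown) => boolean,
): Promise<T> {
  try {
    return await current.call()
  } catch (e) {
    // Step 1: only transport failures qualify; other errors propagate as-is.
    if (!isTransportError(e)) throw e
    // Step 2: best-effort close, since the old transport may already be dead.
    await current.close().catch(() => {})
    // Step 3: reconnect; if that fails, surface the original error.
    let fresh: Connection<T>
    try {
      fresh = await reconnect()
    } catch {
      throw e
    }
    // Step 4: retry once on the new connection; a second failure is not retried.
    return await fresh.call()
  }
}
```

Retrying only once keeps a genuinely down server from causing a retry loop; the second failure reaches the caller unchanged.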

### Suggested fix

In `convertMcpTool` (`packages/opencode/src/mcp/index.ts`), wrap the `execute` function with transport-level retry:

```typescript
execute: async (args: unknown) => {
  // Build the request once so the first attempt and the retry are identical.
  const request = { name: mcpTool.name, arguments: (args || {}) as Record<string, unknown> }
  const options = { resetTimeoutOnProgress: true, timeout }
  try {
    return await client.callTool(request, CallToolResultSchema, options)
  } catch (e) {
    // If this is a transport-level error, try reconnecting once.
    // isTransportError is a hypothetical helper that would need to be written.
    if (isTransportError(e) && clientKey && mcpConfig) {
      log.warn("MCP transport error, attempting reconnect", {
        clientKey,
        // `e` is `unknown` in a catch clause, so narrow before reading `.message`.
        error: e instanceof Error ? e.message : String(e),
      })
      try {
        // Best-effort close: the old transport may already be dead.
        await client.close().catch(() => {})
        const result = await createAndStore(clientKey, { ...mcpConfig, enabled: true })
        if (result.status === "connected" && state.clients[clientKey]) {
          return await state.clients[clientKey].callTool(request, CallToolResultSchema, options)
        }
      } catch (retryError) {
        log.error("MCP reconnect failed", { clientKey, error: retryError })
      }
    }
    // Reconnect was not attempted or did not succeed: surface the original error.
    throw e
  }
}
```
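
The snippet above relies on an `isTransportError` helper that does not exist yet. One possible sketch follows, assuming the runtime reports socket failures through an error `code` (such as `ECONNRESET`), possibly nested under `cause` the way Node's `fetch` wraps them in `TypeError: fetch failed`. The code list and the depth limit are assumptions; the exact error shape may differ under Bun or other runtimes and should be checked against real failures.

```typescript
// Error codes treated as "the connection is dead". This list is an assumption
// and would need tuning against errors observed in practice.
const TRANSPORT_ERROR_CODES = new Set([
  "ECONNRESET",
  "ECONNREFUSED",
  "EPIPE",
  "ETIMEDOUT",
  "UND_ERR_SOCKET",
])

// Walk the error and its `cause` chain looking for a known transport code.
function isTransportError(e: unknown): boolean {
  let current: unknown = e
  // The depth limit guards against cyclic or very long cause chains.
  for (let depth = 0; depth < 5; depth++) {
    if (typeof current !== "object" || current === null) return false
    const { code, cause } = current as { code?: unknown; cause?: unknown }
    if (typeof code === "string" && TRANSPORT_ERROR_CODES.has(code)) return true
    current = cause
  }
  return false
}
```

Matching on codes rather than message text avoids retrying on application-level failures (for example, a tool rejecting its arguments), which a reconnect would not fix.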

A simpler alternative: use the existing `connect` function to re-establish the connection, leveraging the transport fallback chain (StreamableHTTP → SSE).

### Environment

- OpenCode: latest (v1.x)
- MCP SDK: `@modelcontextprotocol/sdk`
- MCP Server: codesearch serve (streamable HTTP)
- OS: Windows (but issue applies to all platforms — any server restart triggers it)
