feat: classify connection failures into actionable tool errors (#266) #339

Merged: tianzhou merged 9 commits into main from feat/connection-error-classification-266 on Jun 23, 2026

Conversation

@tianzhou

Member

Summary

Addresses the observability ask in #266 (bkurinsky's StrongDM/ephemeral-access case), not by adding a status/probe tool, but by making the error that a real query already produces actionable.

Today every failure — bad SQL, connection refused, auth expired, dead SSH tunnel — collapses into one generic EXECUTION_ERROR with the raw driver message, so an agent can't tell "fix your SQL" from "the source is down / refresh access". This classifies connection/access failures into three distinct codes:

  • SOURCE_UNREACHABLE — socket failures (ECONNREFUSED, ETIMEDOUT, ENOTFOUND, EHOSTUNREACH, ENETUNREACH, ECONNRESET)
  • AUTH_FAILED — per-driver auth signals (pg 28P01/28000; mysql/mariadb ER_ACCESS_DENIED_ERROR/1045/1698; sqlserver ELOGIN)
  • TUNNEL_FAILED — SSH tunnel establishment failures

Each returns a templated, source-named, actionable message (e.g. "Source 'staging' is unreachable (connection refused or timed out). Verify the database is running and reachable…") plus details: { source_id }. Anything unrecognized falls through to the existing EXECUTION_ERROR/SEARCH_ERROR path unchanged — no regression.
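
As a rough illustration of the classification described above, the following sketch maps the listed socket codes and per-driver auth signals to the new error codes and returns null for everything else. It is not the code from this PR: the type names, the connector-type keys (`postgres`, `mysql`, `mariadb`, `sqlserver`), and the message wording are assumptions made for the example, and TUNNEL_FAILED handling is left out here.

```typescript
// Illustrative sketch only: types, connector-type keys, and message wording are assumptions.
type ConnectionErrorCode = "SOURCE_UNREACHABLE" | "AUTH_FAILED";

interface ClassifiedError {
  code: ConnectionErrorCode;
  message: string;
  details: { source_id: string };
}

// Socket-level codes named in the summary above.
const NETWORK_CODES = new Set<string>([
  "ECONNREFUSED", "ETIMEDOUT", "ENOTFOUND",
  "EHOSTUNREACH", "ENETUNREACH", "ECONNRESET",
]);

// Per-driver auth signals named in the summary above (string codes and numeric errnos).
const MYSQL_AUTH = new Set<string | number>(["ER_ACCESS_DENIED_ERROR", 1045, 1698]);
const AUTH_SIGNALS: Record<string, Set<string | number> | undefined> = {
  postgres: new Set<string | number>(["28P01", "28000"]),
  mysql: MYSQL_AUTH,
  mariadb: MYSQL_AUTH,
  sqlserver: new Set<string | number>(["ELOGIN"]),
};

function classifyConnectionError(
  error: unknown,
  connectorType: string,
  sourceId: string,
): ClassifiedError | null {
  if (typeof error !== "object" || error === null) return null;
  // Only structured fields are inspected; the message text is never parsed.
  const { code, errno } = error as { code?: unknown; errno?: unknown };
  const details = { source_id: sourceId };

  if (typeof code === "string" && NETWORK_CODES.has(code)) {
    return {
      code: "SOURCE_UNREACHABLE",
      message: `Source '${sourceId}' is unreachable. Verify the database is running and reachable.`,
      details,
    };
  }

  const authSignals = AUTH_SIGNALS[connectorType];
  const matchesAuth =
    authSignals !== undefined &&
    ((typeof code === "string" && authSignals.has(code)) ||
      (typeof errno === "number" && authSignals.has(errno)));
  if (matchesAuth) {
    return {
      code: "AUTH_FAILED",
      message: `Authentication failed for source '${sourceId}'. Check or refresh the credentials.`,
      details,
    };
  }

  // Anything unrecognized stays on the caller's generic error path.
  return null;
}
```

Returning null rather than a generic code keeps the classifier pure and lets each handler decide what its own fallback is (EXECUTION_ERROR or SEARCH_ERROR).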

How it works

  • src/utils/error-classifier.ts: a pure classifyConnectionError(error, connectorType, sourceId) that detects on error.code/errno only (never message text) and returns null for anything it doesn't recognize.
  • src/connectors/manager.ts: connectSource tags SSH-tunnel-establishment failures with a marker so they classify as TUNNEL_FAILED rather than a plain network error.
  • src/utils/tool-handler-helpers.ts: the tryClassifyConnectionError helper centralizes the raw-source-id (config lookup) vs display-source-id (message) contract. It resolves the connector type from the source config, so it works even when the connect itself failed before a connector instance exists (the lazy/revoked-source case).
  • Wired into all three tool handlers: execute_sql, search_objects, and custom tools (the custom handler classifies before appending its SQL: … debug context, since a down source isn't a SQL problem).
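
The marker precedence and the classify-before-SQL-suffix ordering from the list above might look roughly like the sketch below. The marker field name (`sshTunnelFailure`), the helper names, the message wording, and the `SQL:` suffix format are all hypothetical; the PR text does not say how the marker is represented.

```typescript
// Illustrative sketch only: the marker field, helper names, and messages are assumptions.
interface ToolError {
  code: "TUNNEL_FAILED" | "SOURCE_UNREACHABLE" | "EXECUTION_ERROR";
  message: string;
}

interface MaybeTagged {
  code?: unknown;
  message?: unknown;
  sshTunnelFailure?: unknown; // hypothetical marker name
}

const SOCKET_CODES = new Set<string>([
  "ECONNREFUSED", "ETIMEDOUT", "ENOTFOUND",
  "EHOSTUNREACH", "ENETUNREACH", "ECONNRESET",
]);

// Applied where tunnel establishment is attempted, so the failure keeps its origin
// even when the underlying error carries an ordinary socket code.
function tagTunnelFailure(error: unknown): Error {
  const tagged = error instanceof Error ? error : new Error(String(error));
  return Object.assign(tagged, { sshTunnelFailure: true });
}

function toToolError(error: unknown, sourceId: string, sql?: string): ToolError {
  const e = (typeof error === "object" && error !== null ? error : {}) as MaybeTagged;

  // The tunnel marker is checked first so it wins over a plain network code.
  if (e.sshTunnelFailure === true) {
    return {
      code: "TUNNEL_FAILED",
      message: `SSH tunnel for source '${sourceId}' could not be established.`,
    };
  }
  if (typeof e.code === "string" && SOCKET_CODES.has(e.code)) {
    return {
      code: "SOURCE_UNREACHABLE",
      message: `Source '${sourceId}' is unreachable.`,
    };
  }

  // Only unrecognized errors reach the generic path, so SQL context is appended here alone.
  const base = typeof e.message === "string" ? e.message : String(error);
  return { code: "EXECUTION_ERROR", message: sql ? `${base}\nSQL: ${sql}` : base };
}
```

Checking the marker before the socket codes matters because a failed tunnel can surface as ECONNREFUSED or ETIMEDOUT, which would otherwise be reported as a database reachability problem.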

Scope (intentionally minimal, per design doc)

No new MCP tool, no probe/status tool, no list_sources, no live reachability checking, no retry logic, no SQLite-specific handling. The companion context-reduction ask from #266 was handled separately in #338.

Test plan

  • pnpm test src/tools/__tests__ src/utils/__tests__ src/config — 702 unit tests pass
  • pnpm run build — clean
  • Unit coverage per category: network → SOURCE_UNREACHABLE, auth (code + errno) → AUTH_FAILED, tunnel marker precedence → TUNNEL_FAILED, unrecognized → null/generic fallback, single-source source_id: "default" contract, custom-tool classify-before-SQL-suffix
  • Integration tests (Testcontainers/Docker) not run in this environment

🤖 Generated with Claude Code

tianzhou and others added 7 commits June 24, 2026 00:59
)

Post-review hardening from the final code review:
- tryClassifyConnectionError wraps getSourceConfig so it never throws from
  within a caller's catch block (matches classifyConnectionError's totality).
- Add errno 1698 (ER_ACCESS_DENIED_NO_PASSWORD_ERROR) to the mysql/mariadb
  auth table — the most likely auth failure that previously slipped through
  to EXECUTION_ERROR.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
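
The first hardening point in the commit message above (a helper that never throws from inside a caller's catch block) can be sketched as follows. The signature is an assumption: the config lookup and the classifier are passed in as parameters so the example stays self-contained, which is not necessarily how tryClassifyConnectionError is written.

```typescript
// Illustrative sketch only: the signature is an assumption, with lookup and classifier injected.
type Classifier<T> = (error: unknown, connectorType: string, sourceId: string) => T | null;

function tryClassify<T>(
  error: unknown,
  sourceId: string,
  getConnectorType: (sourceId: string) => string,
  classify: Classifier<T>,
): T | null {
  try {
    return classify(error, getConnectorType(sourceId), sourceId);
  } catch {
    // A failing config lookup must not mask the original error;
    // returning null sends the caller to its generic fallback.
    return null;
  }
}
```

A caller would typically return the classified result when it is non-null and otherwise build its usual EXECUTION_ERROR or SEARCH_ERROR response from the original error.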
Copilot AI review requested due to automatic review settings June 23, 2026 17:30

Copilot AI left a comment

Contributor

Pull request overview

This PR improves DBHub tool error observability by classifying common connection/access failures into actionable, machine-readable tool error codes (SOURCE_UNREACHABLE, AUTH_FAILED, TUNNEL_FAILED) with templated, source-aware messages, while leaving all unrecognized errors on the existing generic paths.

Changes:

  • Added a pure classifyConnectionError() utility and a tryClassifyConnectionError() helper to map low-level connection/auth/tunnel failures into actionable tool errors.
  • Marked SSH tunnel establishment failures in ConnectorManager so they can be classified as TUNNEL_FAILED.
  • Wired classification into execute_sql, search_objects, and custom tool execution paths, with new unit tests covering the expected behaviors.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| `src/utils/tool-handler-helpers.ts` | Adds tryClassifyConnectionError() to centralize source config lookup + classification + tool error formatting. |
| `src/utils/error-classifier.ts` | Introduces error classification logic and error code/message generation. |
| `src/utils/__tests__/error-classifier.test.ts` | Unit tests for the classifier's network/auth/tunnel/null behaviors. |
| `src/connectors/manager.ts` | Tags SSH tunnel establishment errors with a marker for TUNNEL_FAILED classification. |
| `src/tools/execute-sql.ts` | Uses classification in the catch path before generic EXECUTION_ERROR. |
| `src/tools/search-objects.ts` | Uses classification in the catch path before generic SEARCH_ERROR. |
| `src/tools/custom-tool-handler.ts` | Classifies connection/access failures before augmenting errors with SQL/debug context. |
| `src/tools/__tests__/execute-sql.test.ts` | Adds tests for SOURCE_UNREACHABLE, fallback behavior, and single-source "default" display id. |
| `src/tools/__tests__/search-objects.test.ts` | Adds a test asserting AUTH_FAILED mapping for a connector login error. |
| `src/tools/__tests__/custom-tool-handler.test.ts` | Adds a test ensuring connection failures return SOURCE_UNREACHABLE and are not SQL-augmented. |

Comment thread src/utils/error-classifier.ts
tianzhou and others added 2 commits June 24, 2026 01:37
…#266)

The message hard-coded 'connection refused or timed out' but the classifier
also matches ENOTFOUND/EHOSTUNREACH/ENETUNREACH/ECONNRESET. Drop the
over-specific parenthetical; the remediation already names host/port/network.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
tianzhou merged commit 9f4bbca into main on Jun 23, 2026
2 checks passed
tianzhou deleted the feat/connection-error-classification-266 branch on June 23, 2026 17:44