fix(admin): use default HTTP port for by-host connections; don't crash tracing on empty trace#8121
Open
phacops wants to merge 3 commits into
Open
fix(admin): use default HTTP port for by-host connections; don't crash tracing on empty trace#8121phacops wants to merge 3 commits into
phacops wants to merge 3 commits into
Conversation
…n't crash tracing on empty trace output
Two snuba-admin bugs surface when the clickhouse-connect (HTTP) driver is
enabled (it is selected at runtime, behind the abstract ClickhousePool type):
1. By-host connections used cluster.get_http_port(). Admin always targets a
*specific* node by hostname, but the cluster's configured http_port belongs
to the cluster's query endpoint (which may sit behind a proxy/load
balancer), not to an arbitrary individual node. The native driver talks to
the node directly on its native port, so it was unaffected; the HTTP driver
needs the node's own ClickHouse HTTP listener, which is the well-known
default port. _build_validated_pool now builds the node with
DEFAULT_CLICKHOUSE_HTTP_PORT.
2. The tracing tool 500'd with {"message":"list index out of range"} on the
HTTP driver. clickhouse-connect does not surface the server's
send_logs_level output, so trace_output is always empty, and
summarize_trace_output indexed parsed[0] unconditionally. It now returns an
empty summary when there are no parseable trace lines.
Co-Authored-By: Claude Opus 4.8 <[email protected]>
Claude-Session: https://claude.ai/code/session_01Lzu1UmpvvA1S2Ypq2uqzEX
…le prefix match
The by-host port test cleared NODE_CONNECTIONS by matching keys starting with
"errors-", but the real keys are built from str(StorageKey), which falls back
to its repr ("StorageKey.ERRORS-..."), so the clear was a no-op and could leak
a mocked connection into other tests. Snapshot the cache, clear it, and restore
it afterward — independent of the key format, and guarantees the helper builds
a fresh node so the connection-cache call is actually exercised.
Co-Authored-By: Claude Opus 4.8 <[email protected]>
Claude-Session: https://claude.ai/code/session_01Lzu1UmpvvA1S2Ypq2uqzEX
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.
Reviewed by Cursor Bugbot for commit a3bfc0e. Configure here.
The previous commit forced DEFAULT_CLICKHOUSE_HTTP_PORT for every connection built by _build_validated_pool, but get_ro_query_node_connection (used by the tracing, querylog and cardinality tools) reaches the cluster's configured query endpoint — the same host the normal read path reaches on cluster.get_http_port(). That endpoint may be a load balancer on a non-default HTTP port, so forcing 8123 there would break those HTTP admin flows. Only override to the default port when the target is NOT the query node (i.e. a specific individual node selected by host in the admin tools). The query-node path keeps cluster.get_http_port(), matching the read path. Adds test_query_node_connection_uses_cluster_http_port as the counterpart to test_by_host_connection_uses_default_http_port. Co-Authored-By: Claude Opus 4.8 <[email protected]> Claude-Session: https://claude.ai/code/session_01Lzu1UmpvvA1S2Ypq2uqzEX
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.

Two snuba-admin bugs that surface once the clickhouse-connect (HTTP) driver is enabled. The driver is chosen at runtime (
use_clickhouse_connect_driver) inside the shared connection cache, so admin paths get the HTTP pool transparently — and these two assumptions, which were fine for the native driver, break.1. By-host connections must use the node's default HTTP port, not the cluster's
http_port_build_validated_poolbuilt the node withhttp_port=cluster.get_http_port(). But admin always targets a specific node by hostname (e.g. a replica discovered fromsystem.clusters). The cluster's configuredhttp_portbelongs to the cluster's query endpoint — which may sit behind a proxy/load balancer — not to an arbitrary individual node.The node is now built with
DEFAULT_CLICKHOUSE_HTTP_PORT. This mirrors the documented "by-host helper" pattern already used by the CLI helpers and the connection cache's own default.2. The tracing tool 500'd with
list index out of rangeon the HTTP driverclickhouse-connect doesn't surface the server's
send_logs_leveloutput (a known, documented limitation of that path), sotrace_outputis always empty.summarize_trace_outputindexedparsed[0]unconditionally, raisingIndexError— which the trace view's catch-all turned into:It now returns an empty
TracingSummarywhen there are no parseable trace lines, so the tracing tool returns its (empty) trace instead of erroring.Changes
snuba/admin/clickhouse/common.py—_build_validated_poolbuilds the by-host node with the default ClickHouse HTTP port instead ofcluster.get_http_port().snuba/admin/clickhouse/trace_log_parsing.py—summarize_trace_outputreturns an empty summary on empty/unparseable input.Tests
test_by_host_connection_uses_default_http_port(parametrized over all four by-host helpers) asserts the node passed to the connection cache carries the default HTTP port and not the cluster's configured port (verified with a sentinel port so the guard holds even when the test cluster uses the default).test_tracing_summary_empty_trace_outputassertssummarize_trace_output("")(and whitespace) returns an empty summary rather than raising.The trace-parsing change was verified locally; the existing
test_tracing_summarystill passes.ruff checkandruff format --checkpass on all changed files.Legal Boilerplate
Look, I get it. The entity doing business as "Sentry" was incorporated in the State of Delaware in 2015 as Functional Software, Inc. and is gonna need some rights from me in order to utilize my contributions in this here PR. So here's the deal: I retain all rights, title and interest in and to my contributions, and by keeping this boilerplate intact I confirm that Sentry can use, modify, copy, and redistribute my contributions, under Sentry's choice of terms.
🤖 Generated with Claude Code
https://claude.ai/code/session_01Lzu1UmpvvA1S2Ypq2uqzEX
Generated by Claude Code