fix(backend): drain Maestro REST client connection pool on teardown#5673

Draft
rhamitarora wants to merge 2 commits into Azure:main from rhamitarora:rhamitarora/http-client-closed

@rhamitarora

Contributor

newRESTClient clones http.DefaultTransport, creating a per-client connection pool. The orphan/cleanup controllers rebuild Maestro clients every sync cycle and tore them down via context cancellation, which only stopped the gRPC client and never drained the REST transport's idle connections. This leaked TCP connections and file descriptors over time.

Return the underlying *http.Client from newRESTClient and register CloseIdleConnections via context.AfterFunc so the existing cancellation-based teardown also drains the pool.

What

Why

Testing

Special notes for your reviewer

PR Checklist

  • PR is scoped to a single task (no mixed concerns)
  • Title follows Conventional Commits format
  • Summary explains the "Why" behind the change
  • Linked to relevant ticket/issue
  • Screenshots included (if graph/UI/metrics changes)
  • Self-reviewed the diff
  • CI/CD checks are passing (ignore Tide)
  • Draft PR used for WIP (if applicable)
  • Commit history is clean (rebased/squashed)
  • Tricky code blocks are commented
  • Specific reviewers tagged
  • All comment threads resolved before merge

@openshift-ci

openshift-ci Bot commented Jun 16, 2026

Hi @rhamitarora. Thanks for your PR.

I'm waiting for an Azure member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

SubscriptionCollector.Run created a time.Ticker but never stopped it.
The loop exits on ctx.Done(), leaving the ticker dangling. Add
defer t.Stop() to release it on shutdown, matching the pattern used
elsewhere in the codebase.
@openshift-ci

openshift-ci Bot commented Jun 16, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: rhamitarora
Once this PR has been reviewed and has the lgtm label, please assign mbarnes for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rhamitarora rhamitarora marked this pull request as ready for review June 17, 2026 05:45
Copilot AI review requested due to automatic review settings June 17, 2026 05:45
@openshift-ci openshift-ci Bot requested review from geoberle and janboll June 17, 2026 05:45

Copilot AI left a comment

Contributor

Pull request overview

Note

Copilot was unable to run its full agentic suite in this review.

This PR focuses on preventing resource leaks by ensuring long-lived timers and HTTP connection pools are properly cleaned up when no longer needed.

Changes:

  • Stop the time.Ticker in SubscriptionCollector.Run to avoid ticker leaks.
  • Update Maestro REST client construction to expose the underlying *http.Client.
  • Drain the REST client’s idle HTTP connections on context cancellation to avoid leaking TCP connections/FDs across repeated client creation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| frontend/pkg/metrics/metrics.go | Stops the ticker to prevent resource leakage in a long-running loop. |
| backend/pkg/maestro/maestro_client.go | Returns underlying HTTP client and closes idle connections on ctx cancellation to prevent connection/FD leaks. |

Comment on lines +74 to +79
restClient, restHTTPClient := newRESTClient(maestroRESTAPIEndpoint)
// The REST client uses a cloned http.Transport, which owns its own
// connection pool. Callers tear the client down by cancelling ctx, so we
// drain the pool's idle connections on cancellation to avoid leaking TCP
// connections and file descriptors across repeated client creations.
context.AfterFunc(ctx, restHTTPClient.CloseIdleConnections)
@rhamitarora rhamitarora requested a review from Copilot June 17, 2026 06:02

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (1)

backend/pkg/maestro/maestro_client.go:84

  • On the error path, the REST http.Client has already been created (and the AfterFunc registered), but the function returns immediately without actively draining idle connections. If callers don’t cancel ctx on NewClient failure, this can keep idle connections/FDs around longer than intended. Consider calling restHTTPClient.CloseIdleConnections() before returning the error (in addition to the ctx-cancellation hook).
	grpcClient, err := newGRPCSourceWorkClient(ctx, maestroGRPCAPIEndpoint, restClient, maestroSourceID)
	if err != nil {
		return nil, utils.TrackError(fmt.Errorf("failed to create maestro grpc source work client: %w", err))
	}

// connection pool. Callers tear the client down by cancelling ctx, so we
// drain the pool's idle connections on cancellation to avoid leaking TCP
// connections and file descriptors across repeated client creations.
context.AfterFunc(ctx, restHTTPClient.CloseIdleConnections)

Contributor Author

Thanks for flagging this. context.AfterFunc was added in Go 1.21, and this module targets Go 1.25.7 (see go 1.25.7 in backend/go.mod), so it's well within the supported toolchain and compiles fine — no compile-time break. The repo doesn't support Go < 1.21, so the goroutine fallback isn't needed here. Keeping context.AfterFunc as the idiomatic approach.

@rhamitarora rhamitarora requested a review from Copilot June 17, 2026 06:19

Copilot AI left a comment

Contributor

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

backend/pkg/maestro/maestro_client.go:84

  • If newGRPCSourceWorkClient(...) returns an error, NewClient returns immediately, but the restHTTPClient idle pool will only be drained when ctx is canceled. If any REST calls were made during GRPC client setup (or if this pattern is copied elsewhere), this can leave idle connections open longer than necessary. Consider capturing the stop func returned by context.AfterFunc and, on the error path, stopping the after-func and explicitly calling restHTTPClient.CloseIdleConnections() before returning.
	restClient, restHTTPClient := newRESTClient(maestroRESTAPIEndpoint)
	// The REST client uses a cloned http.Transport, which owns its own
	// connection pool. Callers tear the client down by cancelling ctx, so we
	// drain the pool's idle connections on cancellation to avoid leaking TCP
	// connections and file descriptors across repeated client creations.
	context.AfterFunc(ctx, restHTTPClient.CloseIdleConnections)

	grpcClient, err := newGRPCSourceWorkClient(ctx, maestroGRPCAPIEndpoint, restClient, maestroSourceID)
	if err != nil {
		return nil, utils.TrackError(fmt.Errorf("failed to create maestro grpc source work client: %w", err))
	}

Comment thread backend/pkg/maestro/maestro_client.go
Comment thread backend/pkg/maestro/maestro_client.go
@rhamitarora rhamitarora marked this pull request as draft June 17, 2026 06:20
2 participants