Close response body in SendEvent/HealthCheck to fix FD + memory leak (#15) #16

Open
michaeltoop wants to merge 1 commit into swarmpit:master from michaeltoop:fix/sendevent-response-body-leak

@michaeltoop

Fixes #15.

Problem

SendEvent (swarmpit/client.go) discards the http.Post response and never closes resp.Body:

_, err := http.Post(arg.EventEndpoint, "application/json; charset=utf-8", buffer)

Until a response body is drained and closed, the Transport cannot return the underlying TCP connection to the keep-alive pool, so each call opens a new connection and leaks it along with its file descriptor. SendEvent runs on every stats tick, so the agent's open-FD count and memory grow without bound.

In production (7-node Swarm, agent as a global service) this shows up as:

  • long-lived agents logging socket: too many open files — first on the event POST (Post http://swarmpit:8080/events), then collaterally on the docker.sock stats fetches — after which the agent stays alive but stops shipping events (UI graphs go blank);
  • busy-node agents climbing to their memory limit and being OOM-killed (exit 137, OOMKilled=true) on a roughly hourly cadence.

The ContainerUsage stats path already does defer resp.Body.Close(); SendEvent and HealthCheck were the call sites missing it.

Fix

Drain and close the body in SendEvent (and the matching http.Get in HealthCheck, which had the same omission):

resp, err := http.Post(...)
if err != nil {
    log.Printf("ERROR: Event sending failed: %s", err)
    return
}
io.Copy(ioutil.Discard, resp.Body)
resp.Body.Close()

ioutil.Discard is used to stay compatible with the module's go 1.12 (io.Discard landed in 1.16). The package builds cleanly with go build ./swarmpit/.

🤖 Generated with Claude Code

SendEvent discarded the http.Post response (`_, err := ...`) and never
closed resp.Body. An unclosed/undrained body keeps the underlying TCP
connection out of the keep-alive pool, leaking a connection + file
descriptor on every call. SendEvent runs on every stats tick, so the
agent's open-FD count and memory grow unbounded: long-lived agents hit
"socket: too many open files" (event POSTs, then docker.sock stats
fetches) and busy agents are OOM-killed (exit 137).

Drain + close the body in SendEvent (and the matching http.Get in
HealthCheck). Fixes swarmpit#15.

Co-Authored-By: Claude Opus 4.8 <[email protected]>
Development

Successfully merging this pull request may close these issues.

Agent leaks file descriptors + memory — SendEvent never closes the http.Post response body
