Mitigate memory spikes from event backlog and shell output #24525

Closed
zjy-dev wants to merge 1 commit into anomalyco:dev from zjy-dev:fix/memory-backlog-and-shell-output

@zjy-dev zjy-dev commented Apr 26, 2026

Summary

  • cap SSE and wildcard event backlogs so slow subscribers cannot retain unbounded memory
  • throttle duplicate and high-frequency tool metadata updates during streaming execution
  • spill large shell command output to disk and keep only a bounded in-memory preview

Problem

We have reports of OpenCode processes growing to several GB of RSS and getting OOM-killed during longer sessions.

The main amplification paths identified here were:

  • unbounded backlogs for slow event consumers
  • repeated streaming updates carrying large accumulated tool output
  • shell command output accumulating fully in memory

Changes

Event backlog protection

  • add bounded AsyncQueue support
  • cap both instance and global SSE queues at 256
  • disconnect slow SSE consumers when the queue overflows
  • change the wildcard instance bus from PubSub.unbounded to PubSub.sliding(1024)
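The overflow policy for SSE subscribers can be illustrated with a minimal sketch. This is not the AsyncQueue from the PR: the class name `BoundedAsyncQueue`, the boolean return from `push`, and the `publishAll` helper are assumptions made for illustration. Only the capacity of 256 and the "disconnect on overflow" behavior come from the description above.

```typescript
// Hypothetical sketch of a bounded queue; names and shape are assumptions,
// not the PR's actual AsyncQueue implementation.
class BoundedAsyncQueue<T> {
  private items: T[] = []
  private waiters: Array<(value: T) => void> = []
  private capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  // Returns false when the queue is full so the caller can drop the subscriber.
  push(item: T): boolean {
    const waiter = this.waiters.shift()
    if (waiter) {
      waiter(item) // a consumer is already waiting, hand the item over directly
      return true
    }
    if (this.items.length >= this.capacity) return false
    this.items.push(item)
    return true
  }

  // Resolves immediately if an item is buffered, otherwise waits for the next push.
  next(): Promise<T> {
    if (this.items.length > 0) return Promise.resolve(this.items.shift() as T)
    return new Promise<T>((resolve) => this.waiters.push(resolve))
  }

  get size(): number {
    return this.items.length
  }
}

// Simulate a subscriber that never reads: the producer stops at the first overflow.
function publishAll(
  queue: BoundedAsyncQueue<number>,
  count: number,
): { delivered: number; disconnected: boolean } {
  for (let i = 0; i < count; i++) {
    if (!queue.push(i)) return { delivered: i, disconnected: true }
  }
  return { delivered: count, disconnected: false }
}

const slow = new BoundedAsyncQueue<number>(256)
const result = publishAll(slow, 1000)
console.log(result, slow.size)
```

With this policy a stalled consumer costs at most `capacity` buffered events before it is dropped, instead of retaining every event published while it is stalled.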

Tool metadata reduction

  • add throttling and deduplication for ctx.metadata() updates in the shared tool execution path
  • avoid resetting the running tool start time on metadata refreshes
  • make metadata fingerprinting tolerant of non-serializable metadata
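As a rough sketch of how throttling and deduplication could be combined, the gate below decides whether a metadata update should be published. The names `fingerprint` and `createMetadataGate`, the JSON-based fingerprint, and the injectable clock are assumptions for illustration; the PR does not describe how `ctx.metadata()` implements this.

```typescript
type Metadata = Record<string, unknown>

// Fingerprint that never throws. JSON.stringify fails on circular references
// and BigInt values; in that case we return undefined and treat the update as changed.
function fingerprint(value: Metadata): string | undefined {
  try {
    return JSON.stringify(value)
  } catch {
    return undefined
  }
}

// Hypothetical gate: drops exact duplicates and anything arriving within intervalMs
// of the last published update. The clock is injectable to keep the demo deterministic.
function createMetadataGate(intervalMs: number, now: () => number = Date.now) {
  let lastSent = -Infinity
  let lastPrint: string | undefined
  return function shouldPublish(meta: Metadata): boolean {
    const print = fingerprint(meta)
    if (print !== undefined && print === lastPrint) return false // duplicate
    const t = now()
    if (t - lastSent < intervalMs) return false // throttled
    lastSent = t
    lastPrint = print
    return true
  }
}

// Demo with a fake clock.
let clock = 0
const gate = createMetadataGate(100, () => clock)
const first = gate({ output: "a" }) // first update is published
const duplicate = gate({ output: "a" }) // same content, dropped
clock = 50
const tooSoon = gate({ output: "ab" }) // changed, but inside the throttle window
clock = 150
const later = gate({ output: "ab" }) // changed and outside the window
const circular: Metadata = {}
circular.self = circular
clock = 300
const nonSerializable = gate(circular) // does not throw
console.log({ first, duplicate, tooSoon, later, nonSerializable })
```

A gate like this drops throttled updates outright, so a real implementation would likely also flush the latest pending metadata when the tool finishes.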

Shell output memory reduction

  • shell streaming updates now publish only a bounded preview
  • shell streaming updates are throttled
  • large shell output now spills to the truncation directory once it exceeds configured limits
  • after spill, only a bounded tail window remains in memory
  • completed shell tool output now references the saved file and includes a bounded preview instead of retaining the full output in memory
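The spill-to-disk behavior can be sketched roughly as follows. The class name `SpillingOutput`, the two limits, and the temporary directory are hypothetical; the PR's actual truncation directory and configured limits are not shown here. The sketch counts string length rather than bytes and assumes `tailSize` is greater than zero.

```typescript
import { appendFileSync, mkdtempSync, readFileSync } from "node:fs"
import { tmpdir } from "node:os"
import { join } from "node:path"

// Hypothetical accumulator: keeps output in memory until maxInMemory is exceeded,
// then writes everything to a file and retains only the last tailSize characters.
class SpillingOutput {
  private tail = ""
  private spilled = false
  private total = 0
  private file: string
  private maxInMemory: number
  private tailSize: number

  constructor(file: string, maxInMemory: number, tailSize: number) {
    this.file = file
    this.maxInMemory = maxInMemory
    this.tailSize = tailSize
  }

  append(chunk: string): void {
    this.total += chunk.length
    if (this.spilled) {
      appendFileSync(this.file, chunk) // already spilled: stream straight to disk
      this.tail = (this.tail + chunk).slice(-this.tailSize)
      return
    }
    this.tail += chunk
    if (this.tail.length > this.maxInMemory) {
      appendFileSync(this.file, this.tail) // first spill: flush everything seen so far
      this.tail = this.tail.slice(-this.tailSize)
      this.spilled = true
    }
  }

  // The completed result references the file and carries only a bounded preview.
  result(): { preview: string; file: string | undefined; total: number } {
    return { preview: this.tail, file: this.spilled ? this.file : undefined, total: this.total }
  }
}

// Demo: 100 lines of 100 characters each against a 1024-character in-memory limit.
const dir = mkdtempSync(join(tmpdir(), "spill-"))
const out = new SpillingOutput(join(dir, "output.txt"), 1024, 256)
for (let i = 0; i < 100; i++) out.append("x".repeat(99) + "\n")
const summary = out.result()
const onDisk = summary.file ? readFileSync(summary.file, "utf8").length : 0
console.log(summary.preview.length, summary.total, onDisk)
```

Once the limit is crossed, in-memory usage is bounded by the tail window plus the current chunk, regardless of how much the command prints.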

Validation

  • ran the package typecheck entrypoint locally; the repo still has unrelated pre-existing type errors outside this change set
  • verified bounded AsyncQueue behavior with runtime scripts
  • verified PubSub.sliding semantics with a runtime script ([2,3,4] retained from publishing 1,2,3,4 into capacity 3)
  • compared bounded vs unbounded queue memory usage with synthetic stress tests
  • simulated slow-consumer event backlog behavior and confirmed queue growth stays bounded
  • simulated large shell output and confirmed memory stays bounded after spill-to-disk
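The sliding check mentioned above (publishing 1,2,3,4 into capacity 3) can be reproduced without any library using a plain buffer that evicts its oldest entry when full. This is only a stand-in to illustrate the drop-oldest semantics; `SlidingBuffer` is a hypothetical name and this is not how `PubSub.sliding` is implemented.

```typescript
// Hypothetical drop-oldest buffer, used only to illustrate sliding semantics.
class SlidingBuffer<T> {
  private items: T[] = []
  private capacity: number

  constructor(capacity: number) {
    this.capacity = capacity
  }

  publish(item: T): void {
    // When full, evict the oldest item so the newest always fits.
    if (this.items.length === this.capacity) this.items.shift()
    this.items.push(item)
  }

  snapshot(): T[] {
    return [...this.items]
  }
}

const buffer = new SlidingBuffer<number>(3)
for (const n of [1, 2, 3, 4]) buffer.publish(n)
const retained = buffer.snapshot()
console.log(retained) // [ 2, 3, 4 ]
```

Unlike the SSE queue, which disconnects the consumer on overflow, a sliding policy keeps the subscriber attached and silently loses the oldest events, which is why it suits observational wildcard subscribers better than typed channels.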

Notes

  • typed bus channels were intentionally left unchanged in this PR to avoid altering stricter event-delivery semantics
  • this PR focuses on bounding memory growth in high-volume observational and event-streaming paths first

@github-actions
Contributor

Hey! Your PR title "Mitigate memory spikes from event backlog and shell output" doesn't follow conventional commit format.

Please update it to start with one of:

  • feat: or feat(scope): new feature
  • fix: or fix(scope): bug fix
  • docs: or docs(scope): documentation changes
  • chore: or chore(scope): maintenance tasks
  • refactor: or refactor(scope): code refactoring
  • test: or test(scope): adding or updating tests

Where scope is the package name (e.g., app, desktop, opencode).

See CONTRIBUTING.md for details.

@github-actions github-actions bot added the needs:compliance label (this means the issue will auto-close after 2 hours) on Apr 26, 2026

@github-actions
Contributor

This PR doesn't fully meet our contributing guidelines and PR template.

What needs to be fixed:

  • PR description is missing required template sections. Please use the PR template.

Please edit this PR description to address the above within 2 hours, or it will be automatically closed.

If you believe this was flagged incorrectly, please let a maintainer know.

@github-actions
Contributor

The following comment was made by an LLM, it may be inaccurate:

Found a potentially related PR:

#16346: fix(opencode): unbounded memory growth during active usage

This PR also addresses unbounded memory growth issues, which is the core problem that PR #24525 is solving. While the older PR may have tackled different aspects of memory management, it's worth checking if #16346 already addresses some of the same memory issues or if there's overlap in the solutions (event backlogs, shell output, or metadata updates).

@github-actions
Contributor

This pull request has been automatically closed because it was not updated to meet our contributing guidelines within the 2-hour window.

Feel free to open a new pull request that follows our guidelines.

@github-actions github-actions bot removed the needs:compliance label (this means the issue will auto-close after 2 hours) on Apr 26, 2026
@github-actions github-actions bot closed this on Apr 26, 2026