
Bump kelos-workers and kelos-pr-responder task memory to 1Gi#1163

Open
gjkim42 wants to merge 1 commit into main from bump-task-memory-1gi


Collaborator

@gjkim42 gjkim42 commented May 19, 2026

What type of PR is this?

/kind cleanup

What this PR does / why we need it:

Bumps the claude-code container memory request from 512Mi to 1Gi for the kelos-workers and kelos-pr-responder TaskSpawners.

Multiple recent kelos-workers-issue-comment-* tasks failed with pod evictions on the node. Inspecting the evicted pods showed claude-code routinely using ~1.2–1.3 GiB of memory while only requesting 512 MiB:

The node was low on resource: memory. Threshold quantity: 100Mi, available: 75424Ki.
Container claude-code was using 1360064Ki, request is 512Mi, has larger consumption of memory.
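As a quick sanity check on the numbers in the eviction message, the sketch below converts the reported quantities to bytes and compares usage against the old and new requests. The `parse_quantity` helper is hypothetical and only handles the binary suffixes (`Ki`, `Mi`, `Gi`) that appear in this PR; it is not a full Kubernetes quantity parser.

```python
import re

# Bytes per unit for the binary suffixes used in this PR.
_UNITS = {"Ki": 1024, "Mi": 1024**2, "Gi": 1024**3}


def parse_quantity(q: str) -> int:
    """Convert a quantity such as '1360064Ki' or '512Mi' to bytes.

    Hypothetical helper: supports only integer values with Ki/Mi/Gi.
    """
    m = re.fullmatch(r"(\d+)(Ki|Mi|Gi)", q)
    if m is None:
        raise ValueError(f"unsupported quantity: {q!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


usage = parse_quantity("1360064Ki")   # from the eviction message
old_request = parse_quantity("512Mi")  # request before this PR
new_request = parse_quantity("1Gi")    # request after this PR

usage_gib = usage / 1024**3
print(f"usage: {usage_gib:.2f} GiB")
print(f"usage / old request: {usage / old_request:.2f}x")
print(f"usage / new request: {usage / new_request:.2f}x")
```

The observed usage is still above the new 1 GiB request, which is consistent with the PR describing the change as reducing eviction risk rather than eliminating it.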

Because the pods are Burstable QoS and use far more memory than they request, they're among the first candidates for kubelet eviction under node memory pressure. Bumping the request to 1 GiB brings it closer to actual usage and reduces the eviction risk. Scope is limited to the two spawners running the heaviest claude-code workloads; the other spawners can be revisited if they start showing the same failure mode.

Which issue(s) this PR is related to:

N/A

Special notes for your reviewer:

Only requests.memory is bumped; no limit is set on memory (existing behavior preserved). ephemeral-storage and cpu are unchanged.
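To illustrate why the QoS class does not change with this PR, here is a simplified sketch of how Kubernetes assigns a QoS class from container resources. The `qos_class` function is a hypothetical illustration, not the kubelet's code: it compares quantities as plain strings and considers only `cpu` and `memory`. The `limits` value in the last call is an assumption for contrast and is not part of this PR.

```python
from typing import Dict, List

# One container's resources: {"requests": {...}, "limits": {...}}
Resources = Dict[str, Dict[str, str]]


def qos_class(containers: List[Resources]) -> str:
    """Simplified QoS classification over cpu and memory only."""
    has_any = False      # any cpu/memory request or limit set anywhere
    guaranteed = True    # every container has request == limit for both
    for c in containers:
        requests = c.get("requests", {})
        limits = c.get("limits", {})
        for res in ("cpu", "memory"):
            if res in requests or res in limits:
                has_any = True
            limit = limits.get(res)
            # An omitted request defaults to the limit when a limit is set.
            request = requests.get(res, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not has_any:
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


# Before and after this PR: a memory request with no memory limit.
print(qos_class([{"requests": {"memory": "512Mi"}}]))
print(qos_class([{"requests": {"memory": "1Gi"}}]))

# Hypothetical contrast (not in this PR): requests equal to limits.
print(qos_class([{
    "requests": {"cpu": "1", "memory": "1Gi"},
    "limits": {"cpu": "1", "memory": "1Gi"},
}]))
```

With only a request and no limit, the pod stays Burstable both before and after the change; the bump narrows the gap between request and usage rather than moving the pod to a different class.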

Does this PR introduce a user-facing change?

NONE

Summary by cubic

Increase claude-code memory request from 512Mi to 1Gi for kelos-workers and kelos-pr-responder to prevent pod evictions under memory pressure. Only requests.memory changed; CPU and ephemeral-storage are unchanged, and there is still no memory limit.

Written for commit 9ae35ec. Summary will update on new commits.

Recent kelos-workers tasks were evicted under node memory pressure
because the claude-code container regularly uses ~1.2-1.3 GiB while
only requesting 512 MiB. Bumping the request to 1 GiB reduces the
QoS-eviction risk for the two spawners that run the heaviest
claude-code workloads.

@cubic-dev-ai cubic-dev-ai Bot left a comment


No issues found across 2 files

