Add context preflight and manual compaction#82
Conversation
📝 WalkthroughSummary by CodeRabbit
WalkthroughAdds configurable context token budgeting and preflight checks, implements session compaction planning and summarization (disabled tools), normalizes model-facing messages, persists model-facing metadata and compaction entries, integrates budgeting into runtime model flow, and provides terminal command/state/rendering for manual compaction plus tests and docs. ChangesContext Compaction System
Sequence Diagram(s) sequenceDiagram
participant App as Terminal (compact)
participant Runtime as Runtime.CompactSessionFrom
participant Sessions as SessionRepository
participant Planner as planCompaction
participant Client as CompletionClient
App->>Runtime: trigger CompactSessionFrom(sessionID, cwd, parent)
Runtime->>Sessions: load branch entries
Runtime->>Planner: compute compaction plan (summarize vs keep)
Runtime->>Client: request summary (DisableTools=true, system prompt, messages)
Client-->>Runtime: return summary text
Runtime->>Sessions: AppendCompaction(summary, firstKeptEntryID, tokensBefore)
Runtime->>App: return appended compaction entry
Estimated code review effort 🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
Codecov Report❌ Patch coverage is Additional details and impacted files@@ Coverage Diff @@
## main #82 +/- ##
==========================================
+ Coverage 66.76% 67.44% +0.68%
==========================================
Files 207 211 +4
Lines 17869 18531 +662
==========================================
+ Hits 11930 12498 +568
- Misses 4841 4888 +47
- Partials 1098 1145 +47
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Harness. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
internal/terminal/prompt_queue.go (1)
27-27: ⚡ Quick winConsider using
busy()consistently across the codebase.This code now uses
app.busy()to guard queue processing, butinternal/terminal/message_layout.go:159explicitly checksapp.working || app.compacting. For better maintainability, consider using thebusy()abstraction consistently wherever both states need to be checked.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/terminal/prompt_queue.go` at line 27, Replace direct checks of the two state flags with the busy() abstraction: wherever you see code checking app.working || app.compacting (e.g. in the message layout rendering logic that references working and compacting), change it to call app.busy() instead so the busy logic is centralized and consistent with the queue guard that already uses app.busy(); verify busy() correctly returns (app.working || app.compacting) before making the replacements.internal/terminal/render_test.go (1)
301-314: ⚡ Quick winAssert the rendered compacting colors, not just the palette selector.
This test currently proves the compacting text and
workingShimmerPalette()output, but it would still pass if the render path kept applying the default shimmer palette. Please render the line into a frame and assert the spinner/content cell colors too.Suggested test tightening
func TestRenderCompactingIndicatorUsesCompactionTextAndPalette(t *testing.T) { t.Parallel() app := newRenderTestApp(t) app.compacting = true app.workFrame = 0 lines := app.renderWorkingIndicator(40) + require.Len(t, lines, 3) if !strings.HasPrefix(lines[1].Text, "⠋ Compacting context...") { t.Fatalf("compacting indicator text = %q", lines[1].Text) } - if got, want := app.workingShimmerPalette().bright, compactingShimmerBrightColor(); got != want { + palette := app.workingShimmerPalette() + if got, want := palette.bright, compactingShimmerBrightColor(); got != want { t.Fatalf("compacting shimmer bright = %v, want %v", got, want) } + app.frame = newCellBuffer(40, len(lines), tcell.StyleDefault) + app.writeStyledLine(1, 40, lines[1]) + if got, want := app.frame.cell(0, 1).Style.GetForeground(), palette.bright; got != want { + t.Fatalf("spinner foreground = %v, want %v", got, want) + } + if got, want := app.frame.cell(2, 1).Style.GetForeground(), workingShimmerColor(0, 0, len("Compacting context..."), palette); got != want { + t.Fatalf("label foreground = %v, want %v", got, want) + } }As per coding guidelines,
**/*_test.go: Prefer table-driven tests for core behavior and regression tests for terminal rendering bugs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/terminal/render_test.go` around lines 301 - 314, The test TestRenderCompactingIndicatorUsesCompactionTextAndPalette currently only checks the text and the palette selector; update it to render the produced line into a frame and assert the actual cell colors used by the spinner/content match the compacting palette values: use newRenderTestApp(...).renderWorkingIndicator(40) to get the lines, then render that line into a frame (via the same rendering helper used elsewhere in tests) and assert the spinner and content cell foreground (and background if applicable) equal compactingShimmerBrightColor() (and any other compacting shimmer values) rather than just comparing workingShimmerPalette(); keep the existing checks for text but add explicit assertions on the rendered cell Color/Attribute values to ensure the renderer applied the compacting palette.config.example.yaml (1)
48-54: ⚡ Quick winDocument
context.keep_recent_tokensin the sample config.
setDefaultsnow exposescontext.keep_recent_tokens, but the example config does not. That makes the retained-tail compaction budget effectively undiscoverable for users copying this file.Proposed update
context: preflight_enabled: true # 0 uses the selected model max_tokens, otherwise 16k capped at 20% of context. output_reserve_tokens: 0 provider_reserve_tokens: 2048 safety_margin_tokens: 8192 + keep_recent_tokens: 20000🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@config.example.yaml` around lines 48 - 54, Add a documented entry for context.keep_recent_tokens to the example config (under the existing context block) so users can see and copy the retained-tail compaction budget; the key name must match context.keep_recent_tokens and the value should reflect the default used by setDefaults (or explicitly state the numeric default if known), plus a brief comment explaining it controls the retained-tail compaction budget (how many recent tokens to always keep) and how it interacts with output_reserve_tokens/provider_reserve_tokens/safety_margin_tokens.internal/database/session_store.go (1)
237-247: ⚡ Quick winWrap the new entry-data failures with repository context.
These new decode/encode paths return raw errors, so callers lose whether the failure came from persisting
ModelFacingmetadata or rebuilding thinking-level state. This package already usesoops, so these branches should preserve that context too.Suggested fix
if options.modelFacing != nil { data, err := dataFromEntry(&entry) if err != nil { - return nil, err + return nil, oops.In("database").Code("decode_entry_data").Wrapf(err, "decode entry data") } data.ModelFacing = options.modelFacing dataJSON, err := dataJSONFromEntity(&data) if err != nil { - return nil, err + return nil, oops.In("database").Code("encode_entry_data").Wrapf(err, "encode entry data") } entry.DataJSON = normalizeDataJSON(dataJSON) }func applyThinkingLevelContext(contextEntity *SessionContextEntity, entry *EntryEntity) error { data, err := dataFromEntry(entry) if err != nil { - return err + return oops.In("database").Code("decode_entry_data").Wrapf(err, "decode thinking level entry data") } contextEntity.ThinkingLevel = data.ThinkingLevel return nil }Based on learnings: Applies to
**/*.go: Useoops.In("domain").Code("code").Wrapf(err, "message")for contextual errors in packages that already usesamber/oopsAlso applies to: 619-621
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/database/session_store.go` around lines 237 - 247, The error returns from dataFromEntry(&entry) and dataJSONFromEntity(&data) (and the similar spots around lines ~619-621) should be wrapped with samber/oops context so callers know the failure came from persisting ModelFacing vs rebuilding state; replace plain returns like `return nil, err` with oops wraps such as `return nil, oops.In("database").Code("decode_entry").Wrapf(err, "session_store: failed to decode entry for ModelFacing")` for dataFromEntry errors and `return nil, oops.In("database").Code("encode_entry").Wrapf(err, "session_store: failed to encode entry after setting ModelFacing")` for dataJSONFromEntity errors (consistent naming for code/domain is fine), and apply the same wrapping pattern to the analogous returns at the other referenced location (lines ~619-621) so all decode/encode failures include repository context.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/assistant/context_budget.go`:
- Around line 46-47: The code currently uses positiveOrDefault which treats 0 as
“unset” and re-applies defaults for ProviderReserve and SafetyMargin;
add/replace with a helper that honors explicit zero (e.g., nonNegativeOrDefault
or zeroAllowedOrDefault) that returns the policy value when it is >= 0 and only
returns the default when the policy value is negative/unset, then use that
helper for ProviderReserve and SafetyMargin (replace
positiveOrDefault(policy.ProviderReserveTokens, defaultContextProviderReserve)
and positiveOrDefault(policy.SafetyMarginTokens, defaultContextSafetyMargin) and
do the same replacement for the other occurrences around lines 135-140).
In `@internal/assistant/runtime_context.go`:
- Around line 36-39: modelFacingMessages is filtering out database.RoleCustom
(and other passthrough roles) because it calls isModelFacingRole which still
returns false for those roles; update the logic so model-facing
custom/passthrough roles are preserved: either modify isModelFacingRole to
return true for database.RoleCustom and the passthrough role constants or change
modelFacingMessages to treat database.RoleCustom (and the same passthrough roles
handled elsewhere) as model-facing (e.g., include a conditional that calls
modelFacingMessage(&message) when message.Role == database.RoleCustom). Ensure
the change is applied to all occurrences (the blocks around lines 36-39, 54-60,
66-74) so custom extension context is not dropped before request construction.
In `@internal/config/config.go`:
- Around line 67-74: The ContextConfig struct fields only have mapstructure tags
so JSON/YAML unmarshalling will use the Go field names; update ContextConfig by
adding json and yaml tags matching the existing snake_case mapstructure keys for
each field (e.g., add `json:"output_reserve_tokens"
yaml:"output_reserve_tokens"` for OutputReserveTokens,
`json:"provider_reserve_tokens" yaml:"provider_reserve_tokens"` for
ProviderReserveTokens, and similarly for SafetyMarginTokens, KeepRecentTokens,
and PreflightEnabled) so direct JSON/YAML loads and round-trips use the same
keys as mapstructure.
In `@internal/terminal/prompt_submit.go`:
- Around line 29-33: The prompt text is being recorded into history before
checking app.compacting which causes duplicates if submission is blocked; to
fix, check app.compacting first and bail out before any history recording so the
composer restoration doesn't create a duplicate entry—move the history-recording
statement(s) (the code that saves the prompt text) so they execute only after
the compacting check passes and a submission will actually be sent, leaving the
existing app.setComposerText and app.setStatus logic unchanged when returning
false.
---
Nitpick comments:
In `@config.example.yaml`:
- Around line 48-54: Add a documented entry for context.keep_recent_tokens to
the example config (under the existing context block) so users can see and copy
the retained-tail compaction budget; the key name must match
context.keep_recent_tokens and the value should reflect the default used by
setDefaults (or explicitly state the numeric default if known), plus a brief
comment explaining it controls the retained-tail compaction budget (how many
recent tokens to always keep) and how it interacts with
output_reserve_tokens/provider_reserve_tokens/safety_margin_tokens.
In `@internal/database/session_store.go`:
- Around line 237-247: The error returns from dataFromEntry(&entry) and
dataJSONFromEntity(&data) (and the similar spots around lines ~619-621) should
be wrapped with samber/oops context so callers know the failure came from
persisting ModelFacing vs rebuilding state; replace plain returns like `return
nil, err` with oops wraps such as `return nil,
oops.In("database").Code("decode_entry").Wrapf(err, "session_store: failed to
decode entry for ModelFacing")` for dataFromEntry errors and `return nil,
oops.In("database").Code("encode_entry").Wrapf(err, "session_store: failed to
encode entry after setting ModelFacing")` for dataJSONFromEntity errors
(consistent naming for code/domain is fine), and apply the same wrapping pattern
to the analogous returns at the other referenced location (lines ~619-621) so
all decode/encode failures include repository context.
In `@internal/terminal/prompt_queue.go`:
- Line 27: Replace direct checks of the two state flags with the busy()
abstraction: wherever you see code checking app.working || app.compacting (e.g.
in the message layout rendering logic that references working and compacting),
change it to call app.busy() instead so the busy logic is centralized and
consistent with the queue guard that already uses app.busy(); verify busy()
correctly returns (app.working || app.compacting) before making the
replacements.
In `@internal/terminal/render_test.go`:
- Around line 301-314: The test
TestRenderCompactingIndicatorUsesCompactionTextAndPalette currently only checks
the text and the palette selector; update it to render the produced line into a
frame and assert the actual cell colors used by the spinner/content match the
compacting palette values: use newRenderTestApp(...).renderWorkingIndicator(40)
to get the lines, then render that line into a frame (via the same rendering
helper used elsewhere in tests) and assert the spinner and content cell
foreground (and background if applicable) equal compactingShimmerBrightColor()
(and any other compacting shimmer values) rather than just comparing
workingShimmerPalette(); keep the existing checks for text but add explicit
assertions on the rendered cell Color/Attribute values to ensure the renderer
applied the compacting palette.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bede7f7e-6bc3-4e85-aec7-e58374542af3
📒 Files selected for processing (47)
config.example.yamldocs/context-management.mdinternal/assistant/anthropic.gointernal/assistant/anthropic_internal_test.gointernal/assistant/client.gointernal/assistant/context_budget.gointernal/assistant/context_budget_test.gointernal/assistant/context_build.gointernal/assistant/context_compaction.gointernal/assistant/context_compaction_test.gointernal/assistant/messages.gointernal/assistant/openai_chat.gointernal/assistant/provider_hooks_internal_test.gointernal/assistant/provider_payload_hook_internal_test.gointernal/assistant/retry.gointernal/assistant/runtime.gointernal/assistant/runtime_context.gointernal/assistant/runtime_entries.gointernal/assistant/runtime_model.gointernal/assistant/runtime_slash.gointernal/assistant/runtime_test.gointernal/assistant/tool_schema.gointernal/config/config.gointernal/config/loader.gointernal/config/loader_test.gointernal/database/entry_metadata.gointernal/database/session_repository_test.gointernal/database/session_store.gointernal/terminal/app.gointernal/terminal/async_events.gointernal/terminal/autocomplete.gointernal/terminal/commands.gointernal/terminal/compact_commands.gointernal/terminal/compact_commands_test.gointernal/terminal/extension_events.gointernal/terminal/input.gointernal/terminal/input_escape.gointernal/terminal/message_layout.gointernal/terminal/prompt_cancel.gointernal/terminal/prompt_queue.gointernal/terminal/prompt_send_test.gointernal/terminal/prompt_submit.gointernal/terminal/render.gointernal/terminal/render_lines.gointernal/terminal/render_parity_test.gointernal/terminal/render_test.gointernal/terminal/runtime_buffers.go
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
internal/terminal/prompt_send_test.go (1)
139-163: ⚡ Quick winConsider consolidating submit tests into a table-driven format.
The file now has four separate
TestSubmit*functions (lines 72–163) that each exercise different code paths throughsubmit(). Based on learnings, table-driven tests are preferred for core behavior. Consolidating these into a single table-driven test would improve maintainability and align with the guideline, though the current structure is consistent with the file's existing style.Example structure for consolidated table-driven test
func TestSubmit(t *testing.T) { t.Parallel() tests := []struct { name string composerText string setupApp func(*App) wantConsumed bool wantWorking bool wantMode appMode wantQueued []string wantComposer string wantHistory int wantRequest bool }{ { name: "ignores empty prompt", composerText: " ", setupApp: func(app *App) {}, wantConsumed: false, wantWorking: false, wantRequest: false, }, { name: "slash command opens panel", composerText: "/model", setupApp: func(app *App) {}, wantConsumed: false, wantMode: modePanel, wantRequest: false, }, { name: "queues when working", composerText: "queued follow-up", setupApp: func(app *App) { app.working = true }, wantConsumed: false, wantQueued: []string{"queued follow-up"}, wantRequest: false, }, { name: "restores prompt without history when compacting", composerText: "wait for compaction", setupApp: func(app *App) { app.compacting = true }, wantConsumed: false, wantComposer: "wait for compaction", wantHistory: 0, wantRequest: false, }, } for _, tc := range tests { t.Run(tc.name, func(t *testing.T) { t.Parallel() client := newTerminalPromptClient(newTerminalCompletionResult("ok"), nil) app := newPromptSendTestApp(t, client) tc.setupApp(app) app.setComposerText(tc.composerText) consumed, err := app.submit(context.Background()) if err != nil { t.Fatalf("submit error: %v", err) } if consumed != tc.wantConsumed { t.Errorf("consumed = %v, want %v", consumed, tc.wantConsumed) } // ... additional assertions }) } }Based on learnings: Prefer table-driven tests for core behavior.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/terminal/prompt_send_test.go` around lines 139 - 163, Consolidate the four TestSubmit* tests into a single table-driven TestSubmit that enumerates cases (empty prompt, slash command, queuing when app.working, and compacting restore) and for each case uses newTerminalPromptClient and newPromptSendTestApp, calls app.setComposerText, applies any setup via a setupApp(func(*App)) (e.g., set app.compacting or app.working or app.mode), then call app.submit(ctx) and assert consumed, app.composerText(), app.promptHistory length, queued prompts slice, and whether client.request is nil; run each case with t.Run and t.Parallel and keep existing helpers (submit(), newTerminalPromptClient, newPromptSendTestApp) and the compacting case that checks app.compacting and that no request was sent.internal/assistant/context_compaction_internal_test.go (1)
13-81: ⚡ Quick winPrefer table-driven coverage for
planCompactioncore scenarios.These two tests validate the same core planner surface with different fixtures; combining them into table-driven cases will reduce duplication and make boundary regressions easier to add.
Refactor sketch
-func TestPlanCompactionAfterPreviousCompactionUsesPreviousKeptBoundary(t *testing.T) { ... } -func TestPlanCompactionCutsAtTurnBoundaryWhenPossible(t *testing.T) { ... } +func TestPlanCompaction(t *testing.T) { + t.Parallel() + cases := []struct { + name string + entries []database.EntryEntity + keepRecentTokens int + assertPlan func(t *testing.T, plan compactionPlan) + }{ + // existing scenarios here + } + for _, tc := range cases { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + plan, err := planCompaction(tc.entries, tc.keepRecentTokens) + require.NoError(t, err) + tc.assertPlan(t, plan) + }) + } +}As per coding guidelines
**/*_test.go: Prefer table-driven tests for core behavior and regression tests for terminal rendering bugs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/assistant/context_compaction_internal_test.go` around lines 13 - 81, Combine the two nearly-identical tests into a table-driven test that iterates cases for planCompaction using compact fixtures created by compactionTestEntry; specifically, create a single TestPlanCompaction_TableDriven (or rename one of the existing tests) that defines cases (e.g. "uses previous kept boundary" and "cuts at turn boundary") each containing the input slice of database.EntryEntity, the target token budget (int), and expected fields (PreviousSummary, SummarizedEntryIDs, KeptEntryIDs, FirstKeptEntryID, Messages contents/length). For each case call planCompaction, assert no error, then compare the returned plan against the case expectations (use the same field assertions currently in TestPlanCompactionAfterPreviousCompactionUsesPreviousKeptBoundary and TestPlanCompactionCutsAtTurnBoundaryWhenPossible) to preserve behavior while removing duplication and making it easy to add more boundary/regression cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/assistant/context_budget_test.go`:
- Around line 102-104: The test currently checks numeric values from
usage.Breakdown which can be nil or missing and still return 0, producing false
positives; update the assertions to first verify the map contains the keys
before asserting their values. Specifically, in the test that uses the variable
usage and its map field Breakdown, add presence checks (e.g.,
require.Contains/require.True with a map lookup ok) for "reserve_provider" and
"reserve_safety" on usage.Breakdown, then assert their values (assert.Equal) as
before; ensure you still require.NoError(t, err) first and keep the same key
names.
In `@internal/assistant/context_compaction.go`:
- Around line 507-513: The token estimation uses the wrong prompt: replace uses
of compactionSummaryPrompt inside compactionRequestUsage with the actual dynamic
compactionSystemPrompt(plan.PreviousSummary) so the Usage field accurately
estimates tokens for the current SystemPrompt; update the call site where Usage
is set (the struct initializing SystemPrompt:
compactionSystemPrompt(plan.PreviousSummary), ThinkingLevel, CWD, Auth,
Messages, Usage: compactionRequestUsage(selectedModel, plan.Messages), Model:
*selectedModel) to pass compactionSystemPrompt(plan.PreviousSummary) (and make
the same replacement at the other compactionRequestUsage occurrence handling
update compactions) so preflight estimation uses the real system prompt.
---
Nitpick comments:
In `@internal/assistant/context_compaction_internal_test.go`:
- Around line 13-81: Combine the two nearly-identical tests into a table-driven
test that iterates cases for planCompaction using compact fixtures created by
compactionTestEntry; specifically, create a single
TestPlanCompaction_TableDriven (or rename one of the existing tests) that
defines cases (e.g. "uses previous kept boundary" and "cuts at turn boundary")
each containing the input slice of database.EntryEntity, the target token budget
(int), and expected fields (PreviousSummary, SummarizedEntryIDs, KeptEntryIDs,
FirstKeptEntryID, Messages contents/length). For each case call planCompaction,
assert no error, then compare the returned plan against the case expectations
(use the same field assertions currently in
TestPlanCompactionAfterPreviousCompactionUsesPreviousKeptBoundary and
TestPlanCompactionCutsAtTurnBoundaryWhenPossible) to preserve behavior while
removing duplication and making it easy to add more boundary/regression cases.
In `@internal/terminal/prompt_send_test.go`:
- Around line 139-163: Consolidate the four TestSubmit* tests into a single
table-driven TestSubmit that enumerates cases (empty prompt, slash command,
queuing when app.working, and compacting restore) and for each case uses
newTerminalPromptClient and newPromptSendTestApp, calls app.setComposerText,
applies any setup via a setupApp(func(*App)) (e.g., set app.compacting or
app.working or app.mode), then call app.submit(ctx) and assert consumed,
app.composerText(), app.promptHistory length, queued prompts slice, and whether
client.request is nil; run each case with t.Run and t.Parallel and keep existing
helpers (submit(), newTerminalPromptClient, newPromptSendTestApp) and the
compacting case that checks app.compacting and that no request was sent.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bc5b97bc-6e1b-4465-8cac-0d8031ed68e1
📒 Files selected for processing (19)
config.example.yamlinternal/assistant/anthropic.gointernal/assistant/context_budget.gointernal/assistant/context_budget_test.gointernal/assistant/context_compaction.gointernal/assistant/context_compaction_internal_test.gointernal/assistant/context_compaction_test.gointernal/assistant/messages.gointernal/assistant/openai_chat.gointernal/assistant/runtime_context.gointernal/assistant/runtime_context_internal_test.gointernal/config/config.gointernal/database/session_store.gointernal/terminal/compact_commands.gointernal/terminal/compact_commands_test.gointernal/terminal/message_layout.gointernal/terminal/prompt_send_test.gointernal/terminal/prompt_submit.gointernal/terminal/render_test.go
✅ Files skipped from review due to trivial changes (1)
- config.example.yaml
🚧 Files skipped from review as they are similar to previous changes (9)
- internal/terminal/message_layout.go
- internal/terminal/render_test.go
- internal/terminal/prompt_submit.go
- internal/terminal/compact_commands_test.go
- internal/assistant/openai_chat.go
- internal/config/config.go
- internal/assistant/messages.go
- internal/terminal/compact_commands.go
- internal/database/session_store.go
| require.NoError(t, err) | ||
| assert.Equal(t, 0, usage.Breakdown["reserve_provider"]) | ||
| assert.Equal(t, 0, usage.Breakdown["reserve_safety"]) |
There was a problem hiding this comment.
Prevent false-positive map assertions in this test.
Line 103 and Line 104 can pass even if usage.Breakdown is nil or keys are missing, because map lookups return 0 by default. Assert map/key presence first.
Proposed test hardening
require.NoError(t, err)
+ require.NotNil(t, usage.Breakdown)
+ require.Contains(t, usage.Breakdown, "reserve_provider")
+ require.Contains(t, usage.Breakdown, "reserve_safety")
assert.Equal(t, 0, usage.Breakdown["reserve_provider"])
assert.Equal(t, 0, usage.Breakdown["reserve_safety"])📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| require.NoError(t, err) | |
| assert.Equal(t, 0, usage.Breakdown["reserve_provider"]) | |
| assert.Equal(t, 0, usage.Breakdown["reserve_safety"]) | |
| require.NoError(t, err) | |
| require.NotNil(t, usage.Breakdown) | |
| require.Contains(t, usage.Breakdown, "reserve_provider") | |
| require.Contains(t, usage.Breakdown, "reserve_safety") | |
| assert.Equal(t, 0, usage.Breakdown["reserve_provider"]) | |
| assert.Equal(t, 0, usage.Breakdown["reserve_safety"]) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/assistant/context_budget_test.go` around lines 102 - 104, The test
currently checks numeric values from usage.Breakdown which can be nil or missing
and still return 0, producing false positives; update the assertions to first
verify the map contains the keys before asserting their values. Specifically, in
the test that uses the variable usage and its map field Breakdown, add presence
checks (e.g., require.Contains/require.True with a map lookup ok) for
"reserve_provider" and "reserve_safety" on usage.Breakdown, then assert their
values (assert.Equal) as before; ensure you still require.NoError(t, err) first
and keep the same key names.
| SystemPrompt: compactionSystemPrompt(plan.PreviousSummary), | ||
| ThinkingLevel: thinkingOff, | ||
| CWD: cwd, | ||
| Auth: auth, | ||
| Messages: plan.Messages, | ||
| Usage: compactionRequestUsage(selectedModel, plan.Messages), | ||
| Model: *selectedModel, |
There was a problem hiding this comment.
Use the actual compaction system prompt for token usage estimation.
SystemPrompt is dynamic (compactionSystemPrompt(plan.PreviousSummary)), but compactionRequestUsage still estimates with compactionSummaryPrompt. For update compactions, this underestimates context/input tokens and can let an oversized summarize request slip past preflight.
💡 Proposed fix
func (runtime *Runtime) summarizeCompaction(
ctx context.Context,
cwd string,
sessionID string,
selectedModel *model.Model,
auth model.RequestAuth,
plan *compactionPlan,
) (string, error) {
+ systemPrompt := compactionSystemPrompt(plan.PreviousSummary)
request := &CompletionRequest{
OnEvent: nil,
OnProviderObserve: runtime.emitProviderRequest,
OnProviderRequest: runtime.dispatchProviderRequestHook,
@@
- SystemPrompt: compactionSystemPrompt(plan.PreviousSummary),
+ SystemPrompt: systemPrompt,
ThinkingLevel: thinkingOff,
CWD: cwd,
Auth: auth,
Messages: plan.Messages,
- Usage: compactionRequestUsage(selectedModel, plan.Messages),
+ Usage: compactionRequestUsage(selectedModel, systemPrompt, plan.Messages),
Model: *selectedModel,
ProviderAttempt: 0,
}
@@
-func compactionRequestUsage(selectedModel *model.Model, messages []database.MessageEntity) model.TokenUsage {
+func compactionRequestUsage(
+ selectedModel *model.Model,
+ systemPrompt string,
+ messages []database.MessageEntity,
+) model.TokenUsage {
@@
- inputTokens := estimateInputTokens(compactionSummaryPrompt, messages)
+ inputTokens := estimateInputTokens(systemPrompt, messages)
@@
Breakdown: map[string]int{
- jsonSystemRole: estimateTokens(compactionSummaryPrompt),
+ jsonSystemRole: estimateTokens(systemPrompt),
contextBreakdownHistory: estimateMessageTokens(messages),
},Also applies to: 536-547
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/assistant/context_compaction.go` around lines 507 - 513, The token
estimation uses the wrong prompt: replace uses of compactionSummaryPrompt inside
compactionRequestUsage with the actual dynamic
compactionSystemPrompt(plan.PreviousSummary) so the Usage field accurately
estimates tokens for the current SystemPrompt; update the call site where Usage
is set (the struct initializing SystemPrompt:
compactionSystemPrompt(plan.PreviousSummary), ThinkingLevel, CWD, Auth,
Messages, Usage: compactionRequestUsage(selectedModel, plan.Messages), Model:
*selectedModel) to pass compactionSystemPrompt(plan.PreviousSummary) (and make
the same replacement at the other compactionRequestUsage occurrence handling
update compactions) so preflight estimation uses the real system prompt.
|
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
internal/terminal/compact_commands_test.go (1)
101-194: 🏗️ Heavy liftRefactor compact command state-transition tests into table-driven cases.
These added tests cover core behavior and share substantial setup/state assertions. A table-driven harness would reduce duplication and make future event variants easier to add consistently.
Based on learnings Applies to **/*_test.go : Prefer table-driven tests for core behavior and regression tests for terminal rendering bugs.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/terminal/compact_commands_test.go` around lines 101 - 194, The three tests (TestHandleCompactErrorUpdatesState, TestRunCompactSessionPostsError, TestCancelActiveOperationCancelsCompaction) duplicate setup/assert patterns for compaction state transitions; refactor them into a single table-driven test that iterates cases (e.g., "compact error updates state", "run compact posts provider error", "cancel active operation cancels compaction") and for each case sets up the app state (app.compacting, app.activeCompaction / activeCompactionState{Cancel:...}, session/client where needed), invokes the relevant method (app.handleCompactAsyncEvent, app.runCompactSession, app.cancelActiveOperation), and asserts the shared outcomes (app.compacting false, app.activeCompaction nil, transcript last message or statusMessage contains expected text); reference the existing symbols handleCompactAsyncEvent, runCompactSession, cancelActiveOperation, activeCompactionState, app.transcript, and statusMessage to locate code paths and replace the three tests with table entries that parameterize inputs, invocation, and expected post-conditions.internal/assistant/context_compaction_internal_test.go (1)
84-144: 🏗️ Heavy liftConvert these compaction planner scenarios to a single table-driven test.
These are core behavior permutations and currently repeat setup/assertion patterns. A table-driven form will make adding edge cases cheaper and keep assertions consistent.
Refactor sketch
+func TestPlanCompactionScenarios(t *testing.T) { + t.Parallel() + + tests := []struct { + name string + entries []database.EntryEntity + keep int + assert func(t *testing.T, plan compactionPlan, err error) + }{ + // previous-kept-boundary + // fallback-default-keep-recent + // reject-latest-compaction + } + + for _, tc := range tests { + tc := tc + t.Run(tc.name, func(t *testing.T) { + t.Parallel() + plan, err := planCompaction(tc.entries, tc.keep) + tc.assert(t, plan, err) + }) + } +}As per coding guidelines
**/*_test.go: Prefer table-driven tests for core behavior and regression tests for terminal rendering bugs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/assistant/context_compaction_internal_test.go` around lines 84 - 144, Combine the two planCompaction scenarios into one table-driven test (e.g., replace TestPlanCompactionFallsBackToDefaultKeepRecentTokens and TestPlanCompactionRejectsLatestCompaction with TestPlanCompaction_TableDriven) that defines cases with fields: name, entries (built via compactionTestEntry), tokenBudget, wantErr (bool), wantSummarizedNonEmpty (bool), wantKeptNonEmpty (bool), and expectedFirstKeptEntryID when relevant; iterate cases calling planCompaction(entries, tokenBudget) and assert according to the case (check err presence, plan.SummarizedEntryIDs, plan.KeptEntryIDs, and plan.FirstKeptEntryID for the latest-summary case where you must set latestSummary.CompactionFirstKeptEntryID). Leave compactionSystemPrompt test as a separate, small test (it already asserts compactionSystemPrompt contains "Update the existing compaction summary" and the previous summary text).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@internal/terminal/compact_commands_test.go`:
- Around line 156-159: The select waiting on client.ready in the test uses a
tight 1s timeout (case <-time.After(time.Second)) which causes flakiness on CI;
increase the async startup timeout to something more generous (e.g., 5s or 10s)
in that select so the case <-time.After(...) gives the provider more time before
calling t.Fatal("compaction should start provider request") — update the timeout
literal where the select checks client.ready to a larger duration.
---
Nitpick comments:
In `@internal/assistant/context_compaction_internal_test.go`:
- Around line 84-144: Combine the two planCompaction scenarios into one
table-driven test (e.g., replace
TestPlanCompactionFallsBackToDefaultKeepRecentTokens and
TestPlanCompactionRejectsLatestCompaction with TestPlanCompaction_TableDriven)
that defines cases with fields: name, entries (built via compactionTestEntry),
tokenBudget, wantErr (bool), wantSummarizedNonEmpty (bool), wantKeptNonEmpty
(bool), and expectedFirstKeptEntryID when relevant; iterate cases calling
planCompaction(entries, tokenBudget) and assert according to the case (check err
presence, plan.SummarizedEntryIDs, plan.KeptEntryIDs, and plan.FirstKeptEntryID
for the latest-summary case where you must set
latestSummary.CompactionFirstKeptEntryID). Leave compactionSystemPrompt test as
a separate, small test (it already asserts compactionSystemPrompt contains
"Update the existing compaction summary" and the previous summary text).
In `@internal/terminal/compact_commands_test.go`:
- Around line 101-194: The three tests (TestHandleCompactErrorUpdatesState,
TestRunCompactSessionPostsError, TestCancelActiveOperationCancelsCompaction)
duplicate setup/assert patterns for compaction state transitions; refactor them
into a single table-driven test that iterates cases (e.g., "compact error
updates state", "run compact posts provider error", "cancel active operation
cancels compaction") and for each case sets up the app state (app.compacting,
app.activeCompaction / activeCompactionState{Cancel:...}, session/client where
needed), invokes the relevant method (app.handleCompactAsyncEvent,
app.runCompactSession, app.cancelActiveOperation), and asserts the shared
outcomes (app.compacting false, app.activeCompaction nil, transcript last
message or statusMessage contains expected text); reference the existing symbols
handleCompactAsyncEvent, runCompactSession, cancelActiveOperation,
activeCompactionState, app.transcript, and statusMessage to locate code paths
and replace the three tests with table entries that parameterize inputs,
invocation, and expected post-conditions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3403c9fa-b397-478c-a825-6925f0f79419
📒 Files selected for processing (5)
internal/assistant/context_budget_test.gointernal/assistant/context_compaction_internal_test.gointernal/assistant/context_compaction_test.gointernal/assistant/runtime_context_internal_test.gointernal/terminal/compact_commands_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- internal/assistant/context_budget_test.go
| select { | ||
| case <-client.ready: | ||
| case <-time.After(time.Second): | ||
| t.Fatal("compaction should start provider request") |
There was a problem hiding this comment.
Increase async startup timeout to reduce test flakiness.
The 1s wait is tight for contended CI runners and can cause intermittent failures unrelated to correctness.
Low-risk stabilization diff
- case <-time.After(time.Second):
+ case <-time.After(3 * time.Second):
t.Fatal("compaction should start provider request")📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| select { | |
| case <-client.ready: | |
| case <-time.After(time.Second): | |
| t.Fatal("compaction should start provider request") | |
| select { | |
| case <-client.ready: | |
| case <-time.After(3 * time.Second): | |
| t.Fatal("compaction should start provider request") |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@internal/terminal/compact_commands_test.go` around lines 156 - 159, The
select waiting on client.ready in the test uses a tight 1s timeout (case
<-time.After(time.Second)) which causes flakiness on CI; increase the async
startup timeout to something more generous (e.g., 5s or 10s) in that select so
the case <-time.After(...) gives the provider more time before calling
t.Fatal("compaction should start provider request") — update the timeout literal
where the select checks client.ready to a larger duration.



Summary
/compactwith async terminal spinner/cancellation handlingValidation
mise exec -- task ciNotes
docs/refactoring/*.mdfiles uncommitted