feat(harness): pause and ask to continue at max_turns#304
Conversation
When an interactive top-level turn reaches max_turns, the harness now parks the turn and asks the user whether to continue instead of silently ending. Replying resumes the same turn with a fresh budget (max_turns += default_max_turns; turn_count stays monotonic). Sub-agents and harness::run turns keep auto-ending (no interactive user to answer). - Park interactive top-level turns at the cap, reusing the non-terminal AwaitingFunctions state plus a new awaiting_continue flag on TurnRecord; a new interactive flag distinguishes send (true) from run/spawn (false). - Resume on the next harness::send under the session lock: extend the budget, bump the step, re-enqueue. harness::stop enqueues a step so a parked turn observes the abort and finalizes cancelled. - Fix invisible notices: append_custom now populates the custom message display/content, so the console renders the prompt (it dropped entries with empty content). The prior max_turns notice never showed. - Add SessionStatus::Waiting and surface it in the console (status unions, sidebar dot, attention dock signal) so a parked turn reads distinctly from idle. - Gate behind config ask_to_continue_on_max_turns (default true). Claude-Session: https://claude.ai/code/session_018YytfzKvsyBiTHi6b2u1rn
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughA new ChangesWaiting Status for Interactive max_turns Pause
Sequence Diagram(s)sequenceDiagram
participant User
participant send_handle as send::handle
participant turn_loop as run_step
participant TurnStore
participant SessionManager
participant WebConsole
Note over turn_loop: Turn reaches max_turns (interactive + top-level)
turn_loop->>TurnStore: park_awaiting_continue (set awaiting_continue=true, AwaitingFunctions)
turn_loop->>SessionManager: append notice entry (max_turns_await_continue)
SessionManager-->>WebConsole: status-changed → waiting
WebConsole-->>User: StatusDot (accent) + dock attention signal
User->>send_handle: send new message (resume intent)
send_handle->>TurnStore: seed_or_merge (interactive=true)
TurnStore-->>send_handle: existing turn with awaiting_continue=true
send_handle->>TurnStore: acquire lock + re-read
send_handle->>TurnStore: resume_awaiting_continue (extend budget, set Running)
send_handle->>turn_loop: enqueue_step
turn_loop-->>SessionManager: status-changed → working
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
skill-check — worker0 verified, 24 skipped (no docs/).
Four for four. Nicely done. |
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@harness/src/functions/send.rs`:
- Around line 296-303: The issue is that the record state mutations
(awaiting_continue, status, step increment) are persisted via
crate::state::put_turn before the crate::turn_loop::enqueue_step call, so if
enqueue_step fails, the turn record is left in a persisted but inconsistent
state with no queued step to process. Reorder the operations so that
crate::turn_loop::enqueue_step is called before crate::state::put_turn, ensuring
that if the enqueue operation fails, the record state changes are never
persisted and the turn remains in a valid terminal state.
In `@harness/src/functions/stop.rs`:
- Around line 62-72: The issue is that the state changes (awaiting_continue =
false and status = Running) are persisted via put_turn before the enqueue_step
call, so if enqueue_step fails and the function is retried, the parked flag will
be false and the enqueue won't be retried, causing the turn to stall. Reorder
the code so that the put_turn call happens after the enqueue_step call succeeds.
This ensures that if enqueue_step fails, the state remains unchanged and the
retry will still see parked as true and attempt the enqueue again.
In `@session-manager/tests/features/status.feature`:
- Around line 146-153: The test scenario at lines 146-153 passes a reason field
with a non-error status (waiting) but does not verify that status_reason is
cleared/ignored in the response. After the session::get call, add an assertion
that meta.status_reason is absent or null to ensure the contract is protected
that status_reason should only be present for error statuses. Additionally,
optionally add an assertion on the delivery event to ui::status to verify that
status_reason is not included in the event payload either.
In `@session-manager/tests/golden/schemas/session.set-status.json`:
- Around line 8-25: The schema now includes "waiting" as a valid status option
in the enum (alongside idle, working, done, and error), but there is a stale
description somewhere around line 45 of the session.set-status.json file that
only documents the original four statuses. Update the description field to
include "waiting" in the list of valid target statuses so the documentation
accurately reflects all possible enum values defined in the schema.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: db5b192c-8a12-4d3e-96ff-2388ebbcfde4
📒 Files selected for processing (24)
console/web/src/components/sidebar/ConversationRow.tsxconsole/web/src/lib/chat-activity.tsconsole/web/src/lib/sessions/types.tsconsole/web/src/types/chat.tsharness/src/clients/session.rsharness/src/config.rsharness/src/functions/run.rsharness/src/functions/send.rsharness/src/functions/stop.rsharness/src/subagent.rsharness/src/turn_loop.rsharness/src/types/turn.rssession-manager/src/types.rssession-manager/tests/features/status.featuresession-manager/tests/golden/schemas/session.create.jsonsession-manager/tests/golden/schemas/session.ensure.jsonsession-manager/tests/golden/schemas/session.fork.jsonsession-manager/tests/golden/schemas/session.get.jsonsession-manager/tests/golden/schemas/session.list.jsonsession-manager/tests/golden/schemas/session.set-meta.jsonsession-manager/tests/golden/schemas/session.set-status.jsonsession-manager/tests/golden/schemas/session.store.get-meta.jsonsession-manager/tests/golden/schemas/session.store.list-metas.jsonsession-manager/tests/golden/schemas/session.store.put-meta.json
| record.awaiting_continue = false; | ||
| record.options.max_turns = extended_max_turns(record.options.max_turns, cfg.default_max_turns); | ||
| record.step += 1; | ||
| record.status = TurnStatus::Running; | ||
| record.updated_at = AgentMessage::now_ms(); | ||
| crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await?; | ||
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | ||
| .await?; |
There was a problem hiding this comment.
Rollback parked-state mutations if resume enqueue fails.
Line 301 commits awaiting_continue = false + status = Running + step/budget bump before Line 302 enqueue. If enqueue fails, the turn can get stuck non-terminal with no queued step to continue or cancel it.
Suggested fix
async fn resume_awaiting_continue(
deps: &Deps,
cfg: &WorkerConfig,
record: &mut TurnRecord,
) -> Result<(), HarnessError> {
+ let prev_step = record.step;
+ let prev_max_turns = record.options.max_turns;
record.awaiting_continue = false;
record.options.max_turns = extended_max_turns(record.options.max_turns, cfg.default_max_turns);
record.step += 1;
record.status = TurnStatus::Running;
record.updated_at = AgentMessage::now_ms();
crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await?;
- crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step)
- .await?;
+ if let Err(err) =
+ crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step)
+ .await
+ {
+ // Keep the turn resumable on retry if enqueue fails.
+ record.awaiting_continue = true;
+ record.options.max_turns = prev_max_turns;
+ record.step = prev_step;
+ record.status = TurnStatus::AwaitingFunctions;
+ record.updated_at = AgentMessage::now_ms();
+ let _ = crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await;
+ return Err(err);
+ }
Ok(())
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| record.awaiting_continue = false; | |
| record.options.max_turns = extended_max_turns(record.options.max_turns, cfg.default_max_turns); | |
| record.step += 1; | |
| record.status = TurnStatus::Running; | |
| record.updated_at = AgentMessage::now_ms(); | |
| crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await?; | |
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | |
| .await?; | |
| async fn resume_awaiting_continue( | |
| deps: &Deps, | |
| cfg: &WorkerConfig, | |
| record: &mut TurnRecord, | |
| ) -> Result<(), HarnessError> { | |
| let prev_step = record.step; | |
| let prev_max_turns = record.options.max_turns; | |
| record.awaiting_continue = false; | |
| record.options.max_turns = extended_max_turns(record.options.max_turns, cfg.default_max_turns); | |
| record.step += 1; | |
| record.status = TurnStatus::Running; | |
| record.updated_at = AgentMessage::now_ms(); | |
| crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await?; | |
| if let Err(err) = | |
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | |
| .await | |
| { | |
| // Keep the turn resumable on retry if enqueue fails. | |
| record.awaiting_continue = true; | |
| record.options.max_turns = prev_max_turns; | |
| record.step = prev_step; | |
| record.status = TurnStatus::AwaitingFunctions; | |
| record.updated_at = AgentMessage::now_ms(); | |
| let _ = crate::state::put_turn(&deps.iii, record, cfg.session_timeout_ms).await; | |
| return Err(err); | |
| } | |
| Ok(()) | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/src/functions/send.rs` around lines 296 - 303, The issue is that the
record state mutations (awaiting_continue, status, step increment) are persisted
via crate::state::put_turn before the crate::turn_loop::enqueue_step call, so if
enqueue_step fails, the turn record is left in a persisted but inconsistent
state with no queued step to process. Reorder the operations so that
crate::turn_loop::enqueue_step is called before crate::state::put_turn, ensuring
that if the enqueue operation fails, the record state changes are never
persisted and the turn remains in a valid terminal state.
| let parked = record.awaiting_continue; | ||
| if parked { | ||
| record.awaiting_continue = false; | ||
| record.step += 1; | ||
| record.status = crate::types::turn::TurnStatus::Running; | ||
| } | ||
| crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await?; | ||
| if parked { | ||
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | ||
| .await?; | ||
| } |
There was a problem hiding this comment.
Preserve retryability when stop enqueue fails for parked turns.
After Line 68 persists awaiting_continue = false, an enqueue failure at Line 70 returns an error but leaves the turn in Running with no queued step. A retry then sees parked = false and won’t enqueue, so cancellation can stall indefinitely.
Suggested fix
let parked = record.awaiting_continue;
+ let prev_step = record.step;
if parked {
record.awaiting_continue = false;
record.step += 1;
record.status = crate::types::turn::TurnStatus::Running;
}
crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await?;
if parked {
- crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step)
- .await?;
+ if let Err(err) =
+ crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step)
+ .await
+ {
+ // Restore parked state so retry can enqueue again.
+ record.awaiting_continue = true;
+ record.step = prev_step;
+ record.status = crate::types::turn::TurnStatus::AwaitingFunctions;
+ record.updated_at = crate::types::message::AgentMessage::now_ms();
+ let _ = crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await;
+ return Err(err);
+ }
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let parked = record.awaiting_continue; | |
| if parked { | |
| record.awaiting_continue = false; | |
| record.step += 1; | |
| record.status = crate::types::turn::TurnStatus::Running; | |
| } | |
| crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await?; | |
| if parked { | |
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | |
| .await?; | |
| } | |
| let parked = record.awaiting_continue; | |
| let prev_step = record.step; | |
| if parked { | |
| record.awaiting_continue = false; | |
| record.step += 1; | |
| record.status = crate::types::turn::TurnStatus::Running; | |
| } | |
| crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await?; | |
| if parked { | |
| if let Err(err) = | |
| crate::turn_loop::enqueue_step(&deps.iii, &record.session_id, &record.turn_id, record.step) | |
| .await | |
| { | |
| // Restore parked state so retry can enqueue again. | |
| record.awaiting_continue = true; | |
| record.step = prev_step; | |
| record.status = crate::types::turn::TurnStatus::AwaitingFunctions; | |
| record.updated_at = crate::types::message::AgentMessage::now_ms(); | |
| let _ = crate::state::put_turn(&deps.iii, &record, cfg.session_timeout_ms).await; | |
| return Err(err); | |
| } | |
| } |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@harness/src/functions/stop.rs` around lines 62 - 72, The issue is that the
state changes (awaiting_continue = false and status = Running) are persisted via
put_turn before the enqueue_step call, so if enqueue_step fails and the function
is retried, the parked flag will be false and the enqueue won't be retried,
causing the turn to stall. Reorder the code so that the put_turn call happens
after the enqueue_step call succeeds. This ensures that if enqueue_step fails,
the state remains unchanged and the retry will still see parked as true and
attempt the enqueue again.
| { "session_id": "s_001", "status": "waiting", "reason": "max_turns" } | ||
| """ | ||
| And I call "session::get" with: | ||
| """ | ||
| { "session_id": "s_001" } | ||
| """ | ||
| Then the response field "meta.status" is "waiting" | ||
| And delivery 1 to "ui::status" has "status" = "waiting" |
There was a problem hiding this comment.
Assert status_reason is cleared for waiting transitions.
Line 146 passes a reason with a non-error status, but this scenario doesn’t verify that status_reason is actually cleared/ignored. Please add an assertion after session::get that meta.status_reason is absent (and optionally the event payload too) so this contract stays protected.
Suggested test additions
Then the response field "meta.status" is "waiting"
+ And the response has no field "meta.status_reason"
And delivery 1 to "ui::status" has "status" = "waiting"
And delivery 1 to "ui::status" has "previous_status" = "working"
+ And delivery 1 to "ui::status" has no field "status_reason"🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@session-manager/tests/features/status.feature` around lines 146 - 153, The
test scenario at lines 146-153 passes a reason field with a non-error status
(waiting) but does not verify that status_reason is cleared/ignored in the
response. After the session::get call, add an assertion that meta.status_reason
is absent or null to ensure the contract is protected that status_reason should
only be present for error statuses. Additionally, optionally add an assertion on
the delivery event to ui::status to verify that status_reason is not included in
the event payload either.
| "oneOf": [ | ||
| { | ||
| "enum": [ | ||
| "idle", | ||
| "working", | ||
| "done", | ||
| "error" | ||
| ], | ||
| "type": "string" | ||
| }, | ||
| { | ||
| "description": "Parked mid-turn awaiting a user decision (e.g. the harness reached `max_turns` and is asking whether to continue).", | ||
| "enum": [ | ||
| "waiting" | ||
| ], | ||
| "type": "string" | ||
| } | ||
| ] |
There was a problem hiding this comment.
Update stale request status description to include waiting.
Line 45 still documents target statuses as idle / working / done / error, but the schema now accepts waiting too. Please align the description to the actual enum contract.
Suggested patch
- "description": "Target status: `idle` / `working` / `done` / `error`."
+ "description": "Target status: `idle` / `working` / `waiting` / `done` / `error`."🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@session-manager/tests/golden/schemas/session.set-status.json` around lines 8
- 25, The schema now includes "waiting" as a valid status option in the enum
(alongside idle, working, done, and error), but there is a stale description
somewhere around line 45 of the session.set-status.json file that only documents
the original four statuses. Update the description field to include "waiting" in
the list of valid target statuses so the documentation accurately reflects all
possible enum values defined in the schema.
Summary
When an interactive top-level turn hits
max_turns, the harness now pauses and asks the user whether to continue instead of silently ending. Replying resumes the same turn with a fresh budget (the turn count keeps climbing). Sub-agents andharness::runturns keep auto-ending — there is no interactive user to answer.This also fixes a latent bug: harness
noticeentries were written with emptycontent, so the console dropped them — the existing max_turns notice was invisible.Behavior
waitingstate.max_turns += default_max_turns.harness::stopcancels a parked turn cleanly.Changes
Harness
awaiting_continue+interactiveflags onTurnRecord(both#[serde(default)], back-compat safe — no public function-schema change).max_turnsguard parks interactive top-level turns (reusing the non-terminalAwaitingFunctionsstate) with a visible notice and awaitingsession status, instead of finalizing.harness::sendresumes a parked turn under the session lock: extends the budget, bumps the step, re-enqueues.harness::run/harness::spawnseedinteractive = false→ auto-end at the cap.harness::stopenqueues a step so a parked turn observesabort→ cancelled.append_customgains adisplayargument so the notice renders.ask_to_continue_on_max_turns(defaulttrue).Waiting status (session-manager + console)
SessionStatus::Waitingvariant (golden schemas regenerated).'waiting'added to the status unions, a sidebar status dot, and anattentiondock signal.Testing
cargo test— 90 passing.cargo test— 66 unit + 118 cucumber (incl. a newwaitingset-status scenario); goldens regenerated.tsc -btypecheck,biome check,vitest, and fullvite build— all green.https://claude.ai/code/session_018YytfzKvsyBiTHi6b2u1rn
Summary by CodeRabbit