
feat(endpointing): emit turn-outcome events for the endpointing eval (#505) #511

Merged: rosscado merged 1 commit into main from feat/turn-outcome-events-505 on Jul 5, 2026

rosscado commented Jul 5, 2026


Closes #505.

What & why

Reports a text-free turn-outcome to the SayPi API (POST /turn-outcome) after every endpointing-driven auto-submit, so the server can score how good pFinishedSpeaking is at deciding when a user's turn ended. The only non-circular ground truth for "did the user actually finish?" is whether they started speaking again inside the resume window (submit → assistant response starts) — a resume means we cut them off. API side is saypi-api PR #310; design/rationale in PR #309.

Approach

The whole window already lives in ConversationMachine, so this rides existing transitions rather than inventing detection (no VAD/DOM/observer changes):

  • Window OPEN: converting.submitting snapshots the correlation data (session id, last sequence number + its pFinishedSpeaking, speech-offset, app) into context. Only the endpointing auto-submit path reaches this state, so every emit is trigger:"auto"; manual sends bypass the machine and emit nothing.
  • Window CLOSE: the single transition that leaves responding.piThinking, which is one of piWriting/piSpeaking (response started), userInterrupting (user resumed first, a false finish), or the PI_THINKING_TIMEOUT_MS fallback (response never started). piThinking is entered/left once per turn, so exactly one event fires.
  • postTurnOutcome(): a thin, fire-and-forget POST in a new TurnOutcomeModule, reusing the shared callApi client (JWT/anon + CSP-safe background routing + 401-retry). Never awaited, never retried, never throws to the machine; a dropped event just leaves that turn to the server-side fallback. Payload is booleans, timestamps, a sequence number and an optional score, with no transcript text.
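
The open/close/emit flow above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the code in this PR: `ResumeWindow`, `CloseReason`, `Poster`, and every payload field other than `trigger`, `user_resumed`, `submitted_at` and `last_sequence_number` are hypothetical names, and the real logic lives in ConversationMachine transitions and goes through `callApi` rather than an injected function.

```typescript
// Illustrative sketch only: names are hypothetical unless quoted in the PR text.
type CloseReason = "response_started" | "user_resumed" | "timeout";

interface TurnSnapshot {
  sessionId: string;
  lastSequenceNumber: number;
  pFinishedSpeaking?: number; // optional score for that sequence number
  submittedAt: number; // epoch ms, taken at converting.submitting
}

interface TurnOutcome {
  session_id: string;
  last_sequence_number: number;
  p_finished_speaking?: number;
  trigger: "auto";
  user_resumed: boolean;
  response_started: boolean;
  submitted_at: string; // ISO 8601
  window_closed_at: string; // ISO 8601
}

class ResumeWindow {
  private pending: TurnSnapshot | null = null;

  // Window OPEN: remember the correlation data for this turn.
  open(snapshot: TurnSnapshot): void {
    this.pending = snapshot;
  }

  // Window CLOSE: build at most one outcome per opened window.
  close(reason: CloseReason, now: number): TurnOutcome | null {
    const snap = this.pending;
    if (snap === null) return null;
    this.pending = null; // a second close for the same turn yields nothing
    return {
      session_id: snap.sessionId,
      last_sequence_number: snap.lastSequenceNumber,
      p_finished_speaking: snap.pFinishedSpeaking,
      trigger: "auto",
      user_resumed: reason === "user_resumed",
      response_started: reason === "response_started",
      submitted_at: new Date(snap.submittedAt).toISOString(),
      window_closed_at: new Date(now).toISOString(),
    };
  }
}

// Stand-in for the shared API client; assumed to return a promise.
type Poster = (path: string, body: TurnOutcome) => Promise<unknown>;

function postTurnOutcome(post: Poster, outcome: TurnOutcome): void {
  try {
    // Not awaited and not retried; rejections are swallowed.
    post("/turn-outcome", outcome).catch(() => undefined);
  } catch {
    // A synchronous throw from the client is dropped as well.
  }
}
```

Injecting the poster keeps the sketch testable without a network; a caller would pass whatever wraps the real client.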

Founder-approved v1 decisions

  • Resume detection: captured when interruptions are ON (the default). When OFF, a resume during piThinking isn't captured — an accepted optimism bias for v1.
  • last_speech_ended_at: uses timeUserStoppedSpeaking (same clock as submitted_at, so latency subtraction is skew-free).
  • Maintenance messages excluded — suppressed buffer flushes aren't real endpointing decisions.
  • last_sequence_number: highest sequence number assigned in the turn, paired with its pFinishedSpeaking from the same /transcribe response.
  • Window close = first response signal (piWriting/piSpeaking). With TTS on, this closes when text appears rather than at the spec's preferred TTS-start; spec fidelity is a documented follow-up (#510, "Endpointing eval: close the resume window at TTS-start, not first response signal").
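
Two of the decisions above, the same-clock timestamps and the pairing of the highest sequence number with its own score, can be illustrated with a short sketch. The helper names (`pickLastSequence`, `toTimingFields`, `endpointingLatencyMs`) and the `TranscribeResult` shape are assumptions made for illustration; only `last_speech_ended_at`, `submitted_at`, `timeUserStoppedSpeaking` and `pFinishedSpeaking` come from the description.

```typescript
// Hypothetical helpers; the real module may structure this differently.
interface TranscribeResult {
  sequenceNumber: number;
  pFinishedSpeaking?: number; // score returned with this sequence number
}

interface TimingFields {
  last_speech_ended_at: string; // ISO 8601
  submitted_at: string; // ISO 8601
}

// Pick the highest sequence number in the turn, keeping the score that
// arrived in the same response rather than the most recently seen score.
function pickLastSequence(results: TranscribeResult[]): TranscribeResult | undefined {
  return results.reduce<TranscribeResult | undefined>(
    (best, r) => (best === undefined || r.sequenceNumber > best.sequenceNumber ? r : best),
    undefined,
  );
}

// Both inputs are epoch milliseconds read from the same client clock.
function toTimingFields(timeUserStoppedSpeaking: number, submittedAt: number): TimingFields {
  return {
    last_speech_ended_at: new Date(timeUserStoppedSpeaking).toISOString(),
    submitted_at: new Date(submittedAt).toISOString(),
  };
}

// Because both timestamps share a clock, the difference carries no
// client/server skew.
function endpointingLatencyMs(fields: TimingFields): number {
  return Date.parse(fields.submitted_at) - Date.parse(fields.last_speech_ended_at);
}
```

For example, a turn whose speech ended at 1000 ms and was submitted at 1850 ms yields a latency of 850 ms regardless of how far the client clock is from the server's.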

Correctness note found in review

Fixed a latent stale-snapshot re-emit: piThinking has a second, non-submit entry (the saypi:piThinking handler in listening, currently emitter-less but wired). Entering piThinking that way now clears pendingTurnOutcome, so a prior turn's snapshot can't be re-emitted as a spurious auto event. Covered by a regression test.
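
The shape of that fix can be sketched as a pair of pure functions. This is an assumption-laden illustration, not the machine's actual actions: `onEnterPiThinking`, `onLeavePiThinking` and the `"external"` entry label are hypothetical, and the description does not say which path leaves a snapshot behind, so the example simply starts from a context that still holds one.

```typescript
// Hypothetical model of the context field named in the description.
interface PendingTurnOutcome {
  sessionId: string;
  lastSequenceNumber: number;
}

interface Ctx {
  pendingTurnOutcome: PendingTurnOutcome | null;
}

// "submit" is the auto-submit path; "external" stands in for the
// saypi:piThinking handler in listening.
type PiThinkingEntry = "submit" | "external";

// Entering piThinking without a fresh submit discards any leftover snapshot.
function onEnterPiThinking(ctx: Ctx, entry: PiThinkingEntry): Ctx {
  return entry === "submit" ? ctx : { ...ctx, pendingTurnOutcome: null };
}

// Leaving piThinking emits whatever snapshot is pending and clears it.
function onLeavePiThinking(ctx: Ctx): { ctx: Ctx; emitted: PendingTurnOutcome | null } {
  return { ctx: { ...ctx, pendingTurnOutcome: null }, emitted: ctx.pendingTurnOutcome };
}
```

With the clear in place, a leftover snapshot followed by an external entry produces no event when piThinking is left, while the submit path still emits exactly once.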

Known v1 limitations (non-blocking)

  • A VAD onset later cancelled as non-speech (userInterrupting → piSpeaking on hasNoAudio) still emits user_resumed:true — slight over-count, defensible under the spec's "VAD detect the user starting to speak" wording.
  • The resume wiring is proven at the unit layer by direct event injection; it rides the already-proven barge-in path (saypi:userSpeaking → userInterrupting during responding) rather than being re-verified end-to-end on a live host.

Tests

  • test/TurnOutcomeModule.spec.ts — POST body/URL, ISO timestamps, privacy (no text), skip-without-correlation-key.
  • test/state-machines/ConversationMachine-turnOutcome.spec.ts — response-started (piWriting & piSpeaking), resume/false-finish, timeout, maintenance exclusion, stale-snapshot regression.
  • Full suite green (tsc + Jest + Vitest): 207 files, 1804 passed, 1 skipped.

🤖 Generated with Claude Code

…505)

Report a text-free turn-outcome to the SayPi API after every endpointing-driven
auto-submit, so the server can score how good `pFinishedSpeaking` is at deciding
when a user's turn ended. The only non-circular ground truth for "did the user
actually finish?" is whether they started speaking again inside the resume
window (submit -> assistant response starts); a resume means we cut them off.

The whole window already lives in ConversationMachine, so this rides existing
transitions rather than inventing detection:

- Window OPEN: `converting.submitting` snapshots the correlation data
  (session id, last sequence number + its pFinishedSpeaking, speech-offset, app)
  into context. Only the endpointing auto-submit path reaches this state, so
  every emit is `trigger:"auto"` (manual sends bypass the machine entirely).
- Window CLOSE: the single transition that leaves `responding.piThinking` —
  piWriting/piSpeaking (response started), userInterrupting (user resumed first,
  a false finish), or the PI_THINKING_TIMEOUT_MS fallback (response never
  started). piThinking is entered/left once per turn, so exactly one event fires.
- Maintenance messages (suppressed buffer flushes) are excluded — they aren't
  real endpointing decisions.

The POST itself is a thin, fire-and-forget `postTurnOutcome()` in a new module,
reusing the shared `callApi` client (JWT/anon + CSP-safe background routing).
Never awaited, never retried, never throws to the UX; a dropped event just
leaves that turn to the server-side fallback. Payload is booleans, timestamps,
a sequence number and an optional score only — no transcript text.

See saypi-api PR #310 (endpoint) and PR #309 (design/rationale).

Co-Authored-By: Claude Opus 4.8 <[email protected]>
Claude-Session: https://claude.ai/code/session_013L2TkKx2c5z9gcuCRpMq5d
rosscado merged commit 24af590 into main on Jul 5, 2026
2 checks passed
rosscado deleted the feat/turn-outcome-events-505 branch on July 5, 2026 18:58
Linked issue: Emit turn-outcome events for endpointing eval (POST /turn-outcome)