Add streaming TTS daemon mode (intercept Claude output via hooks) #1
Open
yncat wants to merge 1 commit into
Extend the one-shot notifier into a resident daemon that speaks Claude's actual output during an interactive session, instead of relying on the terminal (which a screen reader reads noisily as the spinner/stream redraws).

- speech_queue: dedicated worker thread owning one ISpVoice, async speak (SPF_ASYNC), barge-in flush (SPF_PURGEBEFORESPEAK), UTF-8 -> UTF-16 fix
- daemon: cpp-httplib server, POST /hook + GET /health, single-instance bind
- event_dispatch: MessageDisplay (response + intermediate delta), PreToolUse (tool/command/file announcements), PostToolUse, Notification/Stop, UserPromptSubmit (flush); turn_id change also flushes
- text_shaper: markdown/code-fence cleanup for speech
- config: %APPDATA% config.json toggles (per-event, port, rate, markdown)
- main: --daemon / --ensure-daemon (detached launch) / legacy stdin one-shot
- tts_speaker: fix UTF-8 conversion so Japanese is spoken correctly
- Makefile: /utf-8 + ws2_32.lib + new objects; vendored src/httplib.h
- README (en/ja) + .claude.md: daemon mode, HTTP hooks, WSL networking

Hooks are wired with type:"http" so no process is spawned per event, handling the high-frequency MessageDisplay stream smoothly. Each event is individually on/off via settings.json and config.json.

Co-Authored-By: Claude Opus 4.8 (1M context) <[email protected]>
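For reference, the UTF-8 -> UTF-16 fix listed above can be illustrated with a small portable decoder. This is only a sketch under stated assumptions: the actual diff is not shown here, the function name utf8_to_utf16 is hypothetical, and on Windows the same conversion is usually done with MultiByteToWideChar and CP_UTF8 rather than by hand.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not taken from the PR): decode UTF-8 bytes into
// UTF-16 code units. Malformed input is replaced with U+FFFD.
std::u16string utf8_to_utf16(const std::string& in) {
    std::u16string out;
    std::size_t i = 0;
    while (i < in.size()) {
        const unsigned char lead = static_cast<unsigned char>(in[i]);
        std::size_t extra = 0;  // continuation bytes expected after the lead byte
        char32_t cp = 0;
        if (lead < 0x80)                { cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { cp = lead & 0x1F; extra = 1; }
        else if ((lead & 0xF0) == 0xE0) { cp = lead & 0x0F; extra = 2; }
        else if ((lead & 0xF8) == 0xF0) { cp = lead & 0x07; extra = 3; }
        else {
            // Stray continuation byte or invalid lead byte.
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;
            continue;
        }

        // Fold in the continuation bytes, checking each has the 10xxxxxx form.
        bool ok = (i + extra < in.size());
        for (std::size_t k = 1; ok && k <= extra; ++k) {
            const unsigned char c = static_cast<unsigned char>(in[i + k]);
            if ((c & 0xC0) != 0x80) ok = false;
            else cp = (cp << 6) | (c & 0x3F);
        }

        // Reject truncated, overlong, surrogate, and out-of-range sequences.
        static const char32_t min_cp[] = {0x0, 0x80, 0x800, 0x10000};
        if (!ok || cp < min_cp[extra] || cp > 0x10FFFF ||
            (cp >= 0xD800 && cp <= 0xDFFF)) {
            out.push_back(static_cast<char16_t>(0xFFFD));
            ++i;
            continue;
        }
        i += extra + 1;

        if (cp < 0x10000) {
            out.push_back(static_cast<char16_t>(cp));
        } else {
            // Code points above the BMP become a surrogate pair.
            cp -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
        }
    }
    return out;
}
```

ISpVoice::Speak takes a wide (UTF-16) string, so hook payloads that arrive as UTF-8 JSON have to be converted like this before they are spoken. Treating the bytes as an ANSI code page instead is one plausible way for Japanese text to come out garbled.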
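Background and purpose
The current notifier is a one-shot exe that announces "waiting" and "stopped" from the Notification/Stop hooks. However, every time Claude redraws the spinner or streaming output in the terminal, the screen reader also reads out content that is not needed. claude -p (headless) is a concern because the session cannot be continued and it is billed separately, so this PR stays in the interactive session, intercepts Claude's output, and feeds it to SAPI. An investigation (2026-06, checked against the official docs) confirmed that the following are available:
- MessageDisplay hook: streams response text and intermediate text as delta (turn_id / message_id / index / final)
- PreToolUse: tool_name / tool_input (command, file path, etc.)
- type:"http" hooks: POST the event JSON to a local HTTP server (no process is spawned)
Approach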
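Extend the one-shot exe into a resident daemon plus HTTP hooks.
- MessageDisplay: response and intermediate text (delta)
- PreToolUse: tool, command, and file announcements
- PostToolUse
- Notification/Stop
- UserPromptSubmit: flush
Main changes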
- New: src/speech_queue.* (dedicated worker thread + asynchronous SAPI + barge-in), src/daemon.* (cpp-httplib), src/event_dispatch.*, src/text_shaper.*, src/config.*, src/httplib.h (vendored, MIT)
- Modified: src/main.cpp (--daemon / --ensure-daemon / legacy one-shot), src/tts_speaker.cpp (UTF-8 conversion bug fix), src/json_parser.*, Makefile (/utf-8 + ws2_32.lib), README (en/ja), .claude.md
Design decisions (confirmed with the user)
- … displayContent not used) / flush on a new turn or new input
Verification (requires Windows)
- nmake all
- claudecode-notifier.exe --daemon, then curl http://127.0.0.1:8765/health returns ok
- POST a MessageDisplay event to /hook and confirm it is spoken; send a different turn_id and confirm the flush
Notes
- … set host in config.json to 0.0.0.0 and point the hook URL at the Windows host IP (steps are in the README).
- Stop tends to overlap with MessageDisplay; if that gets in the way, set speak_stop: false.
🤖 Generated with Claude Code
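The flush rule from the design decisions (flush on a new turn or new input) could look roughly like the sketch below. It is not the code in src/event_dispatch.*: HookEvent, Action, and Dispatcher are hypothetical names, only the event names, turn_id, and delta come from this PR, and skipping the flush on the very first event is an assumption.

```cpp
#include <string>

// Hypothetical, simplified view of one hook event. Only the event names,
// `turn_id`, and the MessageDisplay delta come from the PR description.
struct HookEvent {
    std::string name;     // e.g. "MessageDisplay", "UserPromptSubmit"
    std::string turn_id;  // empty if the event carries none
    std::string delta;    // text to speak (MessageDisplay only)
};

// What the dispatcher asks the speech queue to do for one event.
struct Action {
    bool flush = false;   // drop queued speech first (barge-in)
    std::string speak;    // empty means "say nothing"
};

class Dispatcher {
public:
    Action handle(const HookEvent& ev) {
        Action act;

        // A turn_id that differs from the previous one means a new turn,
        // so stale speech is flushed. Assumption: the first event ever
        // seen has nothing to flush.
        if (!ev.turn_id.empty()) {
            if (!last_turn_id_.empty() && ev.turn_id != last_turn_id_) {
                act.flush = true;
            }
            last_turn_id_ = ev.turn_id;
        }

        if (ev.name == "UserPromptSubmit") {
            act.flush = true;        // new user input always barges in
        } else if (ev.name == "MessageDisplay") {
            act.speak = ev.delta;    // speak the streamed chunk
        }
        return act;
    }

private:
    std::string last_turn_id_;
};
```

Keeping the decision in a plain function of (previous turn_id, event) makes it checkable without SAPI or a running daemon. The daemon would then apply the returned Action to the speech queue, flushing first and speaking afterwards.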