quicker-voice-runtime — local ASR server for QuickerAgent

Fork-friendly Python runtime implementing the quicker-voice-v1 protocol (docs/voice-input-plugin.md).

Inspired by CapsWriter-Offline (client/server architecture with offline ASR), but uses a simpler JSON + binary PCM WebSocket contract for integration with the QuickerAgent Composer.

Quick start

cd voice-asr-runtime
uv sync
uv run download-asr-model   # first time: ~160 MB SenseVoice int8 (ITN/punctuation)
uv run quicker-voice-runtime

HTTP health: http://127.0.0.1:6016/health
WebSocket: ws://127.0.0.1:6016 (subprotocol quicker-voice-v1)
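
To confirm the server is up before pointing a client at it, a small probe along these lines may help. This is a minimal sketch, not part of the runtime: the helper names (`health_url`, `ws_url`, `is_healthy`) are hypothetical, and the only thing assumed about `/health` is that it answers with HTTP 200 when the server is running.

```python
import urllib.request

DEFAULT_HOST = "127.0.0.1"
DEFAULT_PORT = 6016
SUBPROTOCOL = "quicker-voice-v1"  # WebSocket subprotocol named in this README


def health_url(host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> str:
    """Build the HTTP health URL for a given bind address."""
    return f"http://{host}:{port}/health"


def ws_url(host: str = DEFAULT_HOST, port: int = DEFAULT_PORT) -> str:
    """Build the WebSocket URL (HTTP and WS share one port)."""
    return f"ws://{host}:{port}"


def is_healthy(url: str, timeout: float = 2.0) -> bool:
    """Return True on HTTP 200, False on any network or HTTP error.

    Assumption: a 200 status is treated as healthy; the response body
    is not inspected because its format is not described here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:  # URLError, HTTPError and timeouts all derive from OSError
        return False


if __name__ == "__main__":
    print(health_url(), "->", "up" if is_healthy(health_url()) else "down")
```

A WebSocket client then connects to `ws_url()` and requests the `quicker-voice-v1` subprotocol; with the third-party `websockets` package that would be the `subprotocols=[SUBPROTOCOL]` argument to `connect`.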

From agent-gui:

pnpm voice:dev-server

Then, in QuickerAgent settings, disable the mock option and hold the Composer microphone button to record.

Backends

Backend	When	Output
stub	No model files	`[stub] 收到约 Xs 音频` ("received about X s of audio"); for protocol/UI testing
sherpa-onnx	`models/sensevoice/` + `uv sync`	Real offline ASR (SenseVoice, ITN/punctuation)
sherpa-onnx (fallback)	`models/paraformer-zh/`	Paraformer zh-small, no auto punctuation

See models/README.md for model layout.
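
The table above implies a preference order: SenseVoice if its directory exists, Paraformer as a fallback, and the stub otherwise. A minimal sketch of that selection follows. The function name `pick_backend` and the choice to test only for directory existence are assumptions for illustration; the actual runtime may check individual model files instead.

```python
from pathlib import Path


def pick_backend(models_root: Path) -> str:
    """Choose a backend label from the directories present under models/.

    Assumption: directory presence alone decides the backend. The real
    runtime may apply stricter checks (for example, required files).
    """
    if (models_root / "sensevoice").is_dir():
        return "sherpa-onnx (sensevoice)"
    if (models_root / "paraformer-zh").is_dir():
        return "sherpa-onnx (paraformer)"
    return "stub"


if __name__ == "__main__":
    print(pick_backend(Path("models")))
```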

Environment

Variable	Default	Description
`QUICKER_VOICE_HOST`	`127.0.0.1`	Bind address
`QUICKER_VOICE_PORT`	`6016`	HTTP + WS port
`QUICKER_VOICE_MODEL_DIR`	auto	Sherpa model directory
`QUICKER_VOICE_MODEL_TYPE`	auto	`sensevoice` / `paraformer` / `whisper`
`QUICKER_VOICE_PROVIDER`	`cpu`	ONNX provider: `cpu` / `directml` (Windows GPU) / `cuda` / `coreml`
`QUICKER_VOICE_NUM_THREADS`	`4`	CPU thread count when provider is `cpu`
`QUICKER_VOICE_LOG_LEVEL`	`INFO`	Logging
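
For scripts that launch or wrap the runtime, the variables above can be read with their documented defaults roughly as follows. This is an illustrative sketch: `VoiceConfig` and `load_config` are hypothetical names, and representing "auto" as `None` is an assumption, not necessarily how the runtime does it.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class VoiceConfig:
    host: str
    port: int
    model_dir: Optional[str]   # None stands for "auto" (assumption)
    model_type: Optional[str]  # None stands for "auto" (assumption)
    provider: str
    num_threads: int
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> VoiceConfig:
    """Read QUICKER_VOICE_* variables, falling back to the documented defaults."""
    return VoiceConfig(
        host=env.get("QUICKER_VOICE_HOST", "127.0.0.1"),
        port=int(env.get("QUICKER_VOICE_PORT", "6016")),
        model_dir=env.get("QUICKER_VOICE_MODEL_DIR") or None,
        model_type=env.get("QUICKER_VOICE_MODEL_TYPE") or None,
        provider=env.get("QUICKER_VOICE_PROVIDER", "cpu"),
        num_threads=int(env.get("QUICKER_VOICE_NUM_THREADS", "4")),
        log_level=env.get("QUICKER_VOICE_LOG_LEVEL", "INFO"),
    )


if __name__ == "__main__":
    print(load_config())
```

Passing a plain dict instead of `os.environ` keeps the loader easy to test, for example `load_config({"QUICKER_VOICE_PROVIDER": "cuda"})`.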

Release automation

Trigger	What runs
Git tag `v*.*.*` push	`.github/workflows/release.yml` — runtime zip + channel manifest → GitHub Release + Bitiful
Git tag `model-sensevoice` push	Same workflow — model zip → GitHub Release + Bitiful
Local one-shot	`publish/Publish-VoiceAsrRelease.ps1` — optional `-UploadBitiful` for manual mirror sync
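
The tag-to-artifact mapping in the table can be sketched as a small classifier. This is only an illustration of the rule, not the workflow's actual logic: `classify_tag` is a hypothetical helper, and it assumes runtime tags are three-part numeric versions such as `v0.1.1`.

```python
import re
from typing import Optional, Tuple

# Assumption: runtime tags are "v" followed by MAJOR.MINOR.PATCH digits.
RUNTIME_TAG = re.compile(r"^v(\d+\.\d+\.\d+)$")
MODEL_TAG = "model-sensevoice"  # fixed tag name used for model releases


def classify_tag(tag: str) -> Optional[Tuple[str, Optional[str]]]:
    """Return ("runtime", version), ("model", None), or None for other tags."""
    match = RUNTIME_TAG.match(tag)
    if match:
        return ("runtime", match.group(1))
    if tag == MODEL_TAG:
        return ("model", None)
    return None


if __name__ == "__main__":
    for tag in ("v0.1.1", "model-sensevoice", "nightly"):
        print(tag, "->", classify_tag(tag))
```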

# Runtime only (typical)
pwsh ./publish/Publish-VoiceAsrRelease.ps1 -SkipBuild -UploadBitiful -UpdateChannelJson

# First-time or model update
pwsh ./publish/Publish-VoiceAsrRelease.ps1 -SkipBuild -PublishModel -UploadBitiful -UpdateChannelJson

Release layout

Asset	GitHub tag	Filename
Runtime	`v0.1.1`	`voice-asr-runtime-0.1.1-win-x64.zip`
Model	`model-sensevoice` (fixed)	`voice-asr-model-sensevoice.zip`

# CI: push runtime tag
git tag v0.1.1 && git push origin v0.1.1

# CI: push model (only when model files change)
git tag -f model-sensevoice && git push origin model-sensevoice --force

# Local full pipeline (monorepo root)
pwsh ./publish/Publish-VoiceAsrRelease.ps1 -SkipBuild -UploadBitiful -UpdateChannelJson

# voice-asr-runtime repo only
pwsh -NoProfile -File ./publish/Publish-VoiceAsrRelease.ps1 -SkipBuild -UploadBitiful

# Bitiful only (after GitHub release exists)
pwsh -NoProfile -File ./publish/Upload-VoiceAsrToBitiful.ps1 -Version 0.1.0 -UseLocalVoiceRoot

Bitiful upload runs automatically in GitHub Actions on every release tag. Configure repo Secrets: BITIFUL_ACCESS_KEY, BITIFUL_SECRET_KEY, BITIFUL_BUCKET_NAME; optional Variables: BITIFUL_ENDPOINT_URL, BITIFUL_VOICE_ASR_OBJECT_PREFIX. Local fallback: publish/Upload-VoiceAsrToBitiful.ps1 (see publish/.env.example).

Domestic mirror (Bitiful) — same bucket layout as QuickerAgent:

Asset	URL pattern
Runtime zip	`https://s3.bitiful.net/quicker-pkgs/quicker-rpc/voice-asr/voice-asr-runtime-<ver>-win-x64.zip`
Model zip	`https://s3.bitiful.net/quicker-pkgs/quicker-rpc/voice-asr/voice-asr-model-sensevoice.zip`
version.txt	`https://s3.bitiful.net/quicker-pkgs/quicker-rpc/voice-asr/version.txt`
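
The filenames and mirror URLs follow a fixed pattern, so they can be derived from the version string. A minimal sketch, with hypothetical helper names (`runtime_zip_name`, `mirror_url`) and the base URL copied from the table above:

```python
MIRROR_BASE = "https://s3.bitiful.net/quicker-pkgs/quicker-rpc/voice-asr"
MODEL_ZIP = "voice-asr-model-sensevoice.zip"  # fixed name, not versioned


def runtime_zip_name(version: str) -> str:
    """Runtime asset filename for a version such as "0.1.1" (no leading "v")."""
    return f"voice-asr-runtime-{version}-win-x64.zip"


def mirror_url(filename: str) -> str:
    """Bitiful mirror URL for an asset filename."""
    return f"{MIRROR_BASE}/{filename}"


if __name__ == "__main__":
    print(mirror_url(runtime_zip_name("0.1.1")))
    print(mirror_url(MODEL_ZIP))
    print(mirror_url("version.txt"))
```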

The Tauri one-click install tries *MirrorUrl first (Bitiful), then the GitHub release; it verifies *Sha256 when that field is set in voice-plugin-channel.json.
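
The mirror-first, checksum-verified download can be sketched as below. This is not the Tauri installer's code (which is not shown in this README); `fetch_first_valid` and `sha256_hex` are hypothetical names, the fetcher is injected so the logic can be exercised without network access, and treating a checksum mismatch as a reason to try the next source is an assumption.

```python
import hashlib
from typing import Callable, List, Optional, Sequence


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def fetch_first_valid(
    urls: Sequence[str],
    fetch: Callable[[str], bytes],
    expected_sha256: Optional[str] = None,
) -> bytes:
    """Try each URL in order (mirror first, then GitHub) and return the
    first payload that downloads and, if a hash is given, matches it.

    Assumption: a hash mismatch falls through to the next source rather
    than aborting immediately.
    """
    errors: List[str] = []
    for url in urls:
        try:
            data = fetch(url)
        except OSError as exc:
            errors.append(f"{url}: {exc}")
            continue
        if expected_sha256 and sha256_hex(data) != expected_sha256.lower():
            errors.append(f"{url}: sha256 mismatch")
            continue
        return data
    raise RuntimeError("all sources failed: " + "; ".join(errors))
```

In real use, `fetch` could wrap `urllib.request.urlopen(url).read()`, and `urls` would be the mirror URL followed by the GitHub release URL.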

Packaging (Windows)

pwsh -NoProfile -File ./scripts/build-win.ps1
pwsh -NoProfile -File ./scripts/package-release.ps1
# -> publish/voice-asr-runtime-<ver>-win-x64.zip
# -> publish/voice-asr-model-sensevoice.zip

User install (Tauri): Settings → Local voice input → One-click install (设置 → 本地语音输入 → 一键安装 in the Chinese UI).

Development without network access: when they are present, the Tauri install copies from voice-asr-runtime/dist/ and models/sensevoice/.

Installed layout:

Documents/QuickerAgent/plugins/voice-asr/
  manifest.json
  settings.json
  runtime/quicker-voice-runtime.exe
  runtime/_internal/...
  models/sensevoice/tokens.txt
  models/sensevoice/model.int8.onnx
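
When debugging a failed install, it can help to check the layout above programmatically. A minimal sketch, assuming every file named in the listing is required (the `runtime/_internal/...` entry is elided in the listing, so it is skipped here); `missing_files` is a hypothetical helper:

```python
from pathlib import Path
from typing import List

# Relative paths taken from the installed layout listing.
# Assumption: all of them must exist for a usable install.
REQUIRED_FILES = (
    "manifest.json",
    "settings.json",
    "runtime/quicker-voice-runtime.exe",
    "models/sensevoice/tokens.txt",
    "models/sensevoice/model.int8.onnx",
)


def missing_files(plugin_root: Path) -> List[str]:
    """Return the required relative paths that are absent under plugin_root."""
    return [rel for rel in REQUIRED_FILES if not (plugin_root / rel).is_file()]


if __name__ == "__main__":
    root = Path.home() / "Documents" / "QuickerAgent" / "plugins" / "voice-asr"
    print(missing_files(root) or "layout looks complete")
```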

Fork as standalone repo

This directory is designed to be split out:

cd voice-asr-runtime
git init
git add .
git commit -m "chore: initial quicker-voice-runtime fork"

QuickerAgent consumes it via:

dev: pnpm voice:dev-server → uv run quicker-voice-runtime
release (planned): Tauri copies/spawns packaged quicker-voice-runtime.exe under Documents/QuickerAgent/plugins/voice-asr/

Protocol

Host ↔ Runtime messages are documented in docs/voice-input-plugin.md.

Client reference: agent-gui/lib/voice-input/voice-input-ws-client.ts.

License

MIT — see LICENSE.
