Whisper Key - Local Speech-to-Text

Global hotkeys to record speech and transcribe directly to your cursor.

Questions or ideas? Discord

✨ Features

Global Hotkey: Start recording speech from any app
Auto-Paste: Transcribe directly to cursor
Auto-Send: Optionally auto-send with ENTER keypress
Local/Offline: Voice data never leaves your computer
CPU Ready: Small, efficient models available
GPU Ready: Support for both NVIDIA & AMD cards
Cross-platform: Works on Windows and macOS
Voice Commands: Trigger shortcuts, text snippets, and shell commands by voice — docs
Configurable: Customize hotkeys, models, and much more

🚀 Quick Start

From PyPI (Recommended)

Requires Python 3.11-3.13

# With pipx (isolated environment)
pipx install whisper-key-local

# Or with pip
pip install whisper-key-local

Then run: whisper-key (or wk for short)

Windows App

Download whisper-key.exe from the latest release
Run whisper-key.exe

From Source

git clone https://github.com/PinW/whisper-key-local.git
cd whisper-key-local
pip install -e .
python whisper-key.py

🎤 Basic Usage

Hotkey	Windows	macOS
Start recording	`Ctrl+Win`	`Fn+Ctrl`
Stop & transcribe	`Ctrl`	`Fn`
Stop & auto-send	`Alt`	`Option`
Cancel recording	`Esc`	`Shift`
Voice command mode	`Alt+Win`	`Fn+Command`

Open the system tray / menu bar icon to:

Toggle auto-paste vs clipboard-only
Change transcription model
Select audio device

🗣️ Voice Commands

Speak trigger phrases to run shell commands and more. Define in:

Windows: %APPDATA%\whisperkey\commands.yaml
macOS: ~/.whisperkey/commands.yaml

commands:
  # Send a keyboard shortcut
  - trigger: "undo"
    hotkey: "ctrl+z"
  # Deliver pre-written text
  - trigger: "my email"
    type: "[email protected]"
  # Run a shell command
  - trigger: "open notepad"
    run: 'notepad.exe'

See the Voice Commands Guide for full details.

⚡ GPU Acceleration

Whisper Key detects your GPU on first launch and offers one-press install of the required runtime libraries. Supports NVIDIA (CUDA) and AMD (ROCm).

For manual setup or troubleshooting, see the GPU Setup Guide.

⚙️ Configuration

Local settings at:

Windows: %APPDATA%\whisperkey\user_settings.yaml
macOS: ~/.whisperkey/user_settings.yaml

Delete this file and restart app to reset to defaults.

Option	Default	Notes
Whisper
`whisper.model`	`tiny`	Any model defined in `whisper.models`
`whisper.device`	`cpu`	cpu or cuda (NVIDIA/AMD GPU) — setup guide
`whisper.compute_type`	`int8`	int8/float16/float32
`whisper.language`	`auto`	auto or language code (en, es, fr, etc.)
`whisper.beam_size`	`5`	Higher = more accurate but slower (1-10)
`whisper.initial_prompt`	`""`	Guide transcription style, language variant, or script
`whisper.hotwords`	`[]`	Words the model should favor (names, technical terms)
`whisper.models`	(see config)	Add custom HuggingFace or local models
Post-Processing
`post_processing.strip_trailing_period`	`false`	Strip trailing period from output
`post_processing.corrections`	`{}`	Fix recurring misheard words, e.g. `CAPEX: [cap x]`
Hotkeys
`hotkey.recording_hotkey`	`ctrl+win` / `fn+ctrl`	Windows / macOS
`hotkey.stop_key`	`ctrl` / `fn`	Stop recording
`hotkey.auto_send_key`	`alt` / `option`	Stop + paste + Enter
`hotkey.cancel_combination`	`esc` / `shift`	Cancel recording
`hotkey.recording_mode`	`toggle`	toggle or push_to_talk
`hotkey.command_hotkey`	`alt+win` / `fn+command`	Voice command mode
Voice Activity Detection
`vad.vad_precheck_enabled`	`true`	Prevent hallucinations on silence
`vad.vad_onset_threshold`	`0.7`	Speech detection start (0.0-1.0)
`vad.vad_offset_threshold`	`0.55`	Speech detection end (0.0-1.0)
`vad.vad_min_speech_duration`	`0.1`	Min speech segment (seconds)
`vad.vad_realtime_enabled`	`true`	Auto-stop on silence
`vad.vad_silence_timeout_seconds`	`30.0`	Seconds before auto-stop
Audio
`audio.host`	`null`	Audio API (WASAPI, Core Audio, etc.)
`audio.channels`	`1`	1 = mono, 2 = stereo
`audio.dtype`	`float32`	float32/int16/int24/int32
`audio.max_duration`	`900`	Max recording seconds (0 = unlimited)
`audio.input_device`	`default`	Device ID or "default"
Clipboard
`clipboard.auto_paste`	`true`	false = clipboard only
`clipboard.delivery_method`	`paste`	paste (Ctrl+V) or type (direct injection)
`clipboard.paste_hotkey`	`ctrl+v` / `cmd+v`	Paste key simulation
`clipboard.paste_pre_paste_delay`	`0.05`	Delay after copy, before paste hotkey (seconds)
`clipboard.paste_preserve_clipboard`	`true`	Restore clipboard after paste
`clipboard.paste_clipboard_restore_delay`	`0.5`	Delay before clipboard restore (seconds)
`clipboard.type_also_copy_to_clipboard`	`false`	Also copy to clipboard in type mode
`clipboard.type_auto_enter_delay`	`0.15`	Delay before ENTER after typing (seconds)
`clipboard.type_auto_enter_delay_per_100_chars`	`0.1`	Extra ENTER delay per 100 typed chars (seconds)
Logging
`logging.level`	`INFO`	DEBUG/INFO/WARNING/ERROR/CRITICAL
`logging.file.enabled`	`true`	Write to app.log
`logging.log_transcriptions`	`false`	Include transcribed text in log (privacy)
`logging.console.enabled`	`true`	Print to console
`logging.console.level`	`WARNING`	Console verbosity
Audio Feedback
`audio_feedback.enabled`	`true`	Play sounds on record/stop
`audio_feedback.transcription_complete_enabled`	`false`	Play sound on transcription complete
`audio_feedback.ready_enabled`	`true`	Play sound when app finishes loading
`audio_feedback.start_sound`	`assets/sounds/...`	Custom sound file path
`audio_feedback.stop_sound`	`assets/sounds/...`	Custom sound file path
`audio_feedback.cancel_sound`	`assets/sounds/...`	Custom sound file path
`audio_feedback.transcription_complete_sound`	`assets/sounds/...`	Custom sound file path
`audio_feedback.ready_sound`	`assets/sounds/...`	Custom sound file path
System Tray
`system_tray.enabled`	`true`	Show tray icon
`system_tray.tooltip`	`Whisper Key`	Hover text
Terminal Title
`terminal_title.idle`	`""`	Tab title prefix when idle: static string or `[prefix, seconds]` animation frames
`terminal_title.recording`	🔴 blink	Tab title prefix while recording
`terminal_title.processing`	`""`	Tab title prefix while transcribing
Console
`console.start_hidden`	`false`	Hide console after startup (whisper-key-hideable.exe only)
Update
`update.mode`	`prompt`	prompt or auto
Voice Commands
`voice_commands.enabled`	`true`	Enable voice command mode

📁 Model Cache

Default path for transcription models (via HuggingFace):

Windows: %USERPROFILE%\.cache\huggingface\hub\
macOS: ~/.cache/huggingface/hub/

Contributing

Check the roadmap for planned features and see CONTRIBUTING.md for guidelines. Please open an issue before starting work on new features.

📦 Dependencies

Cross-platform: faster-whisper · numpy · sounddevice · soxr · pyperclip · ruamel.yaml · pystray · Pillow · playsound3 · ten-vad · hf-xet

Windows: global-hotkeys · pywin32

macOS: pyobjc-framework-Quartz · pyobjc-framework-ApplicationServices

Name		Name	Last commit message	Last commit date
Latest commit History 472 Commits
.claude		.claude
.github/workflows		.github/workflows
docs		docs
pyapp-build		pyapp-build
src/whisper_key		src/whisper_key
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
whisper-key.py		whisper-key.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Whisper Key - Local Speech-to-Text

✨ Features

🚀 Quick Start

From PyPI (Recommended)

Windows App

From Source

🎤 Basic Usage

🗣️ Voice Commands

⚡ GPU Acceleration

⚙️ Configuration

📁 Model Cache

Contributing

📦 Dependencies

About

Uh oh!

Releases 16

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Whisper Key - Local Speech-to-Text

✨ Features

🚀 Quick Start

From PyPI (Recommended)

Windows App

From Source

🎤 Basic Usage

🗣️ Voice Commands

⚡ GPU Acceleration

⚙️ Configuration

📁 Model Cache

Contributing

📦 Dependencies

About

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases 16

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages