wspr is a push-to-talk voice dictation tool for Linux/X11. You hold a hotkey
(default Super+Space), speak, and release. The audio is transcribed locally with
faster-whisper and typed into whatever window has focus via xdotool.
No cloud, no daemon framework, just a single long-running Python script started
by your graphical session.
- Python 3.11+ (3.14 recommended; uses the stdlib tomllib)
- An X11 session
- xdotool, for typing into the focused window
- A working microphone (PortAudio, pulled in by sounddevice)
./install.sh installs wspr for the current user and registers an XDG
autostart entry that starts it with your graphical session.
It lays things out like this:
| What | Location |
|---|---|
| App code + private venv | ~/.local/share/wspr/ |
| Launcher executable | ~/.local/bin/wspr |
| Config | ~/.config/wspr/wspr.toml (created only if absent) |
| XDG autostart entry | ~/.config/autostart/wspr.desktop |
The installer is safe to re-run — it upgrades the code, dependencies, and
autostart entry, but never overwrites an existing config. Make sure
~/.local/bin is on your PATH to run wspr directly.
wspr grabs the hotkey globally, so only one instance can run at a time. Stop the running copy before launching a dev copy from the repo (below):

    pkill -f wspr.py    # stop the running instance
    pgrep -af wspr.py   # check whether it's running

To uninstall:

    ./uninstall.sh          # remove app + autostart entry, keep config
    ./uninstall.sh --purge  # also remove ~/.config/wspr

To run directly from a checkout without installing, create a local venv and run the script:

    uv venv .venv
    uv pip install --python .venv/bin/python faster-whisper numpy sounddevice python-xlib
    ./.venv/bin/python wspr.py

On first run the configured model is downloaded to your Hugging Face cache. Then:
- Hold the hotkey (default Super+Space) and speak.
- Release it — wspr transcribes the audio.
- The text is typed into the focused window.
Press Ctrl-C to quit.
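
The last step above (typing the transcript into the focused window) can be sketched as a thin wrapper around the xdotool CLI. This is an illustration, not wspr's actual code: the function names xdotool_type_argv and type_text are hypothetical, and the use of --clearmodifiers is an assumption about what is useful right after a hotkey release.

```python
import subprocess


def xdotool_type_argv(text: str) -> list[str]:
    """Build the argv for typing `text` into the focused window."""
    # --clearmodifiers asks xdotool to ignore modifiers that are still held
    # (assumed helpful here, since the hotkey was just released).
    # "--" ends option parsing, so text starting with "-" is not read as a flag.
    return ["xdotool", "type", "--clearmodifiers", "--", text]


def type_text(text: str) -> None:
    """Type the transcript; do nothing for empty or whitespace-only text."""
    text = text.strip()
    if not text:
        return
    subprocess.run(xdotool_type_argv(text), check=True)
```

Passing the text as a single argv element avoids any shell quoting issues with punctuation in the transcript.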
Settings live in a TOML file. wspr looks for one in this order and uses the first that exists:
| Priority | Location | Purpose |
|---|---|---|
| 1 | $WSPR_CONFIG | Explicit override. Point it at any file: WSPR_CONFIG=~/my.toml ./.venv/bin/python wspr.py. Used as-is even if it doesn't exist (then defaults apply). |
| 2 | ./wspr.toml (next to wspr.py) | The repo default. The common case. |
| 3 | ~/.config/wspr/wspr.toml | Per-user / OS-installed location (XDG). |
| - | (none found) | Built-in defaults are used. wspr never writes a config file. |
Higher priority wins: $WSPR_CONFIG overrides the repo file, which overrides
the XDG file. The search stops at the first match.
wspr.toml ships with these defaults:

    [hotkey]
    # Press-and-hold combo. Modifiers: super, ctrl, alt, shift.
    # Trigger: a function key (f1-f20), a named key (space, enter, tab, esc,
    # backspace), or a single character. Examples: "super+f1", "ctrl+alt+space",
    # "f9".
    combo = "super+space"

    [model]
    size = "small.en"      # tiny.en / base.en / small.en / medium / large-v3
    device = "cpu"         # cpu / cuda
    compute_type = "int8"  # int8 (CPU) / float16 (GPU)

Note: the combo must not already be bound by your desktop environment or window manager. Super+Space in particular is a common default for input-method/layout switching (GNOME, KDE) and app launchers. If something else has already grabbed the key, wspr exits at startup with

    Could not grab super+space: it's already bound.

Free the binding in your DE/WM or pick a different combo.
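
To make the combo syntax concrete, here is a small sketch of how a string such as "ctrl+alt+space" could be split into modifiers and a trigger key. It follows the rules stated in the config comments, but parse_combo and the constant names are hypothetical, and wspr's real parser may differ (for example in how it treats case or a literal "+" trigger).

```python
MODIFIERS = {"super", "ctrl", "alt", "shift"}
NAMED_KEYS = {"space", "enter", "tab", "esc", "backspace"}
FUNCTION_KEYS = {f"f{n}" for n in range(1, 21)}  # f1 through f20


def parse_combo(combo: str) -> tuple[frozenset[str], str]:
    """Split a combo like "super+space" into (modifiers, trigger)."""
    parts = [part.strip().lower() for part in combo.split("+")]
    if any(part == "" for part in parts):
        raise ValueError(f"malformed combo: {combo!r}")

    # Everything before the last "+" is a modifier; the last part is the trigger.
    *mods, trigger = parts
    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s) {unknown} in {combo!r}")
    if trigger in MODIFIERS:
        raise ValueError(f"combo needs a trigger key, got only modifiers: {combo!r}")

    is_valid_trigger = (
        trigger in FUNCTION_KEYS or trigger in NAMED_KEYS or len(trigger) == 1
    )
    if not is_valid_trigger:
        raise ValueError(f"unknown trigger key {trigger!r} in {combo!r}")
    return frozenset(mods), trigger
```

For instance, parse_combo("f9") yields an empty modifier set with trigger "f9", while "super+f21" is rejected because only f1 through f20 are accepted.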
Edit the file and restart wspr — no code changes needed. A larger size
(e.g. medium) improves accuracy at the cost of speed; a smaller one
(base.en, tiny.en) is faster. device = "cuda" with
compute_type = "float16" runs on a GPU.
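
As a rough sketch of how the [model] settings map onto faster-whisper, the snippet below merges the config over assumed defaults and passes the result to WhisperModel. The helper names (model_settings, load_model, transcribe) are hypothetical and not taken from wspr.py; the audio argument is assumed to be a 1-D float32 NumPy array sampled at 16 kHz.

```python
DEFAULT_MODEL = {"size": "small.en", "device": "cpu", "compute_type": "int8"}


def model_settings(config: dict) -> dict:
    """Merge the [model] table from the config over the assumed defaults."""
    return {**DEFAULT_MODEL, **config.get("model", {})}


def load_model(settings: dict):
    """Create the faster-whisper model (downloads it on first use)."""
    # Imported lazily so the helpers above work without faster-whisper installed.
    from faster_whisper import WhisperModel

    return WhisperModel(
        settings["size"],
        device=settings["device"],
        compute_type=settings["compute_type"],
    )


def transcribe(model, audio) -> str:
    """Transcribe one recording and join the segment texts into one string."""
    # model.transcribe returns a lazy iterator of segments plus an info object.
    segments, _info = model.transcribe(audio, language="en")
    return " ".join(segment.text.strip() for segment in segments).strip()
```

Switching to a GPU is then only a config change, e.g. model_settings({"model": {"device": "cuda", "compute_type": "float16"}}), with the model size left at its default.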
ctranslate2 (faster-whisper's engine) needs CUDA 12's cuBLAS and cuDNN 9 at
runtime, which distro CUDA packages often don't provide (e.g. Arch's cuda 13
only ships libcublas.so.13). install.sh handles this automatically on
machines with an NVIDIA GPU: it installs the nvidia-cublas-cu12 and
nvidia-cudnn-cu12 wheels into the venv and the launcher puts them on
LD_LIBRARY_PATH. For a dev checkout, do the same by hand:

    uv pip install --python .venv/bin/python nvidia-cublas-cu12 nvidia-cudnn-cu12
    sp=$(.venv/bin/python -c 'import site; print(site.getsitepackages()[0])')
    LD_LIBRARY_PATH="$sp/nvidia/cublas/lib:$sp/nvidia/cudnn/lib" ./.venv/bin/python wspr.py

The audio format (16 kHz mono) and transcription language (English) are fixed
in the code — both are requirements of the .en Whisper models — so they are
not configurable.
| File | Purpose |
|---|---|
| wspr.py | The dictation engine. |
| wspr.toml | Default configuration (shipped with the repo). |
| install.sh | Installs wspr for the current user (venv, launcher, config, autostart entry). |
| uninstall.sh | Reverses the install (--purge also removes config). |