wspr

wspr is a push-to-talk voice dictation tool for Linux/X11. You hold a hotkey (default Super+Space), speak, and release. The audio is transcribed locally with faster-whisper and typed into whatever window has focus via xdotool. No cloud, no daemon framework, just a single long-running Python script started by your graphical session.

Requirements

Python 3.11+ (needed for the stdlib tomllib; 3.14 recommended)
An X11 session
xdotool for typing into the focused window
A working microphone (PortAudio, pulled in by sounddevice)

Install (recommended)

./install.sh installs wspr for the current user and registers an XDG autostart entry that starts it with your graphical session.

It lays things out like this:

| What | Location |
| --- | --- |
| App code + private venv | `~/.local/share/wspr/` |
| Launcher executable | `~/.local/bin/wspr` |
| Config | `~/.config/wspr/wspr.toml` (created only if absent) |
| XDG autostart entry | `~/.config/autostart/wspr.desktop` |

The installer is safe to re-run — it upgrades the code, dependencies, and autostart entry, but never overwrites an existing config. Make sure ~/.local/bin is on your PATH to run wspr directly.

Managing wspr

wspr grabs the hotkey globally, so only one instance can run at a time. Stop the running copy before launching a dev copy from the repo (below):

```sh
pkill -f wspr.py        # stop the running instance
pgrep -af wspr.py       # check whether it's running
```

Uninstall

```sh
./uninstall.sh            # remove app + autostart entry, keep config
./uninstall.sh --purge    # also remove ~/.config/wspr
```

Running from the repo (development)

To run directly from a checkout without installing, create a local venv and run the script:

```sh
uv venv .venv
uv pip install --python .venv/bin/python faster-whisper numpy sounddevice python-xlib
./.venv/bin/python wspr.py
```

On first run the configured model is downloaded to your Hugging Face cache. Then:

Hold the hotkey (default Super+Space) and speak.
Release it — wspr transcribes the audio.
The text is typed into the focused window.

Press Ctrl-C to quit.
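
The hold, release, and type steps above can be sketched in Python as follows. This is an illustration, not the code in wspr.py: the names `Recorder`, `xdotool_argv`, and `transcribe_and_type` are hypothetical, and the hotkey handling (python-xlib) is left out. It assumes sounddevice's `InputStream`, faster-whisper's `WhisperModel.transcribe`, and the `xdotool type` subcommand.

```python
import subprocess

SAMPLE_RATE = 16_000  # Hz, mono: the audio format this README says is fixed


def xdotool_argv(text: str) -> list[str]:
    """Build the xdotool command that types `text` into the focused window."""
    # "--" ends option parsing, so text starting with "-" is typed literally.
    return ["xdotool", "type", "--clearmodifiers", "--", text]


class Recorder:
    """Hypothetical helper: collects microphone audio between start() and stop()."""

    def __init__(self):
        import sounddevice as sd  # lazy import; requires PortAudio

        self._chunks = []
        self._stream = sd.InputStream(
            samplerate=SAMPLE_RATE,
            channels=1,
            dtype="float32",
            callback=self._on_audio,
        )

    def _on_audio(self, indata, frames, time, status):
        # Called by sounddevice on its audio thread for every captured block.
        self._chunks.append(indata.copy())

    def start(self):
        # Hotkey pressed: discard old audio and begin capturing.
        self._chunks.clear()
        self._stream.start()

    def stop(self):
        # Hotkey released: stop capturing and return a 1-D float32 array.
        import numpy as np

        self._stream.stop()
        if not self._chunks:
            return np.zeros(0, dtype="float32")
        return np.concatenate(self._chunks)[:, 0]  # (n, 1) -> (n,)


def transcribe_and_type(model, audio) -> str:
    """Transcribe `audio` with a faster-whisper model and type the result."""
    segments, _info = model.transcribe(audio, language="en")
    text = " ".join(seg.text.strip() for seg in segments).strip()
    if text:  # nothing recognised means nothing to type
        subprocess.run(xdotool_argv(text), check=True)
    return text


# Usage (assumes faster-whisper, sounddevice, and xdotool are installed):
#   from faster_whisper import WhisperModel
#   model = WhisperModel("small.en", device="cpu", compute_type="int8")
#   rec = Recorder(); rec.start(); ...; transcribe_and_type(model, rec.stop())
```

The third-party imports are deferred into the functions that need them, so the command-building part can be tried on a machine without a microphone or model.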

Configuration

Settings live in a TOML file. wspr checks these locations in order and uses the first match:

| Priority | Location | Purpose |
| --- | --- | --- |
| 1 | `$WSPR_CONFIG` | Explicit override: point it at any file, e.g. `WSPR_CONFIG=~/my.toml ./.venv/bin/python wspr.py`. Used as-is even if it doesn't exist (then defaults apply). |
| 2 | `./wspr.toml` (next to `wspr.py`) | The repo default. The common case. |
| 3 | `~/.config/wspr/wspr.toml` | Per-user / OS-installed location (XDG). |
| - | (none found) | Built-in defaults are used. wspr never writes a config file. |

Higher priority wins: $WSPR_CONFIG overrides the repo file, which overrides the XDG file. The search stops at the first match.

Options

wspr.toml ships with these defaults:

```toml
[hotkey]
# Press-and-hold combo. Modifiers: super, ctrl, alt, shift.
# Trigger: a function key (f1-f20), a named key (space, enter, tab, esc,
# backspace), or a single character. Examples: "super+f1", "ctrl+alt+space",
# "f9".
combo = "super+space"

[model]
size = "small.en"     # tiny.en / base.en / small.en / medium / large-v3
device = "cpu"        # cpu / cuda
compute_type = "int8" # int8 (CPU) / float16 (GPU)
```

Note: the combo must not already be bound by your desktop environment or window manager. Super+Space in particular is a common default for input-method/layout switching (GNOME, KDE) and app launchers. If something else has already grabbed the key, wspr exits at startup with `Could not grab super+space: it's already bound.` Free the binding in your DE/WM or pick a different combo.
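
To make the combo grammar concrete, here is a small parser for strings such as "super+space". It is a sketch based only on the rules in the config comments; `parse_combo` is a hypothetical name, and wspr.py may parse and validate combos differently (for example, this version cannot express "+" itself as the trigger).

```python
import re

MODIFIERS = {"super", "ctrl", "alt", "shift"}
NAMED_KEYS = {"space", "enter", "tab", "esc", "backspace"}
FUNCTION_KEY = re.compile(r"f([1-9]|1[0-9]|20)")  # f1 through f20


def parse_combo(combo: str) -> tuple[frozenset[str], str]:
    """Split a combo like "ctrl+alt+space" into (modifiers, trigger)."""
    parts = [part.strip().lower() for part in combo.split("+")]
    *mods, trigger = parts  # everything before the last "+" is a modifier

    unknown = [m for m in mods if m not in MODIFIERS]
    if unknown:
        raise ValueError(f"unknown modifier(s): {', '.join(unknown)}")

    is_valid_trigger = (
        FUNCTION_KEY.fullmatch(trigger) is not None
        or trigger in NAMED_KEYS
        or len(trigger) == 1  # a single character
    )
    if not is_valid_trigger:
        raise ValueError(f"unknown trigger key: {trigger!r}")

    return frozenset(mods), trigger
```

For instance, `parse_combo("f9")` yields an empty modifier set with trigger `"f9"`, while `parse_combo("hyper+x")` raises `ValueError`.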

Edit the file and restart wspr — no code changes needed. A larger size (e.g. medium) improves accuracy at the cost of speed; a smaller one (base.en, tiny.en) is faster. device = "cuda" with compute_type = "float16" runs on a GPU.

CUDA

ctranslate2 (faster-whisper's engine) needs CUDA 12's cuBLAS and cuDNN 9 at runtime, which distro CUDA packages often don't provide (e.g. Arch's cuda 13 only ships libcublas.so.13). install.sh handles this automatically on machines with an NVIDIA GPU: it installs the nvidia-cublas-cu12 and nvidia-cudnn-cu12 wheels into the venv and the launcher puts them on LD_LIBRARY_PATH. For a dev checkout, do the same by hand:

```sh
uv pip install --python .venv/bin/python nvidia-cublas-cu12 nvidia-cudnn-cu12
sp=$(.venv/bin/python -c 'import site; print(site.getsitepackages()[0])')
LD_LIBRARY_PATH="$sp/nvidia/cublas/lib:$sp/nvidia/cudnn/lib" ./.venv/bin/python wspr.py
```
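
The same idea can be written in Python, which may help if you want a small launcher of your own. This is a sketch, not what install.sh generates: `cuda_library_path` and `exec_with_cuda_libs` are hypothetical names, and unlike the shell line above this version keeps any existing `LD_LIBRARY_PATH` at the end. The variable has to be set before the interpreter starts, because the dynamic loader reads it at process startup, so the sketch re-executes Python rather than changing `os.environ` in place.

```python
import os
import site
import sys
from pathlib import Path


def cuda_library_path(site_packages: str, existing: str = "") -> str:
    """Build an LD_LIBRARY_PATH value pointing at the NVIDIA wheel libraries."""
    root = Path(site_packages) / "nvidia"
    dirs = [str(root / "cublas" / "lib"), str(root / "cudnn" / "lib")]
    if existing:
        dirs.append(existing)  # keep whatever the user already had, last
    return ":".join(dirs)


def exec_with_cuda_libs(script: str = "wspr.py") -> None:
    """Replace the current process with Python running `script`."""
    env = dict(os.environ)
    env["LD_LIBRARY_PATH"] = cuda_library_path(
        site.getsitepackages()[0], env.get("LD_LIBRARY_PATH", "")
    )
    # execve never returns on success; the new process sees the updated path.
    os.execve(sys.executable, [sys.executable, script], env)


# Usage, run with the venv's interpreter: exec_with_cuda_libs("wspr.py")
```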

The audio format (16 kHz mono, the input Whisper models expect) and the transcription language (English, required by the .en models) are fixed in the code, so they are not configurable.

Files

| File | Purpose |
| --- | --- |
| `wspr.py` | The dictation engine. |
| `wspr.toml` | Default configuration (shipped with the repo). |
| `install.sh` | Installs wspr for the current user (venv, launcher, config, autostart entry). |
| `uninstall.sh` | Reverses the install (`--purge` also removes config). |
