VoiceKey is a tiny macOS menu bar app for one-key access to web-based AI voice experiences without keeping a browser window open.
The first provider is ChatGPT Voice on chatgpt.com. Claude Web is planned as
the next provider once the app shell and trusted-click bridge are solid.
Download VoiceKey-0.1.0-macOS.dmg from the latest GitHub Release, open it,
and drag VoiceKey.app into /Applications.
On first launch:
- Choose Show ChatGPT from the menu bar item.
- Sign in to ChatGPT.
- Grant microphone and accessibility permissions if macOS asks.
- Press F16 to toggle ChatGPT Voice.
Homebrew Cask is available for the unsigned prerelease:

    brew tap jamiezigelbaum/voicekey
    brew install --cask voicekey

This build is not Apple Developer ID signed or notarized yet. macOS may require
right-clicking VoiceKey.app and choosing Open, or approving it in System
Settings > Privacy & Security.
- Self-contained native macOS menu bar app.
- No Hammerspoon, browser extension, Electron, or third-party package manager.
- App-owned login/session through an embedded web view.
- App-owned global hotkey.
- App-owned microphone permission.
- Browser-free day-to-day workflow after first setup.
This is an early native Swift/AppKit app with:
- menu bar controls
- bundled app icon
- dynamic menu bar icon that reflects the configured hotkey and provider state
- configurable global hotkey, defaulting to F16
- settings window for recording a new hotkey
- persistent WKWebView session for chatgpt.com
- WebKit microphone permission hook
- DOM-to-native-click bridge for ChatGPT Voice controls
- visible provider status for loading, sign-in required, ready, starting, active, stopping, and needs-attention states
- fixture-tested DOM probes that distinguish ChatGPT Voice Mode from text dictation controls
The next milestone is live testing against ChatGPT's current web UI after sign-in, especially first-run voice selection, microphone prompts, and end-call behavior.
To build from source:

    swift build

To build an app bundle:

    ./scripts/build-app.zsh
    open .build/VoiceKey.app

To regenerate the production app icon from the design master:

    python3 -m venv /tmp/voicekey-icon-venv
    /tmp/voicekey-icon-venv/bin/python -m pip install pillow
    /tmp/voicekey-icon-venv/bin/python ./scripts/generate_app_icon.py

To package a local release:

    ./scripts/package-release.zsh

See docs/RELEASE.md for signing, notarization, GitHub Release, and Homebrew cask steps.
VoiceKey keeps the ChatGPT window hidden during normal hotkey use. It only
brings the window forward when sign-in is needed or when you choose
Show ChatGPT.
The menu shows the currently assigned voice hotkey in the native shortcut column.
The menu bar icon shows a compact version of the same hotkey, plus a simple
shape state: ready, loading, attention, or voice active. Choose Settings... to
record a different global hotkey.
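As a rough illustration of how the provider states listed earlier could collapse into the four icon shapes, here is a minimal Swift sketch. The type names `ProviderStatus` and `IconShape`, the function `iconShape(for:)`, and the specific mapping (for example, treating starting and stopping as "loading") are assumptions made for illustration, not VoiceKey's actual code.

```swift
// Hypothetical types: names and mapping are illustrative assumptions.
enum ProviderStatus {
    case loading, signInRequired, ready, starting, active, stopping, needsAttention
}

enum IconShape {
    case ready, loading, attention, voiceActive
}

// Collapse the seven provider states into the four icon shapes.
func iconShape(for status: ProviderStatus) -> IconShape {
    switch status {
    case .loading, .starting, .stopping:
        return .loading          // transitional states share one shape (assumed)
    case .signInRequired, .needsAttention:
        return .attention        // anything that needs the user to act
    case .ready:
        return .ready
    case .active:
        return .voiceActive
    }
}
```

Keeping the mapping in one exhaustive `switch` means the compiler flags any provider state added later that has no icon shape assigned.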
If ChatGPT appears to hear phrases you did not say, change macOS audio output to headphones or another output path that the microphone cannot hear. VoiceKey sends one start click per F16 press; repeated phantom turns are usually speaker audio feeding back into the microphone.
VoiceKey does not handle OpenAI passwords, OAuth tokens, or session cookies directly. Authentication happens through the provider's normal web login inside the app's web view. The web session is persisted by WebKit on your Mac.
VoiceKey is intentionally small:
- VoiceKeyAppDelegate: menu bar and hotkey lifecycle.
- GlobalHotKey: Carbon RegisterEventHotKey wrapper.
- WebWindowController: persistent WKWebView, mic permission, native click bridge.
- ChatGPTProvider: provider-specific status, retry, and start/stop behavior.
- ChatGPTDOMProbe: ChatGPT DOM selectors shared by the app and fixture tests.
Provider support should stay behind a simple shape:
prepare()
show()
toggleVoice()
startVoice()
stopVoice()
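The shape above could be expressed as a Swift protocol along these lines. This is only a sketch: the five method names come from the list above, while the protocol name `VoiceProvider`, the `isVoiceActive` property, the default `toggleVoice()` implementation, and the `FakeProvider` test double are all hypothetical and not taken from the VoiceKey source.

```swift
// Hypothetical protocol; only the five method names come from the README.
protocol VoiceProvider: AnyObject {
    var isVoiceActive: Bool { get }   // assumed helper for toggling
    func prepare()
    func show()
    func toggleVoice()
    func startVoice()
    func stopVoice()
}

extension VoiceProvider {
    // Default toggle: stop if a voice session is active, otherwise start one.
    func toggleVoice() {
        if isVoiceActive {
            stopVoice()
        } else {
            startVoice()
        }
    }
}

// Illustrative in-memory provider that records calls instead of driving a web view.
final class FakeProvider: VoiceProvider {
    private(set) var isVoiceActive = false
    private(set) var log: [String] = []

    func prepare() { log.append("prepare") }
    func show() { log.append("show") }
    func startVoice() { isVoiceActive = true; log.append("start") }
    func stopVoice() { isVoiceActive = false; log.append("stop") }
}
```

With a shape like this, the hotkey handler only needs to call `toggleVoice()` on whichever provider is current, so a second provider would not require changes to the menu bar or hotkey code.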
VoiceKey automates a human-facing web UI. It should not bypass provider limits, spoof private APIs, or run unattended conversations.