jamiezigelbaum/VoiceKey

VoiceKey

VoiceKey is a tiny macOS menu bar app for one-key access to web-based AI voice experiences without keeping a browser window open.

The first provider is ChatGPT Voice on chatgpt.com. Claude Web is planned as the next provider once the app shell and trusted-click bridge are solid.

Install

Download VoiceKey-0.1.0-macOS.dmg from the latest GitHub Release, open it, and drag VoiceKey.app into /Applications.

On first launch:

  1. Choose Show ChatGPT from the menu bar item.
  2. Sign in to ChatGPT.
  3. Grant microphone and accessibility permissions if macOS asks.
  4. Press F16, the default hotkey, to toggle ChatGPT Voice.

The unsigned prerelease is also available as a Homebrew cask:

brew tap jamiezigelbaum/voicekey
brew install --cask voicekey

This build is not Apple Developer ID signed or notarized yet. macOS may require right-clicking VoiceKey.app and choosing Open, or approving it in System Settings > Privacy & Security.

Goals

  • Self-contained native macOS menu bar app.
  • No Hammerspoon, browser extension, Electron, or third-party package manager.
  • App-owned login/session through an embedded web view.
  • App-owned global hotkey.
  • App-owned microphone permission.
  • Browser-free day-to-day workflow after first setup.

Current Status

This is an early native Swift/AppKit app with:

  • menu bar controls
  • bundled app icon
  • dynamic menu bar icon that reflects the configured hotkey and provider state
  • configurable global hotkey, defaulting to F16
  • settings window for recording a new hotkey
  • persistent WKWebView session for chatgpt.com
  • WebKit microphone permission hook
  • DOM-to-native-click bridge for ChatGPT Voice controls
  • visible provider status for loading, sign-in required, ready, starting, active, stopping, and needs-attention states
  • fixture-tested DOM probes that distinguish ChatGPT Voice Mode from text dictation controls

The next milestone is live testing against ChatGPT's current web UI after sign-in, especially first-run voice selection, microphone prompts, and end-call behavior.

Build From Source

swift build

To build an app bundle:

./scripts/build-app.zsh
open .build/VoiceKey.app

To regenerate the production app icon from the design master:

python3 -m venv /tmp/voicekey-icon-venv
/tmp/voicekey-icon-venv/bin/python -m pip install pillow
/tmp/voicekey-icon-venv/bin/python ./scripts/generate_app_icon.py

To package a local release:

./scripts/package-release.zsh

See docs/RELEASE.md for signing, notarization, GitHub Release, and Homebrew cask steps.

Usage

VoiceKey keeps the ChatGPT window hidden during normal hotkey use. It only brings the window forward when sign-in is needed or when you choose Show ChatGPT.

The menu shows the currently assigned voice hotkey in the native shortcut column. The menu bar icon shows a compact version of the same hotkey, plus a simple shape indicating the current state: ready, loading, attention, or voice active. Choose Settings... to record a different global hotkey.
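
As a rough illustration of how the seven provider states listed under Current Status could collapse into the four icon shapes, here is a minimal Swift sketch. The type names, case spellings, and the grouping itself are assumptions made for this example; the app's real mapping may differ.

```swift
// Provider states named in the Current Status list.
// The case spellings are assumed for this sketch.
enum ProviderStatus: CaseIterable {
    case loading, signInRequired, ready, starting, active, stopping, needsAttention
}

// The four icon shapes named in the Usage text.
enum IconShape {
    case ready, loading, attention, voiceActive
}

// Assumed grouping: transitional states share the loading shape,
// and anything that needs the user shares the attention shape.
func iconShape(for status: ProviderStatus) -> IconShape {
    switch status {
    case .ready:
        return .ready
    case .loading, .starting, .stopping:
        return .loading
    case .signInRequired, .needsAttention:
        return .attention
    case .active:
        return .voiceActive
    }
}

// Print the full mapping, one line per provider state.
for status in ProviderStatus.allCases {
    print(status, "->", iconShape(for: status))
}
```

Because the switch is exhaustive, adding a new provider state would fail to compile until it is assigned a shape.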

If ChatGPT appears to hear phrases you did not say, switch macOS audio output to headphones or another output that the microphone cannot pick up. VoiceKey sends one start click per hotkey press, so repeated phantom turns are usually speaker audio feeding back into the microphone.

Privacy

VoiceKey does not handle OpenAI passwords, OAuth tokens, or session cookies directly. Authentication happens through the provider's normal web login inside the app's web view. The web session is persisted by WebKit on your Mac.

Architecture

VoiceKey is intentionally small:

  • VoiceKeyAppDelegate: menu bar and hotkey lifecycle.
  • GlobalHotKey: Carbon RegisterEventHotKey wrapper.
  • WebWindowController: persistent WKWebView, mic permission, native click bridge.
  • ChatGPTProvider: provider-specific status, retry, and start/stop behavior.
  • ChatGPTDOMProbe: ChatGPT DOM selectors shared by the app and fixture tests.

Each provider should stay behind the same small interface:

prepare()
show()
toggleVoice()
startVoice()
stopVoice()
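
One way to express that interface in Swift is a protocol with a default toggleVoice(). This is a sketch, not the app's actual code: the protocol name VoiceProvider, the isVoiceActive property, and the FakeProvider stand-in are all hypothetical, and only the five method names come from the list above.

```swift
// Hypothetical protocol name; the five method names come from the list above.
protocol VoiceProvider: AnyObject {
    // Assumed helper property, not part of the original list.
    var isVoiceActive: Bool { get }
    func prepare()
    func show()
    func toggleVoice()
    func startVoice()
    func stopVoice()
}

extension VoiceProvider {
    // Default toggle: stop when voice is active, otherwise start.
    func toggleVoice() {
        if isVoiceActive {
            stopVoice()
        } else {
            startVoice()
        }
    }
}

// Stand-in provider for illustration. It records calls in a log
// instead of driving a web view.
final class FakeProvider: VoiceProvider {
    private(set) var isVoiceActive = false
    private(set) var log: [String] = []

    func prepare() { log.append("prepare") }
    func show() { log.append("show") }
    func startVoice() { isVoiceActive = true; log.append("start") }
    func stopVoice() { isVoiceActive = false; log.append("stop") }
}

let provider = FakeProvider()
provider.prepare()
provider.toggleVoice() // voice is inactive, so this starts it
provider.toggleVoice() // voice is active, so this stops it
print(provider.log)
```

Keeping toggleVoice() as a protocol requirement with a default implementation lets a provider override it if its web UI uses a single control for both start and stop.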

Notes

VoiceKey automates a human-facing web UI. It should not bypass provider limits, spoof private APIs, or run unattended conversations.
