Any AI client. Any model. Any format. One endpoint.
APIBypass is a macOS menu bar app that sits between your AI tools and upstream API providers — translating formats, remapping model names, injecting parameters, and launching Claude Code with multi-model routing. Configure once, use everywhere.
Install · Quickstart · Features · Architecture · Settings
English · 简体中文
Claude Code speaks Anthropic. Most models speak OpenAI. Different models need different temperatures, thinking budgets, and context limits. Switching providers usually means reconfiguring every client.
APIBypass solves all of this at the network layer — one local endpoint, zero client changes:
- Format translation — Anthropic ↔ OpenAI in both directions, including SSE streaming, tool calls, and thinking mode. Only translates when formats don't match.
- Model mapping — your client asks for
claude-sonnet-4-6, APIBypass routes it to any model you choose. - Parameter injection — temperature, top-p, thinking mode, custom JSON fields — set once per model, applied to every request.
- Claude Code launcher — assign different providers to Opus, Sonnet, Haiku, and Subagent roles. One click, one terminal, all models.
- Codex Adaptor — OpenAI Codex speaks the new Responses API format that virtually no proxy supports. APIBypass translates it to Chat Completions, letting you use Codex with any model behind any provider — not just OpenAI. Plus CDP injection to unlock plugin marketplace, force plugin install, and more in the Codex Electron app.
- Keychain security — API keys in macOS Keychain, never in plaintext. All traffic stays on your machine.
Download the latest .dmg from Releases, drag to Applications, done. On first launch, allow network connections when prompted.
git clone https://github.com/panando/APIBypass.git
cd APIBypass
swift run # debug mode
# or
swift build -c release && .build/arm64-apple-macosx/release/APIBypassRequires macOS 14.0+, Swift 6.0+, Xcode 16.0+.
Click the APIBypass icon in the menu bar — the server auto-starts on 127.0.0.1:8390. Green dot = running.
Menu bar → "Configure..." → create a provider:
| Field | Description | Example |
|---|---|---|
| Provider Name | A label | My DeepSeek |
| API Provider | OpenAI or Anthropic | OpenAI |
| Base URL | Upstream API endpoint | https://api.deepseek.com/v1 |
| API Key | Your upstream key | Stored in Keychain |
Inside each provider, create mappings:
| Field | Description | Example |
|---|---|---|
| Incoming Model | What your client sends | claude-sonnet-4-6 |
| Actual Model | What the upstream expects | deepseek-chat |
Set your AI client's base URL to http://127.0.0.1:8390/v1. The API Key field can be anything — APIBypass replaces it with your real key.
Menu bar → "Launch Claude Code":
- Pick a provider (base URL and token auto-configured)
- Choose a terminal
- Select models for Anthropic/Opus/Sonnet/Haiku/Subagent roles
- Set effort level
- Click "Launch"
Claude Code opens with all environment variables set, routing each model role through your chosen provider.
The core superpower. Full bidirectional translation of request bodies, responses, SSE event streams, tool calls, and thinking/redacted_thinking blocks. Smart detection: only translates when the client format ≠ provider format.
| Endpoint | Description |
|---|---|
POST /v1/chat/completions |
OpenAI Chat Completions |
POST /v1/messages |
Anthropic Messages |
GET /v1/models |
Model listing |
Break Claude Code's single-model limitation. Assign different upstream models to each Claude Code role — Opus, Sonnet, Haiku, and Subagent — all in one session.
- 7 terminal options: Terminal.app, iTerm2, Alacritty, Kitty, Warp, Hyper, Warple
- Effort level selector (none → max)
- Cache fix: strips
cchbilling headers, controlsCLAUDE_CODE_ATTRIBUTION_HEADER - 1M context fix: auto-appends
[1m]suffix for long-context models - Launch templates: save and switch between model configurations
OpenAI's Codex uses the Responses API — a new format that almost no proxy or third-party provider supports. APIBypass is the first macOS tool to bridge this gap: it translates Responses API ↔ Chat Completions in real time, and passes requests through APIBypass's model mapping and parameter injection pipeline. The result: Codex works with any provider, any model — DeepSeek, Qwen, GLM, MiniMax, not just OpenAI.
Codex ──Responses API──▶ Codex Adaptor (:15721) ──Chat Completions──▶ APIBypass (:8390) ──▶ Upstream
Wire protocol — Choose between Chat Completions or Responses API as the exposed wire format, with contextual guidance on when to use each.
Reasoning configuration — Auto-detect or manually configure thinking/effort parameters per provider. Supports DeepSeek (thinking), OpenRouter (reasoning_effort), SiliconFlow (thinking), MiniMax (thinking), Qwen (enable_thinking), and more — each with configurable budget tokens and effort levels.
Custom models — Define display name aliases mapped to APIBypass model mappings, with configurable context windows. Models are automatically synced to ~/.codex/providers.json for Codex compatibility.
CDP Enhancements — The Codex Electron app exposes a Chrome DevTools Protocol debug port. APIBypass connects to it and injects JavaScript to unlock hidden capabilities:
- Force entry unlock — bypass the waitlist/sign-in gate
- Plugin marketplace unlock — access the plugin marketplace directly
- Force plugin install — install any plugin without restrictions
- Settings are pushed live via WebSocket — changes take effect immediately without restart
Real-time logs — Thread-safe ring buffer (2000 entries) displayed via polling timer. Filter by level, copy all, export to file, or clear. No @Published/Combine — avoids background-thread AutoLayout crashes.
Config auto-recovery — If UserDefaults is cleared, the next launch automatically recovers wire API, reasoning config, and custom models from ~/.codex/providers.json.
Start from menu bar "Codex Adaptor", point Codex to http://127.0.0.1:15721/v1.
- Model aliasing: Any incoming model name → any actual upstream model
- Parameter injection per mapping: temperature, max_tokens, top_p, frequency_penalty, presence_penalty
- Thinking mode toggle: one-click on/off, compatible with both Anthropic and OpenAI formats
- Custom JSON injection: inject arbitrary parameters with automatic type detection (bool, int, float, JSON, string)
- Local model cleanup: strips Ollama/LM Studio-specific params before forwarding to cloud APIs
- Independent provider configs (API type, base URL, API key) — reuse across mappings
- Environment variables per provider for Claude Code integration (manual, model-mapping, keychain, base-url types)
- Auto-migration from legacy format
One-click toggle for pure proxy mode — requests pass through without format translation, keeping model mapping and parameter injection active.
- API keys in macOS Keychain, never plaintext on disk
- All traffic processed locally — no cloud relay, no telemetry
- Open source, MIT licensed
Codex ──▶ Codex Adaptor (:15721) ──▶┐
│
Client (Claude Code / Cursor / Anything) │
│ │
▼ │
┌────────────────────────────────────────▼┐
│ HTTPServer (Hummingbird 2.0) │
│ :8390 │
│ │
│ POST /v1/chat/completions │
│ POST /v1/messages │
│ GET /v1/models │
└──────────────┬──────────────────────────┘
│
┌───────┴────────┐
▼ ▼
┌─────────────┐ ┌──────────────────┐
│ ProxyEngine │ │ FormatTranslator │
│ • model map │ │ • req → req │
│ • param inj │ │ • resp → resp │
│ • strip loc │ │StreamTranslator │
└──────┬──────┘ │ • SSE ↔ SSE │
│ │Rectifier │
│ │ • thinking fix │
│ │ • budget fix │
│ └────────┬─────────┘
│ │
▼ ▼
┌─────────────────────────────────┐
│ Upstream Provider (OpenAI / │
│ Anthropic / DeepSeek / etc.) │
└─────────────────────────────────┘
Menu bar → "Settings...":
- Language: 中文 / English, takes effect immediately
- Server Port: Default 8390, restart to apply
- About: Version, description, GitHub link, MIT license
- SwiftUI — macOS menu bar app + windows
- Hummingbird 2.0 — HTTP server
- Keychain Services — API key storage with caching
- async/await — networking including SSE streaming
- ServiceLifecycle — service lifecycle management
If APIBypass saves you time, please star the repo — it helps others find it too.

