Minimal OpenAI-compatible HTTP proxy for providers that require
reasoning_content to be round-tripped in thinking mode — with built-in user
authentication, token quotas, and a web admin UI.
Cursor currently omits reasoning_content when replaying previous assistant
messages to an OpenAI-compatible provider. Some providers reject that request
with:
Provider returned error: {"error":{"code":"400","message":"Param Incorrect","param":"The reasoning_content in the thinking mode must be passed back to the API.","type":""}}
This proxy keeps a structured JSON cache of assistant messages returned by the
upstream provider. When a later /v1/chat/completions request contains the same
assistant message without reasoning_content, the proxy restores the cached
field before forwarding the request upstream.
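The restore step described above can be sketched roughly as follows. This is a minimal illustration, not the adapter's actual code: the `AssistantMessage` and `ReasoningCache` types are hypothetical, tool calls are flattened to strings, and matching is assumed to be exact equality on content plus tool calls.

```rust
use std::collections::HashMap;

/// Simplified assistant message (hypothetical type; the real proxy handles JSON).
#[derive(Clone, Debug, PartialEq)]
pub struct AssistantMessage {
    pub content: String,
    /// Tool calls flattened to strings purely for illustration.
    pub tool_calls: Vec<String>,
    pub reasoning_content: Option<String>,
}

/// In-memory cache keyed by the visible parts of an assistant message.
#[derive(Default)]
pub struct ReasoningCache {
    entries: HashMap<(String, Vec<String>), String>,
}

impl ReasoningCache {
    /// Remember the reasoning that the upstream returned with a message.
    pub fn remember(&mut self, msg: &AssistantMessage) {
        if let Some(reasoning) = &msg.reasoning_content {
            let key = (msg.content.clone(), msg.tool_calls.clone());
            self.entries.insert(key, reasoning.clone());
        }
    }

    /// Restore reasoning onto a replayed message that lacks it.
    /// Returns true only if the field was filled in from the cache.
    pub fn restore(&self, msg: &mut AssistantMessage) -> bool {
        if msg.reasoning_content.is_some() {
            return false; // already present, leave untouched
        }
        let key = (msg.content.clone(), msg.tool_calls.clone());
        match self.entries.get(&key) {
            Some(reasoning) => {
                msg.reasoning_content = Some(reasoning.clone());
                true
            }
            None => false, // unknown message: do not invent reasoning
        }
    }
}
```

In this sketch a message that was never seen is forwarded unchanged, so the upstream may still reject it; the cache only ever replays text it received earlier.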
This proxy must be deployed on a server with a public IP address (or a domain accessible from the internet).
Cursor does not make API calls from your local machine. When you configure a custom API provider in Cursor, Cursor's cloud servers send the requests to your endpoint — your local Cursor client is just a UI. This means:
- localhost or 127.0.0.1 will not work: Cursor's servers can't reach your local machine.
- The proxy must listen on a public IP or be accessible via a domain name (e.g., behind Nginx/Caddy with HTTPS).
- Make sure the port (default 8787) is open in your server's firewall/security group.
┌──────────────┐       ┌─────────────────────┐       ┌──────────────┐
│  Cursor IDE  │──────▶│ Cursor Cloud Server │──────▶│  This Proxy  │
│ (local app)  │       │  (makes API calls)  │       │  (your VPS)  │
└──────────────┘       └─────────────────────┘       └──────┬───────┘
                                                            │
                                                            ▼
                                                     ┌──────────────┐
                                                     │   Upstream   │
                                                     │   Provider   │
                                                     └──────────────┘
Recommended setup:
# Example: deploy on a VPS with public IP
# Option A: direct access
listen_addr = "0.0.0.0:8787"
# Option B: behind Nginx/Caddy reverse proxy with HTTPS
# Nginx: proxy_pass http://127.0.0.1:8787;
# Caddy: reverse_proxy localhost:8787

Then in Cursor, set Base URL to your public address:
https://your-domain.com/v1 or http://your-server-ip:8787/v1.
# Clone & build
git clone <repo-url> && cd cursor-reasoning-adapter
cargo build --release
# Copy example config
cp config.example.toml config.toml
# Edit config.toml — set your upstream base_url and api_keys
# Run
RUST_LOG=info ./target/release/cursor-reasoning-adapter
# Or use run.sh for daemon mode
chmod +x run.sh
./run.sh start
./run.sh log # tail logs
./run.sh status # check if running
./run.sh stop

Health check:
curl http://127.0.0.1:8787/healthz

Two configuration methods are supported: a TOML config file and environment variables. Both can be used together; environment variables always take priority over values in the TOML file.
Place a config.toml in the project root (next to Cargo.toml). See
config.example.toml for a full annotated example. Set
the CONFIG_PATH environment variable to use a different path.
[server]
listen_addr = "0.0.0.0:8787"
[logging]
body_max_chars = 4096
color = "auto"
[[upstreams]]
name = "xiaomi"
base_url = "https://api.xiaomimimo.com"
api_keys = ["tp-your-key-1", "tp-your-key-2"]
[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"
[[upstreams]]
name = "openai"
base_url = "https://api.openai.com"
api_keys = ["sk-your-openai-key"]
models = ["gpt-4", "gpt-3.5-turbo"]

| Variable | Required | Default |
|---|---|---|
| UPSTREAM_BASE_URL | yes (or use [[upstreams]]) | - |
| UPSTREAM_API_KEY | no | - |
| UPSTREAM_CHAT_COMPLETIONS_PATH | no | /v1/chat/completions |
| LISTEN_ADDR | no | 127.0.0.1:8787 |
| LOG_BODY_MAX_CHARS | no | 4096 |
| MESSAGE_LOG_MAX_CHARS | no | 2000 |
| LOG_COLOR | no | auto |
| CONFIG_PATH | no | ./config.toml |
| DATABASE_URL | no | sqlite:data/adapter.db?mode=rwc |
| JWT_SECRET | no | built-in default |
| ALLOW_REGISTRATION | no | true |
| RUST_LOG | no | - |
Note: If UPSTREAM_BASE_URL is set, it overrides all [[upstreams]] entries.
- The UPSTREAM_BASE_URL env var creates a single upstream, overriding TOML.
- Otherwise, [[upstreams]] entries from TOML are used.
- A legacy [upstream] section is converted to an additional upstream entry.
- Logging and server settings: env vars override TOML values.
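The precedence rules above can be sketched as two small functions. This is an illustration under assumptions rather than the adapter's real configuration loader: `resolve_setting`, `resolve_upstreams`, the `Upstream` struct, and the placeholder name "env" for the environment-defined upstream are all hypothetical.

```rust
/// Resolve one scalar setting: environment value first, then TOML, then default.
pub fn resolve_setting(
    env_value: Option<&str>,
    toml_value: Option<&str>,
    default: &str,
) -> String {
    env_value.or(toml_value).unwrap_or(default).to_string()
}

/// Minimal upstream description (hypothetical; the real entries carry more fields).
#[derive(Debug, PartialEq)]
pub struct Upstream {
    pub name: String,
    pub base_url: String,
}

/// If UPSTREAM_BASE_URL is set it replaces every TOML upstream with a single
/// entry; otherwise the TOML list is used unchanged.
pub fn resolve_upstreams(
    env_base_url: Option<&str>,
    toml_upstreams: Vec<Upstream>,
) -> Vec<Upstream> {
    match env_base_url {
        Some(url) => vec![Upstream {
            name: "env".to_string(), // placeholder name, an assumption
            base_url: url.to_string(),
        }],
        None => toml_upstreams,
    }
}
```

For example, `resolve_setting(None, Some("0.0.0.0:8787"), "127.0.0.1:8787")` yields the TOML value, while passing `Some(...)` as the first argument would win over both.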
In Cursor Settings → Models → OpenAI API Key, configure:
Base URL: https://your-public-domain.com/v1
API Key: sk-xxxx-xxxx (system key)
— or —
API Key: tp-xxxx-xxxx (upstream provider key)
Important: The Base URL must be reachable from the internet, because Cursor's cloud servers make the actual API call, not your local machine. localhost and 127.0.0.1 will not work.
The adapter supports two types of API keys simultaneously:
System key (sk-xxx):
- Created when a user registers or is created by an admin
- Full feature set: token quota enforcement, usage tracking, prompt audit
- Ideal for team distribution: each user gets their own key and quota
- The adapter uses its router to forward requests upstream with the upstream's own API keys (key rotation, retry on 401/403)
Upstream key (tp-xxx):
- The raw API key from your upstream provider (e.g., Xiaomi, OpenAI, Deepseek)
- Passthrough mode: the adapter recognizes that it matches an upstream key in the database and forwards it directly, with no key rotation
- No quota enforcement, no per-user audit, unlimited usage
- Ideal for personal use
Cursor request → Bearer token
├─ matches sk-xxx (system key)?
│ → look up user → check quota → route via upstream key rotation
├─ matches JWT?
│ → look up user → same as above
└─ matches tp-xxx (upstream key)?
  → look up upstream → forward directly (no quota, no audit)
Summary: distribute system sk-xxx keys for team members who need quotas and
audit; use upstream tp-xxx keys directly for personal / unlimited usage. Both
coexist on the same server.
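The decision tree above might look like this in code. It is a sketch under assumptions: the `AuthTables` lookup maps stand in for the database and for real JWT verification, and the `Rejected` outcome for an unrecognized token is a guess, since the text does not say what happens in that case.

```rust
use std::collections::HashMap;

/// Outcome of inspecting the Bearer token (hypothetical type).
#[derive(Debug, PartialEq)]
pub enum AuthDecision {
    /// System key or JWT: enforce quota, then route with key rotation.
    User { user_id: u64 },
    /// Upstream key: forward as-is, with no quota and no audit.
    Passthrough { upstream: String },
    /// Token matched nothing known (assumed behaviour).
    Rejected,
}

/// Lookup tables standing in for the database and JWT validation.
pub struct AuthTables {
    pub system_keys: HashMap<String, u64>,      // sk-... -> user id
    pub jwt_sessions: HashMap<String, u64>,     // stand-in for JWT verification
    pub upstream_keys: HashMap<String, String>, // tp-... -> upstream name
}

/// Check the token against each table in the order shown in the diagram.
pub fn classify(token: &str, tables: &AuthTables) -> AuthDecision {
    if let Some(&user_id) = tables.system_keys.get(token) {
        return AuthDecision::User { user_id };
    }
    if let Some(&user_id) = tables.jwt_sessions.get(token) {
        return AuthDecision::User { user_id };
    }
    if let Some(name) = tables.upstream_keys.get(token) {
        return AuthDecision::Passthrough { upstream: name.clone() };
    }
    AuthDecision::Rejected
}
```

Keeping the result as an enum lets the request handler apply quota checks only on the `User` branch, which mirrors the split between audited and unaudited traffic.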
- If only one upstream is configured (no models filter), it handles all requests.
- If multiple upstreams exist, they are checked in order. The first upstream whose models list contains the requested model is selected.
- If no upstream accepts the model, the adapter returns HTTP 400.
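The routing rules above can be sketched as a single lookup. The `Upstream` struct and `select_upstream` function are hypothetical names, and an empty `models` vector is assumed to mean "no filter". The text does not specify how an unfiltered upstream behaves when several upstreams are configured, so this sketch simply never selects it in that case.

```rust
/// Minimal upstream description (hypothetical; empty `models` means no filter).
pub struct Upstream {
    pub name: String,
    pub models: Vec<String>,
}

/// Pick the upstream for a requested model, or None when nothing accepts it.
/// A caller would translate None into an HTTP 400 response.
pub fn select_upstream<'a>(upstreams: &'a [Upstream], model: &str) -> Option<&'a Upstream> {
    // A single upstream without a filter handles every request.
    if upstreams.len() == 1 && upstreams[0].models.is_empty() {
        return Some(&upstreams[0]);
    }
    // Otherwise check upstreams in configuration order; the first match wins.
    upstreams
        .iter()
        .find(|u| u.models.iter().any(|m| m.as_str() == model))
}
```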
Transform model names before forwarding to upstream. Useful when Cursor sends one name but the upstream expects another.
Exact match:
[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"

Regex match with capture groups:
[[upstreams.model_rewrites]]
from_pattern = "^gpt-(.*)$"
to_template = "mimo-{1}"

Each upstream can have multiple API keys. Keys rotate via round-robin. On 401/403, the adapter automatically retries with the next key.
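Round-robin rotation with retry on 401/403 could be sketched as below. The `KeyRing` type and `send_with_rotation` helper are hypothetical, the `send` closure stands in for the real HTTP call and returns only a status code, and capping retries at one attempt per configured key is an assumption the text does not confirm.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// A set of API keys for one upstream, handed out in round-robin order.
pub struct KeyRing {
    keys: Vec<String>,
    next: AtomicUsize,
}

impl KeyRing {
    pub fn new(keys: Vec<String>) -> Self {
        Self { keys, next: AtomicUsize::new(0) }
    }

    /// Return the next key, wrapping around at the end of the list.
    pub fn next_key(&self) -> Option<&str> {
        if self.keys.is_empty() {
            return None;
        }
        let i = self.next.fetch_add(1, Ordering::Relaxed) % self.keys.len();
        Some(self.keys[i].as_str())
    }
}

/// Try keys in rotation, moving on while the upstream answers 401 or 403.
/// Returns the last status seen, or None when no keys are configured.
pub fn send_with_rotation<F>(ring: &KeyRing, mut send: F) -> Option<u16>
where
    F: FnMut(&str) -> u16,
{
    let mut last = None;
    for _ in 0..ring.keys.len() {
        let key = ring.next_key()?;
        let status = send(key);
        last = Some(status);
        if status != 401 && status != 403 {
            break; // any other status ends the retry loop
        }
    }
    last
}
```

The atomic counter lets concurrent requests share one ring without a lock; if every key is rejected, the caller sees the final 401 or 403.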
Cursor may send requests for certain models in the OpenAI Responses API format (an input
field instead of messages). The adapter auto-detects Responses API payloads and converts
them to Chat Completions format before forwarding upstream.
Both the /v1/chat/completions and /v1/responses endpoints are registered.
On first startup, a default admin account is created:
- Username: admin
- Password: admin
- API key: auto-generated (printed to logs, visible on dashboard)
| Page | URL | Description |
|---|---|---|
| Login | / | Username/password login |
| Register | /register | Self-registration |
| Dashboard | /dashboard | Token usage stats, charts, API key management |
| Admin | /admin | User management, quotas, prompt audit |
Public:
| Method | Path | Description |
|---|---|---|
| POST | /api/login | Login, returns JWT |
| POST | /api/register | Register new user |
| GET | /api/config | { "allow_registration": bool } |
Authenticated (JWT or API key):
| Method | Path | Description |
|---|---|---|
| GET | /api/me | Current user info |
| GET | /api/my-usage | Token usage summary |
| GET | /api/my-stats/daily?days=30 | Daily usage chart data |
| POST | /api/change-password | Change password |
| POST | /api/regenerate-key | Regenerate API key |
Admin only:
| Method | Path | Description |
|---|---|---|
| GET | /api/admin/users | List all users |
| POST | /api/admin/users | Create user |
| DELETE | /api/admin/users/{id} | Delete user |
| POST | /api/admin/users/{id}/quota | Set token quota |
| GET | /api/admin/upstreams | List upstream configs |
| POST | /api/admin/upstreams | Create upstream |
| PUT | /api/admin/upstreams/{id} | Update upstream |
| DELETE | /api/admin/upstreams/{id} | Delete upstream |
The official Xiaomi MiMo guide: passing-back-reasoning_content.
Key rules enforced by providers:
- Every assistant turn produced in thinking mode must appear in later messages[] with a reasoning_content field.
- The string must match what the API returned earlier.
- Tool-call assistant messages also need reasoning_content.
- Empty reasoning: the field must still be present (empty string "").
The adapter caches reasoning_content from upstream responses and restores it
onto matching assistant messages in subsequent requests. It never fabricates
reasoning text.
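A quick way to see which replayed turns would still violate the provider rules is to scan for assistant messages that lack the field. This is only an illustrative check, not part of the adapter as described: the `Message` struct and `missing_reasoning` function are hypothetical, and every assistant turn is assumed to have been produced in thinking mode.

```rust
/// Simplified chat message (hypothetical; real payloads are JSON objects).
pub struct Message {
    pub role: String,
    /// None means the field is absent; Some("") is a present, empty value.
    pub reasoning_content: Option<String>,
}

/// Return the indices of assistant messages that lack reasoning_content.
/// An empty string counts as present, matching the rule for empty reasoning.
pub fn missing_reasoning(messages: &[Message]) -> Vec<usize> {
    messages
        .iter()
        .enumerate()
        .filter(|(_, m)| m.role == "assistant" && m.reasoning_content.is_none())
        .map(|(i, _)| i)
        .collect()
}
```

An empty result means every assistant turn carries the field; any index returned points at a message the cache could not match.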
Every HTTP request logs: client IP, User-Agent, method, path, status, elapsed time. Chat completions additionally log: routing decision, message summaries, reasoning round-trip diagnostics, and upstream status.
Set LOG_COLOR=auto (default) for colored terminal output, or set it to always or never
to override.
- Cache is in-memory only. Restarting the proxy clears remembered reasoning.
- Matching is conservative — if Cursor rewrites assistant content or tool calls significantly, the proxy will not guess.
- Streaming tool call reconstruction handles standard OpenAI delta shapes.
MIT