Skip to content

Hawkzilla/cursor-reasoning-adapter

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

1 Commit
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

cursor-reasoning-adapter

中文文档

Minimal OpenAI-compatible HTTP proxy for providers that require reasoning_content to be round-tripped in thinking mode — with built-in user authentication, token quotas, and a web admin UI.

Why?

Cursor currently omits reasoning_content when replaying previous assistant messages to an OpenAI-compatible provider. Some providers reject that request with:

Provider returned error: {"error":{"code":"400","message":"Param Incorrect","param":"The reasoning_content in the thinking mode must be passed back to the API.","type":""}}

This proxy keeps a structured JSON cache of assistant messages returned by the upstream provider. When a later /v1/chat/completions request contains the same assistant message without reasoning_content, the proxy restores the cached field before forwarding the request upstream.

Deployment Architecture

This proxy must be deployed on a server with a public IP address (or a domain accessible from the internet).

Cursor does not make API calls from your local machine. When you configure a custom API provider in Cursor, Cursor's cloud servers send the requests to your endpoint — your local Cursor client is just a UI. This means:

  • localhost or 127.0.0.1 will not work — Cursor's servers can't reach your local machine.
  • The proxy must listen on a public IP or be accessible via a domain name (e.g., behind Nginx/Caddy with HTTPS).
  • Make sure the port (default 8787) is open in your server's firewall/security group.
┌──────────────┐       ┌─────────────────────┐       ┌──────────────┐
│  Cursor IDE  │──────▶│  Cursor Cloud Server │──────▶│  This Proxy  │
│  (local app) │       │  (makes API calls)   │       │  (your VPS)  │
└──────────────┘       └─────────────────────┘       └──────┬───────┘
                                                            │
                                                            ▼
                                                   ┌──────────────┐
                                                   │   Upstream   │
                                                   │  Provider    │
                                                   └──────────────┘

Recommended setup:

# Example: deploy on a VPS with public IP
# Option A: direct access
listen_addr = "0.0.0.0:8787"

# Option B: behind Nginx/Caddy reverse proxy with HTTPS
# Nginx: proxy_pass http://127.0.0.1:8787;
# Caddy: reverse_proxy localhost:8787

Then in Cursor, set Base URL to your public address: https://your-domain.com/v1 or http://your-server-ip:8787/v1.

Quick Start

# Clone & build
git clone <repo-url> && cd cursor-reasoning-adapter
cargo build --release

# Copy example config
cp config.example.toml config.toml
# Edit config.toml — set your upstream base_url and api_keys

# Run
RUST_LOG=info ./target/release/cursor-reasoning-adapter

# Or use run.sh for daemon mode
chmod +x run.sh
./run.sh start
./run.sh log     # tail logs
./run.sh status  # check if running
./run.sh stop

Health check:

curl http://127.0.0.1:8787/healthz

Configuration

Two configuration methods are supported: a TOML config file and environment variables. Both can be used together — environment variables always take priority over values in the TOML file.

Config file (config.toml)

Place a config.toml in the project root (next to Cargo.toml). See config.example.toml for a full annotated example. Set the CONFIG_PATH environment variable to use a different path.

[server]
listen_addr = "0.0.0.0:8787"

[logging]
body_max_chars = 4096
color = "auto"

[[upstreams]]
name = "xiaomi"
base_url = "https://api.xiaomimimo.com"
api_keys = ["tp-your-key-1", "tp-your-key-2"]

[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"

[[upstreams]]
name = "openai"
base_url = "https://api.openai.com"
api_keys = ["sk-your-openai-key"]
models = ["gpt-4", "gpt-3.5-turbo"]

Environment variables

Variable Required Default
UPSTREAM_BASE_URL yes (or use [[upstreams]])
UPSTREAM_API_KEY no
UPSTREAM_CHAT_COMPLETIONS_PATH no /v1/chat/completions
LISTEN_ADDR no 127.0.0.1:8787
LOG_BODY_MAX_CHARS no 4096
MESSAGE_LOG_MAX_CHARS no 2000
LOG_COLOR no auto
CONFIG_PATH no ./config.toml
DATABASE_URL no sqlite:data/adapter.db?mode=rwc
JWT_SECRET no built-in default
ALLOW_REGISTRATION no true
RUST_LOG no

Note: If UPSTREAM_BASE_URL is set, it overrides all [[upstreams]] entries.

Priority rules

  1. UPSTREAM_BASE_URL env var creates a single upstream, overriding TOML.
  2. Otherwise, [[upstreams]] entries from TOML are used.
  3. Legacy [upstream] section is converted to an additional upstream entry.
  4. Logging and server settings: env vars override TOML values.

Cursor Setup

In Cursor Settings → Models → OpenAI API Key, configure:

Base URL:  https://your-public-domain.com/v1
API Key:   sk-xxxx-xxxx   (system key)
   — or —
API Key:   tp-xxxx-xxxx   (upstream provider key)

Important: The Base URL must be reachable from the internet — Cursor's cloud servers make the actual API call, not your local machine. localhost and 127.0.0.1 will not work.

Authentication — Two Types of API Keys

The adapter supports two types of API keys simultaneously:

1. System-generated key (sk-xxx)

  • Created when a user registers or is created by an admin
  • Full feature set: token quota enforcement, usage tracking, prompt audit
  • Ideal for team distribution — each user gets their own key and quota
  • The adapter uses its router to forward requests to upstream with upstream's own API keys (key rotation, retry on 401/403)

2. Upstream provider key (tp-xxx, sk-xxx, etc.)

  • The raw API key from your upstream provider (e.g., Xiaomi, OpenAI, Deepseek)
  • Passthrough mode: the adapter recognizes it matches an upstream key in the database and forwards it directly — no key rotation
  • No quota enforcement, no per-user audit — unlimited usage
  • Ideal for personal use

Flow diagram

Cursor request → Bearer token
  ├─ matches sk-xxx (system key)?
  │    → look up user → check quota → route via upstream key rotation
  ├─ matches JWT?
  │    → look up user → same as above
  └─ matches tp-xxx (upstream key)?
       → look up upstream → forward directly (no quota, no audit)

Summary: distribute system sk-xxx keys for team members who need quotas and audit; use upstream tp-xxx keys directly for personal / unlimited usage. Both coexist on the same server.

Upstream Routing

Model selection

  1. If only one upstream is configured (no models filter), it handles all requests.
  2. If multiple upstreams exist, they are checked in order. The first upstream whose models list contains the requested model is selected.
  3. If no upstream accepts the model, the adapter returns HTTP 400.

Model rewriting

Transform model names before forwarding to upstream. Useful when Cursor sends one name but the upstream expects another.

Exact match:

[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"

Regex match with capture groups:

[[upstreams.model_rewrites]]
from_pattern = "^gpt-(.*)$"
to_template = "mimo-{1}"

API key rotation

Each upstream can have multiple API keys. Keys rotate via round-robin. On 401/403, the adapter automatically retries with the next key.

Responses API Compatibility

Cursor may send certain models using the OpenAI Responses API format (input field instead of messages). The adapter auto-detects and converts Responses API payloads to Chat Completions format before forwarding upstream.

Both /v1/chat/completions and /v1/responses endpoints are registered.

User Management & Web UI

Default Admin

On first startup, a default admin account is created:

  • Username: admin
  • Password: admin
  • API key: auto-generated (printed to logs, visible on dashboard)

Web UI

Page URL Description
Login / Username/password login
Register /register Self-registration
Dashboard /dashboard Token usage stats, charts, API key management
Admin /admin User management, quotas, prompt audit

API Endpoints

Public:

Method Path Description
POST /api/login Login, returns JWT
POST /api/register Register new user
GET /api/config { "allow_registration": bool }

Authenticated (JWT or API key):

Method Path Description
GET /api/me Current user info
GET /api/my-usage Token usage summary
GET /api/my-stats/daily?days=30 Daily usage chart data
POST /api/change-password Change password
POST /api/regenerate-key Regenerate API key

Admin only:

Method Path Description
GET /api/admin/users List all users
POST /api/admin/users Create user
DELETE /api/admin/users/{id} Delete user
POST /api/admin/users/{id}/quota Set token quota
GET /api/admin/upstreams List upstream configs
POST /api/admin/upstreams Create upstream
PUT /api/admin/upstreams/{id} Update upstream
DELETE /api/admin/upstreams/{id} Delete upstream

reasoning_content — How It Works

The official Xiaomi MiMo guide: passing-back-reasoning_content.

Key rules enforced by providers:

  • Every assistant turn produced in thinking mode must appear in later messages[] with a reasoning_content field.
  • The string must match what the API returned earlier.
  • Tool-call assistant messages also need reasoning_content.
  • Empty reasoning: the field must still be present (empty string "").

The adapter caches reasoning_content from upstream responses and restores it onto matching assistant messages in subsequent requests. It never fabricates reasoning text.

Access Logs

Every HTTP request logs: client IP, User-Agent, method, path, status, elapsed time. Chat completions additionally log: routing decision, message summaries, reasoning round-trip diagnostics, and upstream status.

Set LOG_COLOR=auto (default) for colored terminal output, always or never to override.

Limitations

  • Cache is in-memory only. Restarting the proxy clears remembered reasoning.
  • Matching is conservative — if Cursor rewrites assistant content or tool calls significantly, the proxy will not guess.
  • Streaming tool call reconstruction handles standard OpenAI delta shapes.

License

MIT

About

Fix `reasoning_content` 400 errors when using thinking models with Cursor like XIAOMI mimo — deploy once, never worry again.

Resources

Stars

Watchers

Forks

Packages

 
 
 

Contributors