cursor-reasoning-adapter

中文文档

Minimal OpenAI-compatible HTTP proxy for providers that require reasoning_content to be round-tripped in thinking mode — with built-in user authentication, token quotas, and a web admin UI.

Why?

Cursor currently omits reasoning_content when replaying previous assistant messages to an OpenAI-compatible provider. Some providers reject that request with:

Provider returned error: {"error":{"code":"400","message":"Param Incorrect","param":"The reasoning_content in the thinking mode must be passed back to the API.","type":""}}

This proxy keeps a structured JSON cache of assistant messages returned by the upstream provider. When a later /v1/chat/completions request contains the same assistant message without reasoning_content, the proxy restores the cached field before forwarding the request upstream.

Deployment Architecture

This proxy must be deployed on a server with a public IP address (or a domain accessible from the internet).

Cursor does not make API calls from your local machine. When you configure a custom API provider in Cursor, Cursor's cloud servers send the requests to your endpoint — your local Cursor client is just a UI. This means:

localhost or 127.0.0.1 will not work — Cursor's servers can't reach your local machine.
The proxy must listen on a public IP or be accessible via a domain name (e.g., behind Nginx/Caddy with HTTPS).
Make sure the port (default 8787) is open in your server's firewall/security group.

┌──────────────┐       ┌─────────────────────┐       ┌──────────────┐
│  Cursor IDE  │──────▶│  Cursor Cloud Server │──────▶│  This Proxy  │
│  (local app) │       │  (makes API calls)   │       │  (your VPS)  │
└──────────────┘       └─────────────────────┘       └──────┬───────┘
                                                            │
                                                            ▼
                                                   ┌──────────────┐
                                                   │   Upstream   │
                                                   │  Provider    │
                                                   └──────────────┘

Recommended setup:

# Example: deploy on a VPS with public IP
# Option A: direct access
listen_addr = "0.0.0.0:8787"

# Option B: behind Nginx/Caddy reverse proxy with HTTPS
# Nginx: proxy_pass http://127.0.0.1:8787;
# Caddy: reverse_proxy localhost:8787

Then in Cursor, set Base URL to your public address: https://your-domain.com/v1 or http://your-server-ip:8787/v1.

Quick Start

# Clone & build
git clone <repo-url> && cd cursor-reasoning-adapter
cargo build --release

# Copy example config
cp config.example.toml config.toml
# Edit config.toml — set your upstream base_url and api_keys

# Run
RUST_LOG=info ./target/release/cursor-reasoning-adapter

# Or use run.sh for daemon mode
chmod +x run.sh
./run.sh start
./run.sh log     # tail logs
./run.sh status  # check if running
./run.sh stop

Health check:

curl http://127.0.0.1:8787/healthz

Configuration

Two configuration methods are supported: a TOML config file and environment variables. Both can be used together — environment variables always take priority over values in the TOML file.

Config file (`config.toml`)

Place a config.toml in the project root (next to Cargo.toml). See config.example.toml for a full annotated example. Set the CONFIG_PATH environment variable to use a different path.

[server]
listen_addr = "0.0.0.0:8787"

[logging]
body_max_chars = 4096
color = "auto"

[[upstreams]]
name = "xiaomi"
base_url = "https://api.xiaomimimo.com"
api_keys = ["tp-your-key-1", "tp-your-key-2"]

[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"

[[upstreams]]
name = "openai"
base_url = "https://api.openai.com"
api_keys = ["sk-your-openai-key"]
models = ["gpt-4", "gpt-3.5-turbo"]

Environment variables

Variable	Required	Default
`UPSTREAM_BASE_URL`	yes (or use `[[upstreams]]`)	—
`UPSTREAM_API_KEY`	no	—
`UPSTREAM_CHAT_COMPLETIONS_PATH`	no	`/v1/chat/completions`
`LISTEN_ADDR`	no	`127.0.0.1:8787`
`LOG_BODY_MAX_CHARS`	no	`4096`
`MESSAGE_LOG_MAX_CHARS`	no	`2000`
`LOG_COLOR`	no	`auto`
`CONFIG_PATH`	no	`./config.toml`
`DATABASE_URL`	no	`sqlite:data/adapter.db?mode=rwc`
`JWT_SECRET`	no	built-in default
`ALLOW_REGISTRATION`	no	`true`
`RUST_LOG`	no	—

Note: If UPSTREAM_BASE_URL is set, it overrides all [[upstreams]] entries.

Priority rules

UPSTREAM_BASE_URL env var creates a single upstream, overriding TOML.
Otherwise, [[upstreams]] entries from TOML are used.
Legacy [upstream] section is converted to an additional upstream entry.
Logging and server settings: env vars override TOML values.

Cursor Setup

In Cursor Settings → Models → OpenAI API Key, configure:

Base URL:  https://your-public-domain.com/v1
API Key:   sk-xxxx-xxxx   (system key)
   — or —
API Key:   tp-xxxx-xxxx   (upstream provider key)

Important: The Base URL must be reachable from the internet — Cursor's cloud servers make the actual API call, not your local machine. localhost and 127.0.0.1 will not work.

Authentication — Two Types of API Keys

The adapter supports two types of API keys simultaneously:

1. System-generated key (`sk-xxx`)

Created when a user registers or is created by an admin
Full feature set: token quota enforcement, usage tracking, prompt audit
Ideal for team distribution — each user gets their own key and quota
The adapter uses its router to forward requests to upstream with upstream's own API keys (key rotation, retry on 401/403)

2. Upstream provider key (`tp-xxx`, `sk-xxx`, etc.)

The raw API key from your upstream provider (e.g., Xiaomi, OpenAI, Deepseek)
Passthrough mode: the adapter recognizes it matches an upstream key in the database and forwards it directly — no key rotation
No quota enforcement, no per-user audit — unlimited usage
Ideal for personal use

Flow diagram

Cursor request → Bearer token
  ├─ matches sk-xxx (system key)?
  │    → look up user → check quota → route via upstream key rotation
  ├─ matches JWT?
  │    → look up user → same as above
  └─ matches tp-xxx (upstream key)?
       → look up upstream → forward directly (no quota, no audit)

Summary: distribute system sk-xxx keys for team members who need quotas and audit; use upstream tp-xxx keys directly for personal / unlimited usage. Both coexist on the same server.

Upstream Routing

Model selection

If only one upstream is configured (no models filter), it handles all requests.
If multiple upstreams exist, they are checked in order. The first upstream whose models list contains the requested model is selected.
If no upstream accepts the model, the adapter returns HTTP 400.

Model rewriting

Transform model names before forwarding to upstream. Useful when Cursor sends one name but the upstream expects another.

Exact match:

[[upstreams.model_rewrites]]
from = "gpt-5.3-codex"
to = "mimo-v2.5-pro"

Regex match with capture groups:

[[upstreams.model_rewrites]]
from_pattern = "^gpt-(.*)$"
to_template = "mimo-{1}"

API key rotation

Each upstream can have multiple API keys. Keys rotate via round-robin. On 401/403, the adapter automatically retries with the next key.

Responses API Compatibility

Cursor may send certain models using the OpenAI Responses API format (input field instead of messages). The adapter auto-detects and converts Responses API payloads to Chat Completions format before forwarding upstream.

Both /v1/chat/completions and /v1/responses endpoints are registered.

User Management & Web UI

Default Admin

On first startup, a default admin account is created:

Username: admin
Password: admin
API key: auto-generated (printed to logs, visible on dashboard)

Web UI

Page	URL	Description
Login	`/`	Username/password login
Register	`/register`	Self-registration
Dashboard	`/dashboard`	Token usage stats, charts, API key management
Admin	`/admin`	User management, quotas, prompt audit

API Endpoints

Public:

Method	Path	Description
POST	`/api/login`	Login, returns JWT
POST	`/api/register`	Register new user
GET	`/api/config`	`{ "allow_registration": bool }`

Authenticated (JWT or API key):

Method	Path	Description
GET	`/api/me`	Current user info
GET	`/api/my-usage`	Token usage summary
GET	`/api/my-stats/daily?days=30`	Daily usage chart data
POST	`/api/change-password`	Change password
POST	`/api/regenerate-key`	Regenerate API key

Admin only:

Method	Path	Description
GET	`/api/admin/users`	List all users
POST	`/api/admin/users`	Create user
DELETE	`/api/admin/users/{id}`	Delete user
POST	`/api/admin/users/{id}/quota`	Set token quota
GET	`/api/admin/upstreams`	List upstream configs
POST	`/api/admin/upstreams`	Create upstream
PUT	`/api/admin/upstreams/{id}`	Update upstream
DELETE	`/api/admin/upstreams/{id}`	Delete upstream

`reasoning_content` — How It Works

The official Xiaomi MiMo guide: passing-back-reasoning_content.

Key rules enforced by providers:

Every assistant turn produced in thinking mode must appear in later messages[] with a reasoning_content field.
The string must match what the API returned earlier.
Tool-call assistant messages also need reasoning_content.
Empty reasoning: the field must still be present (empty string "").

The adapter caches reasoning_content from upstream responses and restores it onto matching assistant messages in subsequent requests. It never fabricates reasoning text.

Access Logs

Every HTTP request logs: client IP, User-Agent, method, path, status, elapsed time. Chat completions additionally log: routing decision, message summaries, reasoning round-trip diagnostics, and upstream status.

Set LOG_COLOR=auto (default) for colored terminal output, always or never to override.

Limitations

Cache is in-memory only. Restarting the proxy clears remembered reasoning.
Matching is conservative — if Cursor rewrites assistant content or tool calls significantly, the proxy will not guess.
Streaming tool call reconstruction handles standard OpenAI delta shapes.

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 1 Commit
.idea		.idea
data		data
src		src
static		static
.gitignore		.gitignore
Cargo.lock		Cargo.lock
Cargo.toml		Cargo.toml
README.md		README.md
README_zh.md		README_zh.md
config.example.toml		config.example.toml
run.sh		run.sh

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

cursor-reasoning-adapter

Why?

Deployment Architecture

Quick Start

Configuration

Config file (`config.toml`)

Environment variables

Priority rules

Cursor Setup

Authentication — Two Types of API Keys

1. System-generated key (`sk-xxx`)

2. Upstream provider key (`tp-xxx`, `sk-xxx`, etc.)

Flow diagram

Upstream Routing

Model selection

Model rewriting

API key rotation

Responses API Compatibility

User Management & Web UI

Default Admin

Web UI

API Endpoints

`reasoning_content` — How It Works

Access Logs

Limitations

License

About

Uh oh!

Releases 1

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

cursor-reasoning-adapter

Why?

Deployment Architecture

Quick Start

Configuration

Config file (config.toml)

Environment variables

Priority rules

Cursor Setup

Authentication — Two Types of API Keys

1. System-generated key (sk-xxx)

2. Upstream provider key (tp-xxx, sk-xxx, etc.)

Flow diagram

Upstream Routing

Model selection

Model rewriting

API key rotation

Responses API Compatibility

User Management & Web UI

Default Admin

Web UI

API Endpoints

reasoning_content — How It Works

Access Logs

Limitations

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases 1

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Config file (`config.toml`)

1. System-generated key (`sk-xxx`)

2. Upstream provider key (`tp-xxx`, `sk-xxx`, etc.)

`reasoning_content` — How It Works

Packages