This folder contains a lightweight Node.js reverse proxy server that forwards model-specific prompt payloads to Gemini, GitHub Models, OpenRouter, and DeepSeek.
The server also serves static files from this folder for GET requests.
This repo now includes a phased port toward dual runtime support:
- Node runtime for local development and DigitalOcean App Platform
- Cloudflare Worker runtime adapter
Implementation phases, scope, and checkpoints are documented in:
DUAL_RUNTIME_PORT_PLAN.md
Current branch-level implementation status:
- Phase 1 complete: shared runtime-agnostic core modules under `src/core/`
- Phase 2 complete: Node adapter (`src/node/server.js`) and Worker adapter (`src/worker/worker.js`)
- Phase 3 complete: `wrangler.toml` has `dev` (default) and `production` environments with separate worker names and routes
- Phase 4 complete: full test matrix documented below
- Accepts `POST` requests from browser or test clients
- Forwards provider-shaped request bodies upstream
- Normalizes successful responses to `{ "text": "..." }`
- Normalizes upstream errors to `{ "error": "..." }`
- Handles CORS for browser clients
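The normalization step in the list above can be sketched as a single pure function. This is a minimal illustration, not the code in `src/core/`: the name `normalizeUpstream`, the provider labels, and the fallback messages are assumptions, while the response shapes follow the public Gemini `generateContent` and chat-completions formats.

```javascript
// Hypothetical helper: reduce a provider-shaped upstream body to the
// proxy's { text } / { error } contract.
function normalizeUpstream(provider, status, body) {
  if (status < 200 || status >= 300) {
    // Upstream errors usually arrive as { error: { message } } or { error: "..." }.
    const err = body && body.error;
    const message =
      (err && typeof err === "object" ? err.message : err) ||
      `Upstream request failed with status ${status}`;
    return { error: String(message) };
  }
  const text =
    provider === "gemini"
      ? body?.candidates?.[0]?.content?.parts?.map((p) => p.text ?? "").join("")
      : body?.choices?.[0]?.message?.content;
  // Treat a missing or empty model output as an error (an assumption of this sketch).
  return typeof text === "string" && text.length > 0
    ? { text }
    : { error: "Upstream returned an empty response" };
}
```

Keeping this logic free of Node or Worker APIs is what would let both adapters share it.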
- Node.js `>=18.0.0`
The project uses the built-in node:test runner and ESM imports, so older Node releases will fail to start the test suite.
The proxy uses these values:
- `GEMINI_API_KEY`
- `GITHUB_TOKEN`
- `OPENROUTER_API_KEY`
- `DEEPSEEK_API_KEY`
- `PORT` optional, defaults to `3000`
- `GEMINI_MODEL` Gemini model name used to build the Gemini upstream URL
- `GEMINI_URL` required Gemini upstream URL, typically defined in terms of `GEMINI_MODEL` and `GEMINI_API_KEY`
- `GH_URL` required GitHub Models upstream URL
- `OPENROUTER_URL` required OpenRouter chat-completions URL
- `DEEPSEEK_URL` required DeepSeek chat-completions URL
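A startup check for these values might look like the sketch below. The function name `readConfig` and the returned shape are hypothetical, and treating the four API keys and four URLs as required (with `GEMINI_MODEL` optional) is an assumption based on the list above.

```javascript
// Variables this sketch treats as required (assumption based on the list above).
const REQUIRED_VARS = [
  "GEMINI_API_KEY",
  "GITHUB_TOKEN",
  "OPENROUTER_API_KEY",
  "DEEPSEEK_API_KEY",
  "GEMINI_URL",
  "GH_URL",
  "OPENROUTER_URL",
  "DEEPSEEK_URL",
];

// Hypothetical helper: report missing variables and resolve the port.
function readConfig(env) {
  const missing = REQUIRED_VARS.filter((name) => !env[name]);
  const parsed = Number.parseInt(env.PORT ?? "3000", 10);
  // Fall back to 3000 when PORT is unset or not a number.
  const port = Number.isInteger(parsed) ? parsed : 3000;
  return { missing, port };
}
```

A caller could pass `process.env` and exit early with a clear message when `missing` is not empty.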
start.sh sources .env if it exists (fallback to legacy .env.modelspecs), exports the variables above, prompts for missing API keys, and starts server.cjs.
Use .env.modelspecs.example as the tracked template, then create your local .env with real credentials.
Template file:

    cp .env.modelspecs.example .env

Example .env.modelspecs.example:

    GEMINI_API_KEY=your_gemini_key
    DEEPSEEK_API_KEY=your_deepseek_key
    OPENROUTER_API_KEY=your_openrouter_key
    GITHUB_TOKEN=your_github_token
    GEMINI_MODEL="gemini-2.5-flash"
    GEMINI_URL="https://generativelanguage.googleapis.com/v1beta/models/${GEMINI_MODEL}:generateContent?key=${GEMINI_API_KEY}"
    GH_URL="https://models.inference.ai.azure.com/chat/completions"
    DEEPSEEK_URL="https://api.deepseek.com/chat/completions"
    OPENROUTER_URL="https://openrouter.ai/api/v1/chat/completions"
    PORT=3000

With that layout, the Gemini model name appears only once in the file, and GEMINI_URL is derived from it when the file is sourced by bash.
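The `${...}` references are expanded by bash, not by Node. To show what that expansion amounts to, here is a small sketch that mimics it for simple `NAME=value` lines. The helper `expandEnvLines` is hypothetical and is not how the project loads its configuration; it handles only double-quoted values and `${NAME}` references to variables defined earlier in the same input.

```javascript
// Hypothetical helper: expand ${NAME} references using earlier definitions,
// roughly what bash does when the file is sourced top to bottom.
function expandEnvLines(lines) {
  const vars = {};
  for (const line of lines) {
    const match = /^([A-Z_][A-Z0-9_]*)=(.*)$/.exec(line.trim());
    if (!match) continue; // skip blank lines and comments
    const raw = match[2].replace(/^"(.*)"$/, "$1"); // drop surrounding double quotes
    vars[match[1]] = raw.replace(
      /\$\{([A-Z_][A-Z0-9_]*)\}/g,
      (_, name) => vars[name] ?? ""
    );
  }
  return vars;
}
```

Order matters here just as it does in bash: `GEMINI_MODEL` and `GEMINI_API_KEY` must be defined before the line that builds `GEMINI_URL`.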
.env should stay untracked and contain your real keys. .env.modelspecs.example is the safe file to commit and document.
There are two ways to start the server locally. Both end up running node server.cjs — they differ only in how environment variables are supplied.
start.sh sources .env if it exists (fallback to .env.modelspecs), exports all required variables, and prompts interactively for any that are still missing (e.g. API keys). This is the easiest path when you do not want to export variables manually.
Run it directly:

    bash start.sh

Or via npm:

    npm run start:dev

If you want an explicit Node-side check before starting, use:

    npm run node:check
    npm run node:dev

The Node helper reads .node.local.env when present for machine-specific settings that should not be committed.

If your environment variables are already exported in your shell session (e.g. via your shell profile or a separate env tool), you can start the server directly without the shell script:

    npm start

The server will fail to reach upstream providers if the required variables are not already set; there is no interactive prompt in this path.
The local machine needs all of the following before the Node helper can run cleanly:
- `node` available on `PATH`
- `.env` with provider credentials
- optional `.node.local.env` with Node/DO machine-specific overrides
The Worker adapter uses the same API route contract as Node for POST endpoints. Static file serving is not supported in the Worker — it handles API routes only.
Before using the Worker on a machine, run npx wrangler login once so Wrangler can authenticate with Cloudflare. The committed files intentionally stay generic; Cloudflare names, routes, account IDs, and zone details live in the untracked .worker.local.env file.
wrangler.toml is a public template. The restore script renders the actual Cloudflare values from .worker.local.env.

| Environment | Worker name | Custom domain | Command |
|---|---|---|---|
| staging (default) | `CF_STAGING_WORKER_NAME` | `CF_STAGING_CUSTOM_DOMAIN` | `npm run worker:staging:deploy` |
| production | `CF_PROD_WORKER_NAME` | `CF_PROD_CUSTOM_DOMAIN` | `npm run worker:prod:deploy` |

When you decide to cut over the public production domain to the Worker, update the values in your local `.worker.local.env` file, rerun `npm run worker:prod:deploy`, and change the DNS CNAME. No committed file needs to change for that cutover.
Run once per environment. Secrets are stored in Cloudflare and never in wrangler.toml.

    # Staging secrets (default)
    npx wrangler secret put GEMINI_API_KEY --env staging
    npx wrangler secret put GITHUB_TOKEN --env staging
    npx wrangler secret put OPENROUTER_API_KEY --env staging
    npx wrangler secret put DEEPSEEK_API_KEY --env staging
    # Production secrets
    npx wrangler secret put GEMINI_API_KEY -e production
    npx wrangler secret put GITHUB_TOKEN -e production
    npx wrangler secret put OPENROUTER_API_KEY -e production
    npx wrangler secret put DEEPSEEK_API_KEY -e production

Run the Worker locally:

    npm run worker:dev

Wrangler runs the Worker on http://127.0.0.1:8787 using a local runtime emulation layer. The script reads .env for API credentials and .worker.local.env for Cloudflare-specific names/routes.
The local machine needs all of the following before this can work:
- `npx wrangler login`
- `.env` with provider credentials
- `.worker.local.env` with Cloudflare account, zone, worker names, and routes
- DNS access to the relevant Cloudflare zone if you want the custom domain to resolve

Deploy from your machine:

    # Deploy to staging environment
    npm run worker:staging:deploy
    # Deploy to production environment
    npm run worker:prod:deploy

Deployments are automated via GitHub Actions in .github/workflows/deploy-workers.yml.
- Push to `staging` branch deploys `aiproxy-staging`
- Push to `main` branch deploys `aiproxy`
- Manual trigger is available in Actions via `workflow_dispatch`
Set these repository settings before enabling the workflow:
- GitHub Secret: `CLOUDFLARE_API_TOKEN`
- GitHub Variable: `CF_ACCOUNT_ID`
- GitHub Variable: `CF_ZONE_NAME`
- GitHub Variable: `CF_STAGING_WORKER_NAME` (example: `aiproxy-staging`)
- GitHub Variable: `CF_STAGING_WORKER_ROUTE` (example: `aiproxy-staging.numerus.app/*`)
- GitHub Variable: `CF_PROD_WORKER_NAME` (example: `aiproxy`)
- GitHub Variable: `CF_PROD_WORKER_ROUTE` (example: `aiproxy-worker.numerus.app/*`)
Optional Slack notifications:
- GitHub Secret: `SLACK_DEPLOY_WEBHOOK_URL`
If SLACK_DEPLOY_WEBHOOK_URL is set, the workflow posts both success and failure deployment notifications to Slack.
Cloudflare route bindings in wrangler.toml do not create DNS records automatically. Each hostname needs a CNAME in your DNS zone:

| Name | Target | Proxy |
|---|---|---|
| `CF_STAGING_CUSTOM_DOMAIN` host | `CF_STAGING_WORKER_NAME.workers.dev` | Proxied |
| `CF_PROD_CUSTOM_DOMAIN` host | `CF_PROD_WORKER_NAME.workers.dev` | Proxied |

Local base URL: `http://localhost:3000`

- `POST /api/gemprompt`
- `POST /api/ghprompt`
- `POST /api/orprompt`
- `POST /api/dsprompt`
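One way to picture the routing is a lookup table from path to provider. The sketch below is an assumption, not the project's router: the pairing of each route with a provider and upstream URL variable is inferred from the route names, and `resolveRoute` is a hypothetical helper.

```javascript
// Assumed mapping from each POST route to a provider label and the
// environment variable that holds its upstream URL.
const API_ROUTES = {
  "/api/gemprompt": { provider: "gemini", urlVar: "GEMINI_URL" },
  "/api/ghprompt": { provider: "github", urlVar: "GH_URL" },
  "/api/orprompt": { provider: "openrouter", urlVar: "OPENROUTER_URL" },
  "/api/dsprompt": { provider: "deepseek", urlVar: "DEEPSEEK_URL" },
};

// Hypothetical helper: return the route entry for an API call, or null so the
// caller can fall through to static files (Node) or a 404 (Worker).
function resolveRoute(method, pathname) {
  if (method !== "POST") return null;
  return API_ROUTES[pathname] ?? null;
}
```

Because the function only inspects the method and path, the same table could serve both runtimes.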
The proxy forwards the request body you send as-is to the upstream provider. It does not accept a generic { prompt, systemPrompt } body.
Clients should therefore send provider-shaped JSON:
- Gemini clients send a Gemini `generateContent`-style payload
- GitHub Models clients send a chat-completions style payload
- OpenRouter clients send a chat-completions style payload
- DeepSeek clients send a chat-completions style payload
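As an illustration of provider-shaped bodies, the sketch below builds a minimal payload for each family and posts it to the proxy. The helper names `buildPayload` and `sendPrompt` are hypothetical, and the model name passed for chat-completions providers is whatever the caller chooses. The global `fetch` is available in Node 18 and later.

```javascript
// Hypothetical helper: build the smallest useful body for each provider family.
function buildPayload(provider, prompt, model) {
  if (provider === "gemini") {
    // Gemini generateContent shape: a list of contents, each with parts.
    return { contents: [{ role: "user", parts: [{ text: prompt }] }] };
  }
  // GitHub Models, OpenRouter, and DeepSeek accept chat-completions bodies.
  return { model, messages: [{ role: "user", content: prompt }] };
}

// Hypothetical helper: POST a payload to a proxy route and unwrap the
// normalized { text } / { error } response.
async function sendPrompt(baseUrl, route, payload) {
  const res = await fetch(new URL(route, baseUrl), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  const data = await res.json();
  if (data.error) throw new Error(data.error);
  return data.text;
}
```

For example, `sendPrompt("http://localhost:3000", "/api/gemprompt", buildPayload("gemini", "Hello"))` would resolve to the model output when the local server is running with valid credentials.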
Current examples live in these files:
Successful responses are normalized to:

    { "text": "...model output..." }

Errors are normalized to:

    { "error": "...message..." }

The test suite has two layers and can target any runtime via TEST_BASE_URL.

| Layer | What it tests | Command | Prerequisites |
|---|---|---|---|
| Unit | Local logic, retry, history rollback | `npm test` | None |
| Live (Node local) | Full proxy via local Node server | `npm run test:live` | `npm run start:dev` running |
| Live (Worker local) | Full proxy via wrangler emulation | `TEST_BASE_URL=http://127.0.0.1:8787 npm run test:live` | `npm run worker:dev` running |
| Live (CF dev) | Full proxy via deployed dev Worker | `TEST_BASE_URL=https://<CF_DEV_CUSTOM_DOMAIN> npm run test:live` | Worker deployed, DNS live |
| Live (CF prod) | Full proxy via deployed prod Worker | `TEST_BASE_URL=https://<CF_PROD_CUSTOM_DOMAIN> npm run test:live` | Worker deployed, DNS live |
| Live (DO App) | Full proxy via deployed DO App | `TEST_BASE_URL=https://<DO_APP_URL> npm run test:live` | DO App running |
| All local | Unit + Node live together | `npm run test:all` | `npm run start:dev` running |

Run the unit tests:

    npm test

No network required. Covers retry behavior, empty-response handling, rollback of failed user turns, and conversation-history updates. CI-safe.
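To make "rollback of failed user turns" concrete, here is a sketch of the kind of logic such a unit test could exercise. It is an assumption about the behavior, not the project's code: `sendTurn` and the `{ role, text }` history entries are hypothetical, and the proxy call is synchronous here only to keep the example short.

```javascript
// Hypothetical helper: append the user turn, call the proxy, and undo the
// append when the call fails or returns no text.
function sendTurn(history, userText, callProxy) {
  history.push({ role: "user", text: userText });
  let reply = "";
  try {
    const result = callProxy(history);
    if (result && typeof result.text === "string") reply = result.text.trim();
  } catch {
    reply = ""; // a thrown error is treated like an empty response
  }
  if (!reply) {
    history.pop(); // roll back the failed user turn
    return false;
  }
  history.push({ role: "model", text: reply });
  return true;
}
```

Passing the proxy call in as a function is what lets a test substitute a stub and run without network access.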
Run the live tests against the runtime you want to check:

    # Node local (requires: npm run start:dev in another terminal)
    npm run test:live
    # Worker local emulation (requires: npm run worker:dev in another terminal)
    TEST_BASE_URL=http://127.0.0.1:8787 npm run test:live
    # Deployed Cloudflare dev Worker
    TEST_BASE_URL=https://<CF_DEV_CUSTOM_DOMAIN> npm run test:live
    # Deployed Cloudflare production Worker
    TEST_BASE_URL=https://<CF_PROD_CUSTOM_DOMAIN> npm run test:live

Tests can skip individual providers when an upstream returns a transient overload (e.g. Gemini high demand).
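The two ideas above, choosing a target through TEST_BASE_URL and skipping on transient overload, could be sketched as follows. Both helper names are hypothetical, the default of `http://localhost:3000` is assumed from the local Node setup, and the keyword list used to detect an overload is a guess rather than the suite's actual rule.

```javascript
// Hypothetical helper: pick the proxy under test, defaulting to the local
// Node server, and drop any trailing slash so routes can be appended.
function resolveBaseUrl(env) {
  return (env.TEST_BASE_URL || "http://localhost:3000").replace(/\/+$/, "");
}

// Hypothetical helper: decide whether a normalized { error } looks like a
// temporary upstream condition rather than a proxy bug.
function isTransientOverload(result) {
  return (
    typeof result?.error === "string" &&
    /overload|high demand|unavailable|rate limit/i.test(result.error)
  );
}
```

Inside a `node:test` case, a check such as `if (isTransientOverload(result)) return t.skip("upstream overloaded");` would mark that provider as skipped instead of failed.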
Run the unit tests and the local Node live tests together:

    npm run test:all

The original prompt test mixed two separate concerns:
- local request and conversation-state logic
- real external provider availability
That made failures ambiguous. A red test could mean broken local code, a stopped proxy, missing credentials, or upstream overload.
The split makes failures easier to interpret:
- unit tests answer: did local code break?
- live tests answer: does the full external system work right now?