Codex throwing 404 NotFoundError via LiteLLM Proxy #19928

choi-tommy · 2026-04-28T04:58:57Z

choi-tommy
Apr 28, 2026

Hi everyone,

I'm encountering a persistent 404 Not Found error when trying to use Codex with a custom model via a LiteLLM proxy. It seems like the model identifier is being misread or incorrectly passed during the request.

My Setup:
* Client: Codex (v0.125.0)
* Proxy: LiteLLM (running in Docker) on port 8889
* Backend: vLLM running gemma-4-26B-A4B-it on port 8000

1. LiteLLM Configuration (config.yaml):
I have mapped gpt-5.4 to my vLLM backend and used force_model to ensure it hits the correct Gemma 4 instance.
yaml
model_list:
  - model_name: gpt-5.4
    litellm_params:
      model: openai/model
      api_base: http://192.168.3.54:8000/v1
      api_key: "any"
      stream: true
      drop_params: true
      convert_responses_to_chat: true
      force_model: gemma-4-26B-A4B-it


2. Codex Configuration (.codex/config.toml):
toml
model = "gpt-5.4"
model_provider = "tommy"

[model_providers.tommy]
name = "tommy-model-8889"
base_url = "http://192.168.3.54:8889/v1"
wire_api = "responses"
api_key = "any"


The Error:
When I run Codex and send a simple prompt, I get the following error:

text
root@HomeMini:/app/codex# codex
╭───────────────────────────────────────╮
│ >_ OpenAI Codex (v0.125.0)            │
│                                       │
│ model:     gpt-5.4   /model to change │
│ directory: /app/codex                 │
╰───────────────────────────────────────╯

  Tip: New Build faster with Codex.

› who are you?

■ unexpected status 404 Not Found: litellm.NotFoundError: NotFoundError: OpenAIException - {"error":{"message":"The model model does not exist.","type":"NotFoundError","param":"model","code":404}}. Received Model Group=gpt-5.4
Available Model Group Fallbacks=None, url: http://192.168.3.54:8889/v1/responses


The core issue:
The error message says: The model 'model' does not exist.
It seems Codex is sending a request where the model parameter is literally the string "model" instead of "gpt-5.4", or LiteLLM is failing to map the incoming gpt-5.4 group correctly due to how the request is structured (specifically noting the url: .../v1/responses in the error).

Has anyone encountered this behavior with Codex and LiteLLM? Is there something in the wire_api or the way Codex constructs the OpenAI-compatible request that I need to adjust?

Any help would be greatly appreciated!

Gecko51 · 2026-05-03T20:38:43Z

Gecko51
May 3, 2026

The literal string "model" in the error is the smoking gun. Your LiteLLM config has model: openai/model, which means LiteLLM forwards model=model to the upstream. vLLM has no clue what "model" is, hence the 404. By the way, force_model and convert_responses_to_chat aren't valid keys under litellm_params, they get silently ignored.

Fix the LiteLLM entry to use the actual served model name from vLLM (whatever you passed to --served-model-name, or the HF id if you didn't override):

model_list:
  - model_name: gpt-5.4
    litellm_params:
      model: openai/gemma-4-26B-A4B-it
      api_base: http://192.168.3.54:8000/v1
      api_key: "any"

Then about wire_api = "responses". Codex hits /v1/responses on LiteLLM. LiteLLM does have a Responses bridge, but routing it down to a plain vLLM that only speaks /v1/chat/completions is fragile. Easiest path is to switch Codex to chat completions:

[model_providers.tommy]
name = "tommy-model-8889"
base_url = "http://192.168.3.54:8889/v1"
wire_api = "chat"
api_key = "any"

That makes Codex hit /v1/chat/completions on LiteLLM, which forwards cleanly to vLLM.

Quick sanity check before retesting through Codex:

curl http://192.168.3.54:8889/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer any" \
  -d '{"model":"gpt-5.4","messages":[{"role":"user","content":"hi"}]}'

If that 200s, Codex will work too. If you still get a 404 there, it's the vLLM served-model-name mismatch. Hit curl http://192.168.3.54:8000/v1/models and use whatever id shows up in your LiteLLM entry.

One thing to watch: some of the newer Codex-only models on the OpenAI side refuse chat completions and require /responses. Since you're pointing at a Gemma backend, that doesn't apply, plain chat is fine.

lmk what /v1/models returns if you're still stuck.

0 replies

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Codex throwing 404 NotFoundError via LiteLLM Proxy #19928

Uh oh!

{{title}}

Uh oh!

Replies: 1 comment

Uh oh!

{{title}}

Uh oh!

Select a reply

Uh oh!

Codex throwing 404 NotFoundError via LiteLLM Proxy #19928

Uh oh!

choi-tommy Apr 28, 2026

Replies: 1 comment

Uh oh!

Gecko51 May 3, 2026

choi-tommy
Apr 28, 2026

Gecko51
May 3, 2026