Validate chat completions logprobs parameters#751
Conversation
|
Warning You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again! |
|
Review: Validate chat completions logprobs parameters Clean, well-scoped change. Verified that logprobs / top_logprobs correctly land in the flattened extra map (not typed fields on ChatCompletionRequest), validate() is invoked on every completions route, and the new tests cover the boundary cases (0, 20, minus-1, 21, non-integer, single-string stop, 4 vs 5 sequences). No critical issues. Minor (non-blocking) notes:
Neither blocks merge; the stricter behavior is defensible at the API boundary. Approved (no critical issues found). |
7868dd6 to
26ff1f1
Compare
PierreLeGuen
left a comment
There was a problem hiding this comment.
crates/api/src/models.rs:1097: the new validation only runs when top_logprobs is present, so malformed logprobs by itself still passes ChatCompletionRequest::validate() (for example {"logprobs":"true"} or {"logprobs":1}). Because logprobs is captured by the flattened extra map and forwarded through convert_chat_request_to_service, those invalid requests still reach providers and get provider-dependent failures or behavior instead of the local invalid_request_error this PR is adding. Please validate extra["logprobs"] independently as a boolean (allowing null/unset if desired) and add a regression test for invalid logprobs without top_logprobs.
Local checks run:
cargo test -p api models::tests::test_chat_completion_top_logprobs --libcargo test -p api models::tests::test_chat_completion_stop_array_may_contain_at_most_four_sequences --libcargo fmt --check
|
Addressed in 2a8f764: |
PierreLeGuen
left a comment
There was a problem hiding this comment.
Reviewed the validation changes in crates/api/src/models.rs, the chat completions handler/conversion path, and the provider forwarding path for flattened extras. The previous standalone logprobs gap is covered now, and the stop/top_logprobs validation is exercised by focused tests.
Local checks run:
- cargo test -p api --lib chat_completion_
- cargo fmt --all -- --check
GitHub checks were green for lint, test suite, and security audit on head 2a8f764.
Evrard-Nil
left a comment
There was a problem hiding this comment.
🤖 Reviewed independently at 2a8f764 (built, cargo test -p api --lib → 142 pass incl. the 7 new tests; fmt/clippy clean). Logprobs validation is correct — approving, with one thing to disclose.
The logprobs logic matches the OpenAI chat-completions contract exactly (validated in ChatCompletionRequest::validate(), reading from the #[serde(flatten)] extra map, which is the right source):
top_logprobspresent withoutlogprobs:true→ 400 (covers both absent andfalse) ✓top_logprobsrange 0–20 exact, no off-by-one (!(0..=20).contains(...)) ✓;logprobs:truealone allowed;top_logprobs:0+trueallowed; negatives/21rejected ✓- non-boolean
logprobs/ non-integertop_logprobs→ 400;nulltreated as unset ✓ - Returns 400
invalid_request_errorbefore forwarding (completions.rs:1102), correctly scoped to chat completions only (legacy/v1/completionsint-logprobs path untouched), no-op when unset, streaming + non-streaming both covered. Well tested.
stop-array validation (models.rs:1091-1095: "stop array may contain at most 4 sequences") not mentioned in the title or description. It's OpenAI-spec-correct (stop ≤ 4), but it's a real behavior change riding in a "logprobs" PR: chat requests with 5+ stop strings will now get a 400 where they previously passed through to the backend — and self-hosted vLLM/SGLang do not enforce a 4-stop cap, so this newly rejects requests those engines would have served. Please (a) call it out in the PR description, and (b) confirm enforcing the OpenAI 4-cap is intended for our self-hosted models (vs. letting the backend decide). Clear 400 rather than silent breakage, so not blocking — but reviewers approving "logprobs" shouldn't miss it.
NITs: float top_logprobs like 2.0 is rejected as non-integer (spec-compatible); no unit round-trip test for the valid happy path (covered by the existing e2e). Approving on the logprobs correctness; please address the stop-array disclosure.
Problem
POST /v1/chat/completionsaccepted invalid OpenAI sampling parameters and sent them to inference:top_logprobswithoutlogprobs: truetop_logprobsoutside the OpenAI range0..=20stoparrays with more than 4 sequencesThat masks client bugs and can spend inference on requests that should be rejected at the API boundary.
Fixes #613
Change
top_logprobsfrom the flattened requestextramap.logprobs: truewhentop_logprobsis present.top_logprobsvalues and values outside0..=20.top_logprobs: nullas unset, matching OpenAI-style optional field behavior.stoparrays with more than 4 sequences while preserving single-stringstopand arrays of up to 4 entries.ChatCompletionRequest::validateregression tests for the invalid and valid cases.What Else Is Needed
Tests
cargo test -p api --lib chat_completion_cargo fmt --all -- --checkcargo clippy --all-targets --all-features -- -D warningscargo test --lib --binsNot Run
cargo test -p api chat_completion_also selected e2e tests and failed locally because Postgres was not running (Connection refusedfromdb_setup.rs). The scoped API library test above covers this change without requiring the e2e database.