Fix off-by-one error in vLLM adapter vocab_size calculation #458

Open
dmndxld wants to merge 1 commit into IINemo:main from dmndxld:fix/vllm-vocab-size-off-by-one

dmndxld commented Apr 22, 2026

Problem

In WhiteboxModelvLLM.post_processing() (src/lm_polygraph/model_adapters/whitebox_model_vllm.py:115), the log_prob
matrix is allocated using a vocab_size that can be one too small:

vocab_size = max(
    self.tokenizer.vocab_size, max(self.tokenizer.added_tokens_decoder.keys())
)

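added_tokens_decoder.keys() returns token IDs, not a count, so its maximum is the largest valid index rather than a
size. When the highest added token ID is greater than or equal to tokenizer.vocab_size, the computed vocab_size equals
that ID and the matrix is one entry too small along the vocabulary dimension. For example, Qwen3 has an added token at
ID 151668, so the computed vocab_size is 151668. This causes an IndexError at line 122: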

log_prob[i, top_tokens] = top_values  # IndexError: index 151668 is out of bounds for dimension 0 with size 151668
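
The failure can be reproduced without vLLM or a real tokenizer. The sketch below is illustrative only: `tokenizer` is a hypothetical stand-in object built with `SimpleNamespace`, the IDs are deliberately tiny and not taken from any real model, and a plain Python list stands in for one row of the log_prob matrix.

```python
from types import SimpleNamespace

# Hypothetical stand-in for a Hugging Face tokenizer (not a real model's values).
tokenizer = SimpleNamespace(
    vocab_size=5,                                # base vocabulary: IDs 0..4
    added_tokens_decoder={5: "<a>", 6: "<b>"},   # added tokens: IDs 5 and 6
)

# Original calculation: compares a count (vocab_size) with an ID (max key).
buggy_vocab_size = max(
    tokenizer.vocab_size, max(tokenizer.added_tokens_decoder.keys())
)

# One row of the log-prob matrix, sized with the buggy value.
row = [float("-inf")] * buggy_vocab_size

try:
    row[6] = -0.1  # write the log-prob of the highest added token ID
    error = None
except IndexError as exc:
    error = exc    # ID 6 needs a row of length 7, but the row has length 6
```

Here the highest ID is 6, so seven slots are needed, but the original expression only allocates six.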

Fix

Convert the max added token ID to a size by adding 1:

max_added_token_id = max(self.tokenizer.added_tokens_decoder.keys()) if self.tokenizer.added_tokens_decoder else 0
vocab_size = max(self.tokenizer.vocab_size, max_added_token_id + 1)

The conditional also guards against an empty added_tokens_decoder, for which max() would otherwise raise a ValueError.
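
As a rough check of the corrected calculation, the same logic can be pulled into a standalone function and exercised on small inputs. This is a sketch under assumptions: `compute_vocab_size` is a hypothetical helper name that does not exist in lm_polygraph, and the inputs are toy values rather than real tokenizer data.

```python
def compute_vocab_size(base_vocab_size, added_tokens_decoder):
    """Return a size large enough to index every base and added token ID.

    Hypothetical helper mirroring the two lines of the fix; it is not part
    of the lm_polygraph API.
    """
    # Guard: max() on an empty mapping would raise ValueError.
    max_added_token_id = (
        max(added_tokens_decoder.keys()) if added_tokens_decoder else 0
    )
    # An ID is a zero-based index, so the required size is ID + 1.
    return max(base_vocab_size, max_added_token_id + 1)


# Added tokens extend past the base vocabulary: highest ID 6 needs size 7.
size_extended = compute_vocab_size(5, {5: "<a>", 6: "<b>"})

# No added tokens: fall back to the base vocabulary size.
size_empty = compute_vocab_size(5, {})

# Added token IDs lie inside the base range: base size is already enough.
size_inside = compute_vocab_size(5, {2: "<x>"})
```

With the fallback of 0 for an empty mapping, the second argument to the outer max() is 1, so the base vocabulary size still wins for any non-empty vocabulary.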

Affected models

Any model with at least one added token whose ID is >= tokenizer.vocab_size, such as Qwen3.