Which version of vLLM supports the best_of parameter in the OpenAI-compatible API?

Alternatively, I want to obtain n results sorted by seq_prob from high to low.

Right now I set sample_n, but since vLLM performs independent sampling for each completion, the n results often contain duplicates. I need a way to get the top-n most probable sequences instead of repeated samples.

vLLM's `n` parameter in `SamplingParams` draws n independent samples, so the results can contain duplicates. There is currently no built-in way to obtain the top-n most probable unique completions in a single call: vLLM does not return the n most likely sequences ranked by sequence probability with duplicates removed. The `best_of` parameter generates more candidates and returns the n highest-scoring ones, but it is only supported by the vLLM v0 engine, it does not guarantee unique outputs, and its behavior for this use case is not well documented. See the discussion in vLLM issue #639 and the `SamplingParams` docs.

If you need strictly the n most probable unique sequences, you would need to implement that logic outside vLLM: generate a large number of candidates (using `n` or `best_of`), read their cumulative log-probabilities from the `logprobs` output, deduplicate the texts, and keep the top n by probability. This is not natively supported by vLLM as of now.
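A minimal sketch of the deduplicate-and-rank step in pure Python. The `(text, cumulative_logprob)` pairs are assumed to come from a vLLM `RequestOutput` (e.g. `[(o.text, o.cumulative_logprob) for o in request_output.outputs]`); the sample values below are made up for illustration:

```python
def top_n_unique(candidates, n):
    """Deduplicate sampled completions and keep the n most probable.

    `candidates` is a list of (text, cumulative_logprob) pairs; for
    duplicate texts the highest log-probability is kept, then the unique
    completions are sorted from most to least probable.
    """
    best = {}
    for text, logprob in candidates:
        if text not in best or logprob > best[text]:
            best[text] = logprob
    # Higher cumulative logprob == higher sequence probability.
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    return ranked[:n]

# Toy example with invented logprobs: duplicates collapse,
# and the result is ordered by sequence probability.
samples = [
    ("Paris", -0.1),
    ("Paris", -0.2),      # duplicate sample, lower probability
    ("Lyon", -2.3),
    ("Marseille", -3.0),
]
print(top_n_unique(samples, 2))  # [('Paris', -0.1), ('Lyon', -2.3)]
```

In practice you would oversample (set `n` much larger than the number of unique completions you need) before feeding the outputs to a helper like this, since independent sampling gives no guarantee of how many distinct sequences you get back.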
