Logprobs output from vllm bench serve

RunLLM · September 21, 2025, 8:58pm

The --logprobs argument in vllm bench serve sets the number of logprobs to request, but the logprobs that come back are not written to the result file. The benchmarking tool focuses on performance metrics such as throughput and latency, not on saving per-token output detail. This differs from calling the OpenAI-compatible API directly, where logprobs are returned in the response body. According to the documentation and recent issues, there is currently no built-in way to have vllm bench serve save the actual per-token logprobs values in the result file, even when --logprobs is set (source, source).

If you need both speed metrics and logprobs, you would need to modify the benchmarking script to save the logprobs from each response, or run your own script that measures latency/concurrency and saves logprobs, since the current vllm bench serve does not support this out of the box (source). Would you like more detail on how to implement such a modification or alternative workflow?
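As a rough illustration of the "run your own script" route, here is a minimal sketch that times each request and saves the returned logprobs next to the latency. It is not part of vllm bench serve. It assumes a vLLM OpenAI-compatible server is reachable at a URL you supply and that the response follows the completions-style layout (`choices[0].logprobs.tokens` and `token_logprobs`). The function names, the JSON Lines output layout, and the defaults (`max_tokens=32`, `logprobs=5`) are all hypothetical choices made for this example. Requests are sent one at a time and without streaming, so it records end-to-end latency only, not time-to-first-token or behavior under concurrency.

```python
import json
import time
import urllib.request


def build_payload(model, prompt, max_tokens=32, logprobs=5):
    """Request body for an OpenAI-style /v1/completions call."""
    return {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "logprobs": logprobs,
    }


def post_json(url, payload, timeout=120):
    """POST a JSON body and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def extract_logprobs(response):
    """Pair each generated token with its logprob (completions format assumed)."""
    lp = response["choices"][0].get("logprobs") or {}
    return list(zip(lp.get("tokens", []), lp.get("token_logprobs", [])))


def run_benchmark(url, model, prompts, out_path, send=post_json):
    """Send prompts sequentially; write latency and logprobs as JSON Lines."""
    with open(out_path, "w", encoding="utf-8") as f:
        for prompt in prompts:
            start = time.perf_counter()
            response = send(url, build_payload(model, prompt))
            latency_s = time.perf_counter() - start
            record = {
                "prompt": prompt,
                "latency_s": latency_s,
                "logprobs": extract_logprobs(response),
            }
            f.write(json.dumps(record) + "\n")
```

To try it, call something like `run_benchmark("http://localhost:8000/v1/completions", "<your-model>", ["Hello"], "results.jsonl")`, replacing the URL and model name with your own. The `send` parameter exists only so the HTTP call can be swapped out, for example for a stub when testing offline or for a client that adds an API key header.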
