Hello! When I run the following benchmarking command, no logprobs
output is generated anywhere; I cannot see it in the result file either. This is the case whether or not --use-beam-search
is used. According to the documentation, --logprobs
is a valid option. Would you please let me know where I can find the logprobs
output?
nohup vllm serve Qwen/Qwen2.5-0.5B-Instruct > vllm.log 2>&1 &
vllm bench serve \
--dataset-name custom \
--dataset-path dataset.jsonl \
--num-prompts 1024 \
--logprobs 1 \
--model Qwen/Qwen2.5-0.5B-Instruct \
--save-result --save-detailed \
--result-filename result.json
Please note that I am aware of a couple of ways to generate logprobs
elsewhere. For example, this command works well.
curl -X POST "http://127.0.0.1:8000/v1/completions" \
-H "Content-Type: application/json" \
-d '{
"model": "Qwen/Qwen2.5-0.5B-Instruct",
"prompt": "Hello, how are you?",
"max_tokens": 10,
"logprobs": 1,
}'
However, the curl
command (or the equivalent Python) does not calculate speed metrics or account for aspects like concurrency. That is why I want to use vllm bench serve,
so that I do not have to evaluate each dataset twice.
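In the meantime, one way to get both concurrency-aware timing and logprobs in a single pass is a small client of your own. The sketch below is only an assumption-laden illustration, not vLLM's own benchmarking code: it assumes the server started above is listening on 127.0.0.1:8000 and returns the OpenAI-style /v1/completions response shape; the prompt list and worker count are placeholders you would replace with your dataset.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Assumed endpoint of the server started with `vllm serve` above.
URL = "http://127.0.0.1:8000/v1/completions"

def complete(prompt, max_tokens=10, logprobs=1):
    """Send one /v1/completions request; return (latency_s, token_logprobs)."""
    body = json.dumps({
        "model": "Qwen/Qwen2.5-0.5B-Instruct",
        "prompt": prompt,
        "max_tokens": max_tokens,
        "logprobs": logprobs,
    }).encode()
    req = urllib.request.Request(
        URL, data=body, headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        payload = json.load(resp)
    return time.perf_counter() - start, extract_logprobs(payload)

def extract_logprobs(payload):
    """Pull per-token logprobs out of an OpenAI-style completions response."""
    lp = payload["choices"][0].get("logprobs") or {}
    return lp.get("token_logprobs", [])

def summarize(results):
    """Aggregate (latency, logprobs) pairs into simple speed metrics."""
    latencies = [lat for lat, _ in results]
    return {
        "requests": len(results),
        "mean_latency_s": sum(latencies) / len(latencies),
        "max_latency_s": max(latencies),
    }

if __name__ == "__main__":
    prompts = ["Hello, how are you?"] * 8  # illustrative workload
    with ThreadPoolExecutor(max_workers=4) as pool:  # concurrency level
        results = list(pool.map(complete, prompts))
    print(summarize(results))
    print("logprobs of first request:", results[0][1])
```

This still measures less than vllm bench serve does (no TTFT, no token-level throughput), so it is only a stopgap, but it does collect logprobs and latency from the same requests.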
So, my question is: can I obtain the logprobs
output from vllm bench serve?
I am okay with any approach that calculates both the metrics (including concurrency) and logprobs. Thanks!