环境描述
vllm version: 0.8.3
vllm-flash-attn version: 2.6.2
情况描述
我在使用vllm进行推理的时候,我输入的内容包含system/user prompt等对话历史messages, 以及两张图片,我的图片格式为base64格式,我发现当我输入的base64格式图片长度低于200的时候,vllm会hang住,无法返回我的请求,这是vllm本身的限制吗?
更多详细信息
具体情况的示例,如下图所示:
卡67在起vllm rollout的服务,卡0~5在训练,卡0向卡67进行通讯传入request,发现卡0以及67的利用率均为0,此时已经hang住了:
输入的request格式如下:
可以看到有一个图片<base64_image_length_104>长度为104,导致vllm hang住
{
"request_number": 98,
"timestamp": "2025-07-01 11:05:51",
"request_size_mb": 13.800942420959473,
"process_pid": 84128,
"cuda_devices": "0,1,",
"request_data": {
"infer_requests": [
{
"messages": [
{
"role": "system",
"content": "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. T...nswer> answer here </answer>\n"
},
{
"role": "user",
"content": "In this UI screenshot, I want to perform the command 'open memo app'.\nPlease think step-by-step to convert this high-level command into a concrete UI action.\nIn the <think> s...box>\n"
},
{
"role": "assistant",
"content": "<think> I need to find and open the Memo app, which is not visible on the current screen. I'll need to swipe through the screens to find it.\n</think> <bbox>[1368, 404, 1496, 559] </bbox>"
},
{
"role": "user",
"content": "<image><image>\nI have provided you with two images:\n1. The first image is the complete original UI screenshot\n2. The second image is the cropped area based on the bounding ....at."
}
],
"images": [
"<base64_image_length_5638616>",
"<base64_image_length_32168>"
],
"audios": [],
"videos": [],
"tools": null,
"objects": {}
},
{
"messages": [
{
"role": "system",
"content": "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. T...e </answer>\n"
},
{
"role": "user",
"content": "In this UI screenshot, I want to perform the command 'open memo app'.\nPlease think step-by-step to convert this high-level command into a concrete UI action...\n"
},
{
"role": "assistant",
"content": "<think> I need to find and open the memo app from the home screen. The memo app is typically represented by icons labeled 'Voice Memos' or 'Note'. I will look for the icon that best matches this description.\n</think> <bbox>[1596, 164, 1727, 283] </bbox>"
},
{
"role": "user",
"content": "<image><image>\nI have provided you with two images:\n1. The first image is the complete original UI screenshot\n2. The second image is the cropped area based on the bounding box [1596, 164, 1727, 283] ...\nPlease strictly follow the format."
}
],
"images": [
"<base64_image_length_5638616>",
"<base64_image_length_11324>"
],
"audios": [],
"videos": [],
"tools": null,
"objects": {}
},
{
"messages": [
{
"role": "system",
"content": "A conversation between User and Assistant. The user asks a question, and the Assistant solves it. The...er here </answer>\n"
},
{
"role": "user",
"content": "In this UI screenshot, I want to perform the command 'more settings'.\nPlease think step-by-step to convert this high-level command into a concrete UI action.\...\n"
},
{
"role": "assistant",
"content": "<think> 用户想要打开更多设置,这通常需要在应用的顶部或侧边导航栏中找到设置或更多选项图标。\n</think> <bbox>[1004, 1250, 1012, 1254] </bbox>"
},
{
"role": "user",
"content": "<image><image>\nI have provided you with two images:\n1. The first image is the complete original UI screenshot\n2. The second image is the cropped area based on the bounding box [1004, 1250, 1012, 1254] you identified in the first turn\n\..."
}
],
"images": [
"<base64_image_length_1568512>",
"<base64_image_length_104>"
],
"audios": [],
"videos": [],
"tools": null,
"objects": {}
},
{
"messages": [
{
"role": "system",
"content": "A conversation between User and Assistant. The user asks a question, and the Assistant solv.... here </answer>\n"
},
{
"role": "user",
"content": "In this UI screenshot, I want to perform the command 'more settings'.\nPlease think step-by-step to convert this high-level command into a concrete UI action.\n...\n"
},
{
"role": "assistant",
"content": "<think> To perform the 'more settings' command, I need to access the application's settings menu. T...</bbox>"
},
{
"role": "user",
"content": "<image><image>\nI have provided you with two images:\n1. The first image is the ...."
}
],
"images": [
"<base64_image_length_1568512>",
"<base64_image_length_692>"
],
"audios": [],
"videos": [],
"tools": null,
"objects": {}
}
],
"request_config": {
"max_tokens": 1024,
"temperature": 1.0,
"top_k": 50,
"top_p": 0.9,
"repetition_penalty": 1.0,
"num_beams": 1,
"stop": [],
"seed": null,
"stream": false,
"logprobs": false,
"top_logprobs": null,
"n": 1,
"best_of": null,
"presence_penalty": 0.0,
"frequency_penalty": 0.0,
"length_penalty": 1.0
},
"metrics": null,
"template": null,
"use_tqdm": false,
"adapter_request": null
}
}