The vLLM version is 0.10.1. The v0 benchmark runs successfully, but with v1 the requests get stuck and the client eventually times out. The benchmark log is below.
Traceback (most recent call last):
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 539, in start
    message, payload = await protocol.read() # type: ignore[union-attr]
                       ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/streams.py", line 680, in read
    await self._waiter
asyncio.exceptions.CancelledError

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/mnt/ysgg1/vllm_split/scripts/benchmark/backend_request_func.py", line 188, in async_request_openai
    async with session.post(url=api_url, json=payload, headers=headers) as response:
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/client.py", line 1517, in __aenter__
    self._resp: _RetType = await self._coro
                           ^^^^^^^^^^^^^^^^
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/client.py", line 786, in _request
    resp = await handler(req)
           ^^^^^^^^^^^^^^^^^^
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/client.py", line 764, in _connect_and_send_request
    await resp.start(conn)
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/client_reqrep.py", line 534, in start
    with self._timer:
         ^^^^^^^^^^^
  File "/opt/ac2/lib/python3.12/site-packages/aiohttp/helpers.py", line 713, in __exit__
    raise asyncio.TimeoutError from exc_val
TimeoutError
[5m54s < 40m35s] Progress: 127/1000 (1 failed) (12.7%), Decoding: 0, Prefilling: 38, TTFT: 0.00 (15779.53), ITL: 21.37 (21.37), Decoding throughput: 0.00 (273.82), Prefilling throughput: 0.00 (67.14)
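For reference, the failing call is the session.post(...) in backend_request_func.py. Below is a minimal sketch of that call pattern (the endpoint URL, payload fields, and timeout value are placeholders, not the actual benchmark script). Note that aiohttp's default ClientTimeout is total=300 s, which would match the failure showing up around the five-to-six-minute mark; raising the timeout only delays the error if the v1 request never produces a first token.

import asyncio
import aiohttp

# Minimal sketch of the client-side call that times out (placeholder names,
# not the benchmark script). aiohttp's default ClientTimeout is total=300 s,
# so a request that never returns a first token is cancelled after ~5 minutes,
# matching the traceback above.
API_URL = "http://localhost:8000/v1/completions"   # placeholder endpoint
HEADERS = {"Content-Type": "application/json"}

async def send_one(payload: dict) -> None:
    # Raise the cap explicitly while debugging; if the request is truly stuck,
    # this only postpones the TimeoutError.
    timeout = aiohttp.ClientTimeout(total=6 * 60 * 60)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        async with session.post(url=API_URL, json=payload, headers=HEADERS) as response:
            async for line in response.content:    # stream the response body
                print(line.decode(errors="ignore"), end="")

if __name__ == "__main__":
    asyncio.run(send_one({
        "model": "test-model",                     # placeholder payload
        "prompt": "hello",
        "max_tokens": 16,
        "stream": True,
    }))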
The scheduler log is below.
finish serving request: deaec2cd-7a00-4d3c-9c6e-c41024ec6b85
connection of request: 289076dc-e996-4cbd-ad6e-8c5ea911e404, scheduler request: e8420178-1460-45d3-9ea0-3c85e27a2f75 closed without finish
finish prefill stage of request[abort] or some wrong with input parameter : e8420178-1460-45d3-9ea0-3c85e27a2f75
finish serving request: e8420178-1460-45d3-9ea0-3c85e27a2f75
http: proxy error: context canceled
The server’s log did not report any errors.
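These scheduler messages look like the downstream effect of the client-side timeout rather than an independent server error: when aiohttp gives up, it closes the connection; "http: proxy error: context canceled" is the standard Go reverse-proxy log line for an inbound request whose context was canceled (typically a client disconnect); and the scheduler then aborts the request during prefill, which is why the server log stays clean. The sketch below is a generic aiohttp streaming handler (not vLLM's code) showing how such a client/proxy disconnect surfaces on the serving side.

import asyncio
from aiohttp import web

# Generic streaming handler (illustrative only): shows why a scheduler can log
# "closed without finish" and abort a request without any server-side error.
# When the client or proxy drops the connection mid-request, the write fails
# (or, with handler cancellation enabled, the handler task is cancelled), and
# the handler's job is simply to abort the in-flight work.
async def generate(request: web.Request) -> web.StreamResponse:
    response = web.StreamResponse()
    await response.prepare(request)
    try:
        for step in range(1000):
            await asyncio.sleep(0.1)               # stand-in for one decode step
            await response.write(f"token {step}\n".encode())
    except (ConnectionResetError, asyncio.CancelledError):
        # Peer went away: abort generation and release resources; not an error.
        print("connection closed without finish; aborting request")
        raise
    return response

app = web.Application()
app.add_routes([web.post("/v1/completions", generate)])

if __name__ == "__main__":
    web.run_app(app, port=8000)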