Benchmark error

In vLLM version 0.10.1, running the benchmark fails with an “asyncio.exceptions.CancelledError”. The scheduler log shows: HTTP: proxy error: context canceled

The “asyncio.exceptions.CancelledError” during benchmarking in vLLM 0.10.1, together with scheduler logs such as “HTTP: proxy error: context canceled,” typically means the client (or a proxy in front of the server) disconnected or canceled the request before the server finished processing it. This is a known limitation in vLLM: backend request cancellation is not always fully propagated, which can waste resources and sometimes cause memory growth or instability, especially under heavy load or when a proxy sits in front of vLLM. The vLLM team is aware of this, and improvements to ensure proper backend cancellation and resource cleanup are being discussed for future releases. In other words, this is not a bug in your setup but a limitation of vLLM’s current cancellation handling in async and proxy scenarios. See the discussion in vLLM issue #24584.
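
As a workaround on the client side, you can make the benchmark driver less likely to trigger (and more tolerant of) cancellations. The sketch below is only an illustration, not vLLM’s own benchmark code: the endpoint URL and model name are placeholders you would replace with your own. It disables the HTTP client’s total timeout so slow generations are not canceled client-side, and catches `asyncio.CancelledError` so a dropped request is logged instead of crashing the whole run.

```python
# Minimal sketch of a cancellation-tolerant benchmark client.
# Assumptions: a vLLM OpenAI-compatible server at http://localhost:8000 and a
# model named "my-model" (both placeholders, not taken from the original post).
import asyncio
import aiohttp

VLLM_URL = "http://localhost:8000/v1/completions"  # placeholder endpoint
PAYLOAD = {"model": "my-model", "prompt": "Hello", "max_tokens": 128}  # placeholder model


async def send_request(session: aiohttp.ClientSession) -> None:
    try:
        async with session.post(VLLM_URL, json=PAYLOAD) as resp:
            body = await resp.json()
            print(resp.status, body.get("usage"))
    except asyncio.CancelledError:
        # The request was canceled (e.g. the client or a proxy gave up).
        # Log it and re-raise so the event loop can shut down cleanly.
        print("request canceled before completion")
        raise
    except aiohttp.ClientError as exc:
        # Connection resets from a proxy surface here rather than as CancelledError.
        print(f"request failed: {exc}")


async def main() -> None:
    # A generous (here: unlimited) total timeout keeps the HTTP client itself
    # from canceling requests while the server is still generating tokens.
    timeout = aiohttp.ClientTimeout(total=None)
    async with aiohttp.ClientSession(timeout=timeout) as session:
        await asyncio.gather(*(send_request(session) for _ in range(8)))


if __name__ == "__main__":
    asyncio.run(main())
```

If a reverse proxy (e.g. nginx) sits between the benchmark and vLLM, also check its read/idle timeouts, since “context canceled” in the scheduler log often corresponds to the proxy closing the upstream connection first.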

Would you like more detail on mitigation strategies or internal handling of cancellations?
