No “Received request” logs after vLLM v0.10.2

If you are not seeing “Received request” logs after upgrading to vLLM v0.10.2, the server may not be processing incoming requests, or it may be stuck. Possible causes include network misconfiguration, model loading problems, or the server hanging due to resource constraints. Enable more verbose logging with export VLLM_LOGGING_LEVEL=DEBUG and check for errors during startup or request handling. Also verify that your requests are correctly formatted and sent to the right endpoint; according to the available sources, vLLM’s API paths and expected payloads did not change in v0.10.2.

If the server appears healthy but does not process requests, try restarting it, check for out-of-memory errors, and verify that the model finished loading. For deeper debugging, consult the troubleshooting guide and set additional environment variables such as CUDA_LAUNCH_BLOCKING=1 or NCCL_DEBUG=TRACE to identify where the process is stuck. If the issue persists, search recent GitHub issues for similar reports or file a new issue with detailed logs.
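The debugging variables mentioned above have to be exported in the shell that launches the server. A minimal sketch (the model name in the comment is only a placeholder, not from this thread):

```shell
# Verbose vLLM logs: prints per-step engine activity during startup
# and request handling, so you can see where processing stops
export VLLM_LOGGING_LEVEL=DEBUG
# Synchronous CUDA kernel launches: errors surface at the failing
# kernel instead of at a later, unrelated call
export CUDA_LAUNCH_BLOCKING=1
# Verbose NCCL tracing: useful when multi-GPU communication hangs
export NCCL_DEBUG=TRACE

# Then start the server in the same shell, e.g.:
#   vllm serve <your-model>
```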

Seems no log for “Received request” since v0.10.2? @qt2139

vLLM now disables request logging by default; cf. Engine Arguments - vLLM.
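So the logs are not missing because of a bug; request logging just has to be switched back on at launch. A sketch, assuming the flag name shown in recent Engine Arguments docs (older releases used the inverse flag --disable-log-requests instead; the model name is only a placeholder):

```shell
# Re-enable per-request "Received request" log lines.
# Assumption: --enable-log-requests per recent vLLM Engine Arguments
# docs; check the docs for your installed version.
vllm serve <your-model> --enable-log-requests
```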