I am running the latest vLLM Docker image, and sometimes two CPU cores sit at 100% while it is idle.
How can I fix this?
Is there an option like --sleep-on-idle?
To reduce CPU usage when vLLM is idle, set the environment variable VLLM_SLEEP_WHEN_IDLE=1 in your Docker environment. This will lower CPU consumption during inactivity, though it may add a small latency (~100ms) when new requests arrive. The --sleep-on-idle CLI option is not present in the latest official releases; only the environment variable is supported according to the documentation and merged PRs.
Example Docker usage:
docker run -e VLLM_SLEEP_WHEN_IDLE=1 ...
See vLLM environment variables and PR #16226 for details.
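To confirm the setting actually reached the server process, a small check like the following can help. This is a minimal sketch: the function name check_sleep_when_idle is hypothetical, and it assumes the value 1 is used to enable the feature, as in the example above.

```shell
#!/bin/sh
# Hypothetical helper (not part of vLLM): reports whether
# VLLM_SLEEP_WHEN_IDLE is set to 1 in the current environment.
# Assumption: "1" is the enabling value, as in the docker run example.
check_sleep_when_idle() {
  if [ "${VLLM_SLEEP_WHEN_IDLE:-0}" = "1" ]; then
    echo "enabled"
  else
    echo "disabled"
  fi
}

# Print the status for the shell this script runs in.
check_sleep_when_idle
```

For a running container, the same information is available from the host with docker exec <container> printenv VLLM_SLEEP_WHEN_IDLE, where <container> is a placeholder for your container name or ID. If nothing is printed, the variable was not passed to the container.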
Would you like more detail on how this works or how to apply it in your setup?