2 CPU cores at 100% when idle

I'm running the latest vLLM Docker image, and sometimes two CPU cores sit at 100% while the server is idle.

How can I fix this?
Would --sleep-on-idle help?

To reduce CPU usage while vLLM is idle, set the environment variable VLLM_SLEEP_WHEN_IDLE=1 in the container. This lowers CPU consumption during inactivity, at the cost of a small added latency (~100 ms) when a new request arrives. The --sleep-on-idle CLI option is not present in the latest official releases; according to the documentation and merged PRs, only the environment variable is supported.

Example Docker usage:

docker run -e VLLM_SLEEP_WHEN_IDLE=1 ...
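A fuller version of that command might look like the sketch below. Everything except the -e VLLM_SLEEP_WHEN_IDLE=1 flag is an assumption for illustration: the image tag vllm/vllm-openai:latest, the --gpus all flag, the 8000 port mapping, and the model placeholder should all be swapped for whatever your setup already uses. The script only prints the command so it can be reviewed first.

```shell
#!/usr/bin/env sh
# Assumed values, not from the original post: adjust to your own setup.
IMAGE="vllm/vllm-openai:latest"
MODEL="your-org/your-model"   # placeholder, not a real model name

# Build the command as positional parameters so it can be inspected
# before anything is launched.
set -- docker run --rm --gpus all \
  -e VLLM_SLEEP_WHEN_IDLE=1 \
  -p 8000:8000 \
  "$IMAGE" --model "$MODEL"

# Dry run: print the command. Replace `echo` with `exec` to start the container.
echo "$@"
```

Once the container is running, `docker exec <container> printenv VLLM_SLEEP_WHEN_IDLE` should print 1 if the variable was passed through correctly.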

See the vLLM environment variables documentation and PR #16226 for details.

Would you like more detail on how this works or how to apply it in your setup?

Sources: