Need help running vLLM on macOS

Can vLLM run on macOS (without Docker, just like Ollama), or is using Docker a must?

vLLM can be installed and built on macOS for development purposes without Docker, but the binaries will not be compiled, so only imports and development are supported, not actual inference. For real model inference and serving, vLLM is fully supported only on Linux; unlike Ollama, which runs natively on macOS, vLLM generally requires Docker (or another Linux environment) for production use. See the official guidance: “vLLM can fully run only on Linux but for development purposes, you can still build it on other systems (for example, macOS), allowing for imports and a more convenient development environment. The binaries will not be compiled and won’t work on non-Linux systems” (vLLM GPU Installation Docs, vLLM Stable GPU Installation Docs).
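To make that distinction concrete, here is a minimal sketch, assuming you have built vLLM from source on macOS (e.g. with `pip install -e .` in a clone of the repo); the model name is purely illustrative. Importing the package works for development, but attempting inference is expected to fail because the compiled kernels are Linux-only.

```python
# Minimal sketch: macOS development install of vLLM (imports only).
# Assumes vLLM was built from source on macOS, e.g. `pip install -e .`
# from a cloned repo; the model name below is illustrative only.
import vllm

print(vllm.__version__)  # importing the package works for development

try:
    # Actual inference is expected to fail on macOS, since the
    # compiled kernels/binaries are Linux-only.
    from vllm import LLM
    llm = LLM(model="facebook/opt-125m")
    print(llm.generate(["Hello"]))
except Exception as exc:
    print(f"Inference is not supported natively on macOS: {exc}")
```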

Recent ARM CPU backend work and community discussions confirm that while you can build and develop on macOS (including Apple Silicon), full inference is not supported natively; Docker or a Linux VM is required for actual serving (GitHub Issue #10658, GitHub PR #9957). A sketch of that Docker-based pattern is below.
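If you go the Docker route, a common pattern is to run the vLLM OpenAI-compatible server on a Linux host (or in a Linux VM on your Mac) and query it from macOS. Below is a hedged sketch using the `openai` Python client; the `vllm/vllm-openai` image invocation in the comment, the host/port, and the model name are assumptions based on the standard quickstart, so adjust them to your setup.

```python
# Sketch: query a vLLM OpenAI-compatible server from macOS.
# Assumes the server is already running elsewhere (Linux host or VM),
# started along the lines of (plus GPU flags as needed):
#   docker run -p 8000:8000 vllm/vllm-openai:latest --model Qwen/Qwen2.5-1.5B-Instruct
# Host, port, and model name here are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",  # vLLM's OpenAI-compatible endpoint (default port)
    api_key="EMPTY",                      # vLLM does not require a real key by default
)

response = client.chat.completions.create(
    model="Qwen/Qwen2.5-1.5B-Instruct",
    messages=[{"role": "user", "content": "Hello from macOS!"}],
)
print(response.choices[0].message.content)
```

Would you like more detail on development workflows or Docker usage for macOS?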

Hint: Mention @RunLLM in the post for followups.