Hello everyone, I tried to install vLLM today following the installation process from this link, but I got an error.
During the installation, I got this message:
– Found Torch: /home/me/.venv/lib/python3.12/site-packages/torch/lib/libtorch.so
CMake Error at cmake/cpu_extension.cmake:164 (message):
vLLM CPU backend requires AVX512, AVX2, Power9+ ISA, S390X ISA or ARMv8
support.
Call Stack (most recent call first):
CMakeLists.txt:97 (include)
By the way, I would like to use vLLM with the CPU.
The error means your CPU does not support any of the required instruction sets: AVX512, AVX2, Power9+, S390X, or ARMv8. vLLM’s CPU backend requires at least one of these ISAs to build and run; older CPUs without them are not supported for vLLM CPU inference or serving, and there is no workaround documented in the official sources. You can check your CPU’s supported instructions with lscpu | grep -i avx or similar commands.
For more details, see vLLM CPU installation requirements and related GitHub issues, which confirm this requirement and the lack of support for CPUs without these instruction sets. Would you like a step-by-step explanation or help checking your CPU’s capabilities?
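For reference, a quick way to inspect this is something like the following (assuming a Linux machine; the flag names avx2 and avx512f are the usual kernel identifiers for these extensions, not necessarily the exact strings vLLM's CMake build checks):

# Show the CPU model and the full list of instruction-set flags the kernel reports
lscpu | grep -i -E 'model name|flags'

# Narrow it down to the x86 extensions relevant to vLLM's CPU backend
grep -o -E 'avx512f|avx2' /proc/cpuinfo | sort -u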
If lscpu | grep -i avx returns nothing, your CPU does not support AVX, AVX2, or AVX512. vLLM requires at least one of these (or Power9+, S390X, or ARMv8) for CPU inference, so this machine's CPU is not compatible with vLLM's CPU backend, per the official documentation and the error message you received. There is no supported workaround for CPUs lacking these instruction sets.
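To rule out the non-x86 options as well, you can check the machine architecture (again a rough sketch; aarch64, ppc64le, and s390x are the usual Linux names for ARMv8, POWER, and IBM Z systems):

# Print the machine architecture:
#   x86_64  -> needs AVX2 or AVX-512 in the flags above
#   aarch64 -> ARMv8, supported
#   ppc64le -> POWER, needs Power9 or newer
#   s390x   -> IBM Z, supported
uname -m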
No, there is no version of vLLM that supports CPUs lacking the AVX, AVX2, AVX512, Power9+, S390X, or ARMv8 instruction sets. All official vLLM CPU backends require at least one of these ISAs, and there is no documented workaround or legacy release for older CPUs without them, as confirmed in the official documentation and GitHub issues.
Would you like recommendations for alternative LLM inference libraries that support older CPUs?
For older CPUs without AVX/AVX2/AVX512, you can try llama.cpp, which is designed for maximum compatibility and can run on CPUs without advanced vector instructions. It is widely used for LLM inference on legacy hardware and supports various quantization options for efficiency.
Would you like step-by-step instructions for installing and running llama.cpp?
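In case it helps, here is a rough sketch of the usual CPU-only llama.cpp workflow, assuming the current CMake-based build and a GGUF model file you have downloaded separately; the model path below is a placeholder, and the llama.cpp README is the authoritative reference:

# Fetch and build llama.cpp for CPU (no GPU toolkit required)
git clone https://github.com/ggml-org/llama.cpp
cd llama.cpp
cmake -B build
cmake --build build --config Release -j

# Run a quick test prompt against a local GGUF model (placeholder path)
./build/bin/llama-cli -m /path/to/model.gguf -p "Hello"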