Support for V100 (sm_70) on vLLM 0.20

I tried to install vllm==0.20.0 on a server with V100 GPUs, but the installation failed. It seems that this version does not support the sm_70 architecture.

I understand that V100 cannot be supported with PyTorch 2.11 on CUDA 12.8+, but the PyTorch release notes mention that CUDA 12.6 can still be used. I tested whether I could install vLLM with cu126, but it still failed.

Are there any other issues that prevent V100 from working with vLLM 0.20? If not, is it possible to install vLLM 0.20+cu126 on a V100 server?

vLLM 0.20.0+ does not officially support V100 (sm_70) GPUs, even with CUDA 12.6. Recent vLLM releases and their prebuilt wheels have dropped sm_70, in line with the PyTorch 2.11+ and CUDA 12.8+ defaults. The build system and architecture lists for vLLM 0.20.0+ omit sm_70, so attempts to install or run on a V100 fail with missing-kernel-image or unsupported-architecture errors, whether you use CUDA 12.6 or 12.8.
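To see whether a given PyTorch build was compiled for your GPU at all, you can compare the device's compute capability with the build's architecture list. The sketch below is only illustrative: `covers` is a hypothetical helper, not part of vLLM or PyTorch, and the architecture lists in the examples are made up rather than taken from any real wheel. It assumes the usual CUDA compatibility rules (an `sm_XY` binary runs on devices of the same major version with an equal or higher minor version; `compute_XY` PTX can be JIT-compiled for any equal or higher capability) and skips suffixed entries such as `sm_90a`. It checks PyTorch only; vLLM's own compiled kernels have a separate architecture list.

```python
import re


def covers(capability, arch_list):
    """Return True if arch_list contains code usable on a device with
    the given (major, minor) compute capability.

    Hypothetical helper for illustration; not a vLLM or PyTorch API.
    """
    major, minor = capability
    for arch in arch_list:
        m = re.fullmatch(r"(sm|compute)_(\d+)", arch)
        if m is None:
            continue  # skip suffixed variants such as sm_90a
        kind, digits = m.group(1), m.group(2)
        # The last digit is the minor version, the rest is the major version.
        a_major, a_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True  # binary code for the same major version
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True  # PTX that can be JIT-compiled for this device
    return False


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed in this environment")
    else:
        if torch.cuda.is_available():
            cap = torch.cuda.get_device_capability(0)
            archs = torch.cuda.get_arch_list()
            print("device capability:", cap)
            print("build arch list:", archs)
            print("covered by this build:", covers(cap, archs))
        else:
            print("No CUDA device visible to PyTorch")
```

On a V100 the capability is `(7, 0)`, so a build whose list starts at `sm_75` would not be covered.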

The official sources document no workaround for running vLLM 0.20.0+ on V100. The supported options are to use an older vLLM version (≤0.18.x) that still includes sm_70 in its build targets, or to switch to newer GPUs (sm_75+). If you need vLLM 0.20.0+ features, you must upgrade your hardware. Would you like more detail on possible workarounds or version pinning?
