If you cannot update your CUDA driver, you may be able to use NVIDIA's CUDA forward-compatibility libraries (cuda-compat) as a workaround. This only works on select professional/datacenter NVIDIA GPUs and may not be supported on consumer cards. For Docker, set the environment variable VLLM_ENABLE_CUDA_COMPATIBILITY=1 when running the container (for example, with docker run -e VLLM_ENABLE_CUDA_COMPATIBILITY=1). For non-Docker environments, install the appropriate cuda-compat package (e.g., cuda-compat-12-9), set VLLM_ENABLE_CUDA_COMPATIBILITY=1, and set VLLM_CUDA_COMPATIBILITY_PATH to the directory that contains the compat libraries. If your GPU or driver is too old (e.g., CUDA < 11.8), there is no supported workaround: vLLM requires a recent CUDA toolkit and driver for both prebuilt wheels and source builds. See the official troubleshooting and installation docs linked below.
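For the non-Docker case, the two environment variables can be set from a small launcher script. The following is a minimal sketch, not an official vLLM utility: the helper name build_compat_env is hypothetical, and the compat directory shown in the usage comment (/usr/local/cuda-12.9/compat) is an assumption about where the cuda-compat-12-9 package installs its libraries, so verify the path on your system.

```python
import os
from pathlib import Path


def build_compat_env(compat_dir, base_env=None):
    """Return a copy of the environment with the two compat variables set.

    compat_dir is the directory where the cuda-compat package placed its
    libraries. Its location is an assumption; check it on your system.
    """
    path = Path(compat_dir)
    if not path.is_dir():
        # Fail early instead of letting vLLM start with a bad path.
        raise FileNotFoundError(f"compat directory not found: {path}")

    # Copy so the caller's environment is never modified in place.
    env = dict(os.environ if base_env is None else base_env)
    env["VLLM_ENABLE_CUDA_COMPATIBILITY"] = "1"
    env["VLLM_CUDA_COMPATIBILITY_PATH"] = str(path)
    return env


# Hypothetical usage (the path and model name are placeholders):
# import subprocess
# env = build_compat_env("/usr/local/cuda-12.9/compat")
# subprocess.run(["vllm", "serve", "<model>"], env=env, check=True)
```

Building the environment in a helper and passing it to the child process keeps the variables scoped to the vLLM launch, so other CUDA programs on the same machine are unaffected.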
Would you like more detail or step-by-step instructions for your specific environment?
Sources:
- https://docs.vllm.ai/en/latest/usage/troubleshooting/#cuda-error-the-provided-ptx-was-compiled-with-an-unsupported-toolchain
- https://docs.vllm.ai/en/latest/getting_started/installation/gpu/#running-on-systems-with-older-cuda-drivers
- https://github.com/vllm-project/vllm/issues/1907