vLLM install for 5090

Where can I find step-by-step instructions for installing vLLM on an RTX 5090 running Ubuntu 24.04?

Step-by-step instructions for installing vLLM on an RTX 5090 under Ubuntu 24.04 are currently scattered across community discussions and tutorials. The recommended approach is to build vLLM from source against CUDA 12.8 and a recent PyTorch nightly, since official wheels and Docker images may not yet support the 5090 (Blackwell, compute capability 12.0) out of the box. A community guide walks through environment setup, building from source, and running vLLM on the 5090; see the sketch below. You may also need to update NCCL to version 2.26.5 or newer for multi-GPU support, as discussed in a related issue thread.
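A minimal sketch of the from-source build, assuming a fresh virtual environment, a driver stack that supports CUDA 12.8, and PyTorch's cu128 nightly index. The index URL, the requirements file path, and the `use_existing_torch.py` step are taken from vLLM's build-from-source docs but can shift between releases, so verify them against the current repo before running:

```bash
# Assumes: NVIDIA driver with CUDA 12.8 support is already installed.
python3 -m venv ~/vllm-env && source ~/vllm-env/bin/activate

# PyTorch nightly built against CUDA 12.8 (index URL may change; check first).
pip install --pre torch --index-url https://download.pytorch.org/whl/nightly/cu128

# Newer NCCL for multi-GPU support on Blackwell (version floor from the issue thread).
pip install -U "nvidia-nccl-cu12>=2.26.5"

# Build vLLM from source against the torch installed above.
git clone https://github.com/vllm-project/vllm.git
cd vllm
python use_existing_torch.py            # reuse the nightly torch instead of the pinned version
pip install -r requirements/build.txt   # path may differ in older vLLM versions
# Target Blackwell (RTX 5090, compute capability 12.0) explicitly.
TORCH_CUDA_ARCH_LIST="12.0" pip install -e . --no-build-isolation
```

As a quick sanity check after the build, `python -c "import torch; print(torch.cuda.get_device_capability())"` should report `(12, 0)` on a 5090, confirming the extension was compiled for the right architecture.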

If you prefer Docker, you can follow the build instructions and optimization flags shared in an issue comment, which cover building a compatible image and running vLLM with the correct CUDA architecture flags for Blackwell GPUs; a sketch follows. Official 5090 support is improving, but for now building from source or using a custom Docker build is necessary.
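A hedged sketch of the custom Docker build, assuming vLLM's in-tree Dockerfile with its `torch_cuda_arch_list` build argument and `vllm-openai` target (argument names, targets, and the Dockerfile location can differ across versions, and the model name below is only an example):

```bash
# Build from the vLLM repo root, targeting Blackwell (compute capability 12.0).
# The build-arg name comes from vLLM's Dockerfile and may change between releases.
DOCKER_BUILDKIT=1 docker build . \
  --file docker/Dockerfile \
  --target vllm-openai \
  --build-arg torch_cuda_arch_list="12.0" \
  --tag vllm-blackwell:local

# Run the OpenAI-compatible server on the 5090.
docker run --gpus all -p 8000:8000 vllm-blackwell:local \
  --model Qwen/Qwen2.5-7B-Instruct
```

Note that `--gpus all` requires the NVIDIA Container Toolkit to be installed on the host.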

Would you like a step-by-step breakdown of the process?
