Why use the devel image to build the final vLLM Docker image?

Using the CUDA “devel” image is necessary for building vLLM from source because it includes development tools such as nvcc (the CUDA compiler), which are absent from the “runtime” image. The “runtime” image is smaller but lacks the toolchain needed to compile vLLM and its dependencies, so using it as the build base breaks the build outright rather than merely shrinking the image. Once built, you could in principle copy only the runtime artifacts into a new image based on the “runtime” image (a multi-stage build), though this is not the default approach in vLLM’s official Dockerfiles. Inference performance is unaffected by the choice of base image as long as all required runtime libraries are present; this is a question of build compatibility, not runtime speed or efficiency. According to this discussion and this issue, the main driver of the large image size is the inclusion of CUDA, PyTorch, and multi-arch support, not the choice of base image alone.
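As a rough sketch of the multi-stage idea mentioned above: compile in a “devel” stage, then copy only the built wheel into a “runtime” stage. The image tags, package names, and paths below are illustrative assumptions, not vLLM’s official Dockerfile.

```dockerfile
# Build stage: the "devel" image ships nvcc and the CUDA build toolchain.
FROM nvidia/cuda:12.4.1-devel-ubuntu22.04 AS build
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip git && rm -rf /var/lib/apt/lists/*
# Build a vLLM wheel from source into /wheels (tag/paths are illustrative).
RUN git clone https://github.com/vllm-project/vllm.git /src \
 && pip3 wheel --no-deps -w /wheels /src

# Runtime stage: the smaller "runtime" image has CUDA libraries but no nvcc.
FROM nvidia/cuda:12.4.1-runtime-ubuntu22.04
RUN apt-get update && apt-get install -y --no-install-recommends \
    python3 python3-pip && rm -rf /var/lib/apt/lists/*
COPY --from=build /wheels /wheels
# Installing the prebuilt wheel needs no compiler in this stage.
RUN pip3 install --no-cache-dir /wheels/*.whl
```

Note that the runtime stage still pulls in PyTorch and CUDA libraries as wheel dependencies, which is why the final image stays large even with this approach.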

Would you like more detail on how to use multi-stage builds to further reduce the image size?
