How do I enable --data-parallel-size on a single machine when using the vLLM CLI?
To enable data parallelism on a single machine with vLLM, use the --data-parallel-size argument in your vllm serve command. For example, to run with 4 data parallel ranks (requiring 4 GPUs), use:
vllm serve $MODEL --data-parallel-size 4
This will launch 4 engine processes, each on a separate GPU, behind a single API endpoint. You can also combine it with tensor parallelism (e.g., --data-parallel-size=4 --tensor-parallel-size=2 for 8 GPUs). For more details and multi-node options, see Data Parallel Deployment and Parallelism and Scaling.
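To make the combined layout concrete, here is a hedged sketch assuming one host with 8 GPUs (adjust the sizes to your hardware; behavior may vary by vLLM version):

# 4 data-parallel replicas, each sharded across 2 GPUs with tensor
# parallelism: 4 x 2 = 8 GPUs total on this machine
vllm serve $MODEL --data-parallel-size 4 --tensor-parallel-size 2

Data parallelism replicates the whole model to raise throughput, while tensor parallelism splits each replica across GPUs to fit larger models, so the product of the two sizes must not exceed your GPU count.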
Would you like more detail or example multi-node commands?
@qt2139 If you are talking about a single host, you need to use --data-parallel-size-local
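If I read the docs right, that would look something like the sketch below, assuming all 4 ranks run on the one host (flag availability depends on your vLLM version):

# 4 data-parallel ranks total, all of them local to this machine
vllm serve $MODEL --data-parallel-size 4 --data-parallel-size-local 4

On a single host the local size equals the total size; in multi-node setups the local size tells each node how many of the ranks it should run.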
Why do you want to enable data parallelism for a single machine?