Can 7900 XTX do batching?

hmellor · August 12, 2025, 9:52pm

With PCIe cards like the 7900 XTX, you might get better performance from pipeline parallelism than from tensor parallelism, because less data needs to move between the GPUs. End-to-end latency may be higher, though, so you’ll have to experiment.
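To make that comparison concrete, here is a minimal sketch that builds the two `vllm serve` command lines, one per parallelism mode, so you can launch each in turn and benchmark them. It assumes a two-GPU machine; the model name is a placeholder and `build_serve_command` is a hypothetical helper, not part of vLLM.

```python
import shlex


def build_serve_command(model: str, tensor_parallel: int = 1,
                        pipeline_parallel: int = 1) -> list[str]:
    """Build an argv list for `vllm serve` (hypothetical helper, not a vLLM API)."""
    cmd = ["vllm", "serve", model]
    if tensor_parallel > 1:
        # Splits each layer across GPUs; needs all-reduce traffic per layer.
        cmd += ["--tensor-parallel-size", str(tensor_parallel)]
    if pipeline_parallel > 1:
        # Splits the layer stack across GPUs; only activations cross the bus.
        cmd += ["--pipeline-parallel-size", str(pipeline_parallel)]
    return cmd


MODEL = "your-org/your-model"  # placeholder: substitute the model you serve

tp_cmd = build_serve_command(MODEL, tensor_parallel=2)
pp_cmd = build_serve_command(MODEL, pipeline_parallel=2)

# Print shell-ready commands; run one at a time and compare throughput/latency.
print(shlex.join(tp_cmd))
print(shlex.join(pp_cmd))
```

Run the same workload against each configuration; which one wins depends on your model and request pattern.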

Could you share how you’re sending your parallel requests? And what you see in the vLLM logs while they’re processing?
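For reference, one way to send requests in parallel looks roughly like the sketch below. It assumes the server was started with `vllm serve` and is exposing the OpenAI-compatible completions endpoint on the default `localhost:8000`; the model name is a placeholder, and `build_payload`, `post_completion`, and `run_parallel` are hypothetical helper names. It uses only the standard library.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BASE_URL = "http://localhost:8000/v1/completions"  # assumed default address
MODEL = "your-org/your-model"  # placeholder: must match the served model


def build_payload(prompt: str, max_tokens: int = 64) -> dict:
    """Request body for the OpenAI-compatible completions endpoint."""
    return {"model": MODEL, "prompt": prompt, "max_tokens": max_tokens}


def post_completion(payload: dict) -> dict:
    """POST one request and return the decoded JSON response."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())


def run_parallel(prompts, send=post_completion, max_workers: int = 8) -> list:
    """Submit all prompts concurrently; results come back in prompt order."""
    payloads = [build_payload(p) for p in prompts]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(send, payloads))


# Example (requires a running server):
# results = run_parallel([f"Question {i}: why is the sky blue?" for i in range(16)])
```

If your client instead waits for each response before sending the next request, the server never has more than one request to batch, so it is worth checking this first.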
