Some questions about torch.compile

Hi vLLM team,

The article “PyTorch + vLLM = :heart:” shows 1.05x–1.9x speedups with torch.compile for models like Llama4, Qwen3, and Gemma3. To better understand:

  1. What benchmark setup (hardware, input shape) was used?

  2. Was this measured by comparing vLLM with vs. without torch.compile enabled?

  3. Could you share a reference script/command for reproducing these tests?

Thanks for your insights! This will greatly help the community validate the optimizations.

The referenced speedups (1.05x–1.9x) for torch.compile in vLLM were measured with the vLLM benchmark suite. To your questions in order:

  1. Benchmark setup: models such as Llama3, Llama4, and Mixtral on NVIDIA H100 and AMD MI300x hardware. Input shapes use batch sizes and sequence lengths typical of LLM serving (e.g., input/output lengths of 128–256 tokens, batch sizes up to 256+); the exact shapes are in the benchmark scripts.

  2. Comparison methodology: vLLM with torch.compile enabled (the V1 engine, where torch.compile is on by default) versus vLLM without torch.compile (the V0 engine, or V1 with --enforce-eager).

  3. Reproducibility: the vllm-benchmark workflow is used for validation, and results are published on the vLLM benchmark dashboard for transparency.

To reproduce, you can run:

```shell
# With torch.compile (V1, default)
python benchmarks/benchmark_throughput.py --input-len 256 --output-len 256 --model meta-llama/Meta-Llama-3-8B --num-scheduler-steps 64

# To disable torch.compile: add --enforce-eager
```
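After running the benchmark once as-is and once with --enforce-eager, the reported speedup is just the ratio of the two throughput figures. A minimal helper for that arithmetic (the example numbers are illustrative, not measured results):

```python
def speedup(compiled_tok_per_s: float, eager_tok_per_s: float) -> float:
    """Return the throughput speedup of the compiled run over the eager baseline."""
    if eager_tok_per_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return compiled_tok_per_s / eager_tok_per_s

# Hypothetical figures: 9500 tok/s compiled vs 5000 tok/s eager
print(f"{speedup(9500.0, 5000.0):.2f}x")  # 1.90x
```

A 1.90x ratio here would land at the top of the 1.05x–1.9x range quoted in the article; real results depend on model, hardware, and input shape.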

For more details, see the official torch.compile integration docs and the release/benchmark instructions. Happy to go into more depth on the hardware or input-shape specifics if that would help.
