Native FP8 WMMA Support for AMD RDNA4 (RX 9070 XT / R9700) in vLLM

jmander11 · February 12, 2026, 5:21am

@RunLLM A detailed guide for patching vLLM to run FP8 models on the R9700 would be awesome. The latest AMD Docker image is rocm/vllm-dev:rocm7.2_navi_ubuntu24.04_py3.12_pytorch_2.9_vllm_0.14.0rc0. Would the steps for adding FP8 support apply to a more recent vLLM version that includes PRs 29008 and 32962, or do they only work with a vLLM version from around the date of the original post here?
