@RunLLM A detailed guide for patching vLLM to run FP8 models on the R9700 would be awesome. The latest AMD docker image is rocm/vllm-dev:rocm7.2_navi_ubuntu24.04_py3.12_pytorch_2.9_vllm_0.14.0rc0. Will a more recent vLLM build that includes PRs 29008 and 32962 still need extra steps to enable FP8, or do the FP8 setup steps only apply to a vLLM version contemporary with the original post here?
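For context, here is the usage I'm hoping those PRs unblock — a minimal sketch using vLLM's offline Python API with on-the-fly FP8 quantization. The model name is illustrative, not a confirmed working configuration on the R9700:

```python
# Minimal sketch of the intended FP8 path via vLLM's offline API.
# Assumptions: the checkpoint below is illustrative, and on the R9700
# (RDNA4) this may still require the patches discussed above.
from vllm import LLM, SamplingParams

llm = LLM(
    model="meta-llama/Llama-3.1-8B-Instruct",  # illustrative unquantized model
    quantization="fp8",  # vLLM's on-the-fly FP8 weight quantization
)

params = SamplingParams(temperature=0.7, max_tokens=64)
outputs = llm.generate(["Say hello from an R9700."], params)
print(outputs[0].outputs[0].text)
```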