When running the vLLM ROCm Docker image and serving Gemma 3, it prints these warnings and then gets stuck. Nothing happens, nothing is downloading. This used to work on 2x 7900 XTX with the latest nightly dev build.
WARNING 03-24 01:58:45 [gpt_oss_triton_kernels_moe.py:59] Using legacy triton_kernels on ROCm
WARNING 03-24 01:58:45 [gpt_oss_triton_kernels_moe.py:59] Using legacy triton_kernels on ROCm
The warning “Using legacy triton_kernels on ROCm” is expected on ROCm platforms and does not, by itself, indicate a fatal error. However, if vLLM hangs with nothing downloading or progressing, that is not normal behavior. This has been reported before with Gemma models on AMD GPUs and is usually related to compatibility or environment setup, not the warning itself. According to the issue “Serving for Google Gemma model failing on AMD MI300X GPUs”, Gemma models may fail to serve on ROCm due to backend or kernel limitations.
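To find out where the server actually stalls (model download, weight loading, or kernel compilation), it can help to raise vLLM's log level before starting the server. A minimal sketch, assuming the `vllm serve` entry point and the Gemma 3 27B instruct checkpoint as an example model; substitute your own model name:

```shell
# Enable vLLM debug logging so the last log line shows which stage hangs
export VLLM_LOGGING_LEVEL=DEBUG

# Example invocation -- replace the model ID with the one you are serving
vllm serve google/gemma-3-27b-it --tensor-parallel-size 2
```

If the last debug message is about downloading, the problem is network/cache related; if it is about kernels or graph capture, it points at the ROCm backend instead.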
If the process hangs before downloading or loading the model, check that your Docker container has the required permissions, that the ROCm drivers are correctly installed, and that the GPUs are visible inside the container. Also make sure you are using a supported ROCm version and vLLM image, as described in the official ROCm installation guide. If the problem persists, it may be a current limitation or bug with Gemma on ROCm; the same behavior has been observed on MI300X and MI210 GPUs. Would you like more detailed troubleshooting steps?
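The visibility checks above can be sketched as follows. The image tag and mount path are placeholders, and the `HSA_OVERRIDE_GFX_VERSION` setting is a commonly reported RDNA3 workaround rather than a guaranteed fix; the device flags are the ones documented for ROCm containers:

```shell
# On the host: confirm both GPUs are healthy before involving Docker
rocm-smi

# Launch the container with the ROCm device nodes and the video group
# (image tag and cache mount are placeholders -- adjust to your setup)
docker run -it \
  --device=/dev/kfd --device=/dev/dri \
  --group-add video \
  --ipc=host \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  rocm/vllm:latest \
  bash

# Inside the container: the same GPUs must still be visible
rocm-smi
rocminfo | grep gfx   # a 7900 XTX should report gfx1100

# RDNA3 cards sometimes need an explicit gfx override
# (a known community workaround, not an official requirement)
export HSA_OVERRIDE_GFX_VERSION=11.0.0
```

If `rocm-smi` works on the host but fails inside the container, the problem is the container's device access, not vLLM or Gemma.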