(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] WorkerProc failed to start.
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] Traceback (most recent call last):
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/multiproc_executor.py", line 461, in worker_main
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] worker = WorkerProc(*args, **kwargs)
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/executor/multiproc_executor.py", line 357, in __init__
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] self.worker.init_device()
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] File "/usr/local/lib/python3.10/dist-packages/vllm/worker/worker_base.py", line 666, in init_device
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] self.worker.init_device() # type: ignore
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] File "/usr/local/lib/python3.10/dist-packages/vllm/v1/worker/gpu_worker.py", line 135, in init_device
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] torch.cuda.set_device(self.device)
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] File "/usr/local/lib/python3.10/dist-packages/torch/cuda/__init__.py", line 478, in set_device
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] torch._C._cuda_setDevice(device)
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] RuntimeError: HIP error: invalid device ordinal
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] HIP kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] For debugging consider passing AMD_SERIALIZE_KERNEL=3
(VllmWorker rank=12 pid=42139) ERROR 11-26 12:03:59 [multiproc_executor.py:487] Compile with TORCH_USE_HIP_DSA to enable device-side assertions.
The error "HIP error: invalid device ordinal" means vLLM tried to select a GPU device index that does not exist or is not visible to the process. The traceback shows it was raised by torch.cuda.set_device(self.device) during init_device in the worker with rank 12, which suggests that worker asked for a device index beyond what the node exposes (for example, a node with 8 GPUs only exposes indices 0 through 7). The usual causes are a mismatch between the requested parallelism (e.g., tensor_parallel_size) and the number of GPUs actually available, or device visibility variables that hide some GPUs: CUDA_VISIBLE_DEVICES, or on AMD/ROCm, HIP_VISIBLE_DEVICES and ROCR_VISIBLE_DEVICES. It can also happen if device IDs are not mapped the way the runtime expects. For ROCm/AMD, check that the number of visible GPUs on each node matches your configuration. See similar issues and troubleshooting steps in this discussion and this one.
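As a rough way to check for this kind of mismatch before launching, the sketch below lists which device-visibility variables are set and which local device indices would be out of range on a node. The helper names (visibility_settings, out_of_range_indices) and the workers_per_node parameter are hypothetical and not part of vLLM; the sketch also assumes each worker on a node selects a device index equal to its local position, which may not match your launch setup.

```python
import os

# Variables that can restrict which GPUs a process sees.
# HIP_VISIBLE_DEVICES and ROCR_VISIBLE_DEVICES are the ROCm-side variables;
# CUDA_VISIBLE_DEVICES is listed too because it is commonly set by launchers.
VISIBILITY_VARS = (
    "HIP_VISIBLE_DEVICES",
    "ROCR_VISIBLE_DEVICES",
    "CUDA_VISIBLE_DEVICES",
)


def visibility_settings(env=None):
    """Return the visibility variables that are set, with their values."""
    env = os.environ if env is None else env
    return {name: env[name] for name in VISIBILITY_VARS if name in env}


def out_of_range_indices(workers_per_node, visible_devices):
    """Return the local device indices that the node cannot satisfy.

    workers_per_node: number of worker processes started on this node
                      (hypothetical parameter; derive it from your own
                      parallelism settings).
    visible_devices:  number of GPUs the runtime reports on this node.
    Assumes worker i on the node selects device index i.
    """
    return [i for i in range(workers_per_node) if i >= visible_devices]
```

To use it, pass the device count reported by the runtime on each node, for example the value of torch.cuda.device_count() (which also works on ROCm builds of PyTorch), as visible_devices. A non-empty result from out_of_range_indices means some workers would request a device the node does not expose, which is the condition that produces "invalid device ordinal".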
Would you like a step-by-step guide to check and fix your device configuration?