Updated rocm/vllm:latest and it no longer works

youlearnit · November 4, 2025, 8:21pm

I am trying to run
docker run -it --dns=192.168.1.1 --network=host --group-add=video --ipc=host --cap-add=SYS_PTRACE --security-opt seccomp=unconfined --device /dev/kfd --device /dev/dri --shm-size=16g -e HUGGING_FACE_HUB_TOKEN="" -e VLLM_SLEEP_WHEN_IDLE=1 -e ROCM_VISIBLE_DEVICES=0 -e HIP_VISIBLE_DEVICES=0 -e HSA_OVERRIDE_GFX_VERSION=11.0.0 -e PYTORCH_ROCM_ARCH="gfx1100" -e VLLM_USE_TRITON_FLASH_ATTN=0 -e GPU_MAX_HW_QUEUES=1 -e NCCL_DEBUG=WARN -e NCCL_IB_DISABLE=1 --restart unless-stopped --name vllm_rocm_gemma-3-27b-it-GPTQ-4b-128g-2 -v /home/ubuntu/vllm_models:/root/.cache/huggingface rocm/vllm:latest vllm serve ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g --host 0.0.0.0 --port 8000 --enforce-eager --served-model-name vllm/gemma-3 --trust-remote-code --dtype bfloat16 --kv-cache-dtype auto --max-model-len 2048 --max-num-seqs 4 --max-num-batched-tokens 2048 --gpu-memory-utilization 0.94 --swap-space 24 --disable-log-requests --disable-log-stats --max-log-len 100

but I am getting the error below with a single 7900 XTX. It worked before I updated the image with docker pull.

(EngineCore_DP0 pid=56) INFO 11-04 20:18:38 [parallel_state.py:1325] rank 0 in world size 1 is assigned as DP rank 0, PP rank 0, TP rank 0, EP rank 0
(EngineCore_DP0 pid=56) Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You’ll still be able to use a slow processor with use_fast=False.
(EngineCore_DP0 pid=56) INFO 11-04 20:18:48 [gpu_model_runner.py:2843] Starting to load model ISTA-DASLab/gemma-3-27b-it-GPTQ-4b-128g…
(EngineCore_DP0 pid=56) INFO 11-04 20:18:49 [base.py:98] Using Transformers backend.
(EngineCore_DP0 pid=56) INFO 11-04 20:18:49 [compressed_tensors_wNa16.py:108] Using ConchLinearKernel for CompressedTensorsWNA16
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] EngineCore failed to start.
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] Traceback (most recent call last):
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 784, in run_engine_core
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 552, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] super().__init__(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 106, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/executor_base.py", line 54, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self._init_executor()
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 48, in _init_executor
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.collective_rpc("load_model")
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 74, in collective_rpc
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 2089, in run_method
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] return func(*args, **kwargs)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 229, in load_model
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 2873, in load_model
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.model = model_loader.load_model(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 49, in load_model
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] model = initialize_model(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 65, in initialize_model
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/decorators.py", line 234, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/multimodal.py", line 293, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] super(SupportsMRoPE, self).__init__(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/causal.py", line 35, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] super(VllmModelForTextGeneration, self).__init__(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 145, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.recursive_replace()
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 278, in recursive_replace
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] _recursive_replace(self.model, prefix="model")
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] [Previous line repeated 3 more times]
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 264, in _recursive_replace
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] new_module = replace_linear_class(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/utils.py", line 127, in replace_linear_class
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] return vllm_linear_cls(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/linear.py", line 362, in __init__
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] self.quant_method.create_weights(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors.py", line 842, in create_weights
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] layer.scheme.create_weights(
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_wNa16.py", line 121, in create_weights
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] assert input_size_per_partition % group_size == 0
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) ERROR 11-04 20:18:49 [core.py:793] AssertionError
(EngineCore_DP0 pid=56) Process EngineCore_DP0:
(EngineCore_DP0 pid=56) Traceback (most recent call last):
(EngineCore_DP0 pid=56) File "/usr/lib/python3.12/multiprocessing/process.py", line 314, in _bootstrap
(EngineCore_DP0 pid=56) self.run()
(EngineCore_DP0 pid=56) File "/usr/lib/python3.12/multiprocessing/process.py", line 108, in run
(EngineCore_DP0 pid=56) self._target(*self._args, **self._kwargs)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 797, in run_engine_core
(EngineCore_DP0 pid=56) raise e
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 784, in run_engine_core
(EngineCore_DP0 pid=56) engine_core = EngineCoreProc(*args, **kwargs)
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 552, in __init__
(EngineCore_DP0 pid=56) super().__init__(
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core.py", line 106, in __init__
(EngineCore_DP0 pid=56) self.model_executor = executor_class(vllm_config)
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/executor/executor_base.py", line 54, in __init__
(EngineCore_DP0 pid=56) self._init_executor()
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 48, in _init_executor
(EngineCore_DP0 pid=56) self.collective_rpc("load_model")
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/executor/uniproc_executor.py", line 74, in collective_rpc
(EngineCore_DP0 pid=56) return [run_method(self.driver_worker, method, args, kwargs)]
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/utils/__init__.py", line 2089, in run_method
(EngineCore_DP0 pid=56) return func(*args, **kwargs)
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_worker.py", line 229, in load_model
(EngineCore_DP0 pid=56) self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/worker/gpu_model_runner.py", line 2873, in load_model
(EngineCore_DP0 pid=56) self.model = model_loader.load_model(
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/base_loader.py", line 49, in load_model
(EngineCore_DP0 pid=56) model = initialize_model(
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/model_loader/utils.py", line 65, in initialize_model
(EngineCore_DP0 pid=56) return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/compilation/decorators.py", line 234, in __init__
(EngineCore_DP0 pid=56) old_init(self, vllm_config=vllm_config, prefix=prefix, **kwargs)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/multimodal.py", line 293, in __init__
(EngineCore_DP0 pid=56) super(SupportsMRoPE, self).__init__(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/causal.py", line 35, in __init__
(EngineCore_DP0 pid=56) super(VllmModelForTextGeneration, self).__init__(
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 145, in __init__
(EngineCore_DP0 pid=56) self.recursive_replace()
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 278, in recursive_replace
(EngineCore_DP0 pid=56) _recursive_replace(self.model, prefix="model")
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 272, in _recursive_replace
(EngineCore_DP0 pid=56) _recursive_replace(child_module, prefix=qual_name)
(EngineCore_DP0 pid=56) [Previous line repeated 3 more times]
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/base.py", line 264, in _recursive_replace
(EngineCore_DP0 pid=56) new_module = replace_linear_class(
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/models/transformers/utils.py", line 127, in replace_linear_class
(EngineCore_DP0 pid=56) return vllm_linear_cls(
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/linear.py", line 362, in __init__
(EngineCore_DP0 pid=56) self.quant_method.create_weights(
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/quantization/compressed_tensors/compressed_tensors.py", line 842, in create_weights
(EngineCore_DP0 pid=56) layer.scheme.create_weights(
(EngineCore_DP0 pid=56) File "/usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/quantization/compressed_tensors/schemes/compressed_tensors_wNa16.py", line 121, in create_weights
(EngineCore_DP0 pid=56) assert input_size_per_partition % group_size == 0
(EngineCore_DP0 pid=56) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(EngineCore_DP0 pid=56) AssertionError
[rank0]:[W1104 20:18:50.537129993 ProcessGroupNCCL.cpp:1522] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see Distributed communication package - torch.distributed — PyTorch 2.9 documentation (function operator())
(APIServer pid=1) Traceback (most recent call last):
(APIServer pid=1) File "/usr/local/bin/vllm", line 7, in <module>
(APIServer pid=1) sys.exit(main())
(APIServer pid=1) ^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/main.py", line 73, in main
(APIServer pid=1) args.dispatch_function(args)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/cli/serve.py", line 62, in cmd
(APIServer pid=1) uvloop.run(run_server(args))
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 96, in run
(APIServer pid=1) return __asyncio.run(
(APIServer pid=1) ^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/asyncio/runners.py", line 195, in run
(APIServer pid=1) return runner.run(main)
(APIServer pid=1) ^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/asyncio/runners.py", line 118, in run
(APIServer pid=1) return self._loop.run_until_complete(task)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "uvloop/loop.pyx", line 1518, in uvloop.loop.Loop.run_until_complete
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/uvloop/__init__.py", line 48, in wrapper
(APIServer pid=1) return await main
(APIServer pid=1) ^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1920, in run_server
(APIServer pid=1) await run_server_worker(listen_address, sock, args, **uvicorn_kwargs)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 1936, in run_server_worker
(APIServer pid=1) async with build_async_engine_client(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 191, in build_async_engine_client
(APIServer pid=1) async with build_async_engine_client_from_engine_args(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 210, in __aenter__
(APIServer pid=1) return await anext(self.gen)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/api_server.py", line 238, in build_async_engine_client_from_engine_args
(APIServer pid=1) async_llm = AsyncLLM.from_vllm_config(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/utils/functools.py", line 116, in inner
(APIServer pid=1) return fn(*args, **kwargs)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 210, in from_vllm_config
(APIServer pid=1) return cls(
(APIServer pid=1) ^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/async_llm.py", line 132, in __init__
(APIServer pid=1) self.engine_core = EngineCoreClient.make_async_mp_client(
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 121, in make_async_mp_client
(APIServer pid=1) return AsyncMPClient(*client_args)
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 807, in __init__
(APIServer pid=1) super().__init__(
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 468, in __init__
(APIServer pid=1) with launch_core_engines(vllm_config, executor_class, log_stats) as (
(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__
(APIServer pid=1) next(self.gen)
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 879, in launch_core_engines
(APIServer pid=1) wait_for_engine_startup(
(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 936, in wait_for_engine_startup
(APIServer pid=1) raise RuntimeError(
(APIServer pid=1) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}
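
For anyone reading the traceback: the statement that actually fails is `assert input_size_per_partition % group_size == 0` in compressed_tensors_wNa16.py, i.e. some linear layer has an input size that is not a multiple of the quantization group size. Below is a minimal Python sketch of what that check means. It assumes a group size of 128 (suggested by the `128g` in the model name, not confirmed by the log), and the layer names and sizes are made up for illustration. `find_misaligned_layers` is a hypothetical helper, not part of vLLM.

```python
def find_misaligned_layers(layer_input_sizes, group_size=128):
    """Return names of layers whose input size is not a multiple of group_size.

    Mirrors the condition in the failing assert:
        input_size_per_partition % group_size == 0
    """
    return [
        name
        for name, input_size in layer_input_sizes.items()
        if input_size % group_size != 0
    ]


# Hypothetical layer input sizes, for illustration only (not read from the model).
example_layers = {
    "text.q_proj": 5376,      # 42 * 128, passes the check
    "text.down_proj": 21504,  # 168 * 128, passes the check
    "vision.fc2": 4304,       # 33 * 128 + 80, would trip the assert
}

misaligned = find_misaligned_layers(example_layers)
print(misaligned)
```

If a real layer in the checkpoint falls into the second category, the new image would hit the assert while loading weights, before any inference runs.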

youlearnit · November 4, 2025, 8:31pm

OK, I fixed it by pulling the previous version of the rocm/vllm Docker image.

So with this:
rocm/vllm:rocm7.0.0_vllm_0.10.2_20251006
everything works with the 7900 XTX, and I can serve, for example, gemma-3-27b-it-GPTQ-4b-128g.

But with the latest:
rocm/vllm:rocm7.0.0_vllm_0.11.1_20251103
there seems to be some kind of memory leak or similar problem.
So it's maybe a problem with vLLM or something else in that image.
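
To compare the two tags above programmatically, here is a small Python sketch. It assumes the tag layout `rocm<version>_vllm_<version>_<YYYYMMDD>`, which is inferred only from the two tags in this post and may not hold for other rocm/vllm tags. `parse_rocm_vllm_tag` is a hypothetical helper name.

```python
import re
from datetime import date

# Assumed tag layout, inferred from the two tags quoted in this thread.
TAG_PATTERN = re.compile(
    r"^rocm(?P<rocm>\d+(?:\.\d+)*)_vllm_(?P<vllm>\d+(?:\.\d+)*)_(?P<built>\d{8})$"
)


def parse_rocm_vllm_tag(tag):
    """Split a tag into ROCm version, vLLM version tuple, and build date."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unrecognized tag layout: {tag!r}")
    built = match["built"]
    return {
        "rocm": match["rocm"],
        "vllm": tuple(int(part) for part in match["vllm"].split(".")),
        "built": date(int(built[:4]), int(built[4:6]), int(built[6:])),
    }


working = parse_rocm_vllm_tag("rocm7.0.0_vllm_0.10.2_20251006")
broken = parse_rocm_vllm_tag("rocm7.0.0_vllm_0.11.1_20251103")

# Same ROCm version in both images; only the vLLM version and build date differ.
print(working["rocm"] == broken["rocm"], working["vllm"], broken["vllm"])
```

Both tags carry the same ROCm version, so the difference between the working and failing images is on the vLLM side or in whatever else changed between the two builds.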
