@RunLLM
The error log is as follows:
2025-11-24 16:11:04,629 E 561378 590379] core_worker_process.cc:837: Failed to establish connection to the metrics exporter agent. Metrics will not be exported. Exporter agent status: RpcError: Running out of retries to initialize the metrics agent. rpc_code: 14 [repeated 24x across cluster]
(EngineCore_DP7 pid=558751) (RayWorkerWrapper pid=561301) The image processor of type Qwen2VLImageProcessor is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with use_fast=False. Note that this behavior will be extended to all models in a future release.
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] EngineCore failed to start.
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] Traceback (most recent call last):
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 829, in run_engine_core
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 1124, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] super().init(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 606, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] super().init(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 102, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self.model_executor = executor_class(vllm_config)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/abstract.py”, line 101, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self._init_executor()
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 97, in _init_executor
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self._init_workers_ray(placement_group)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 371, in _init_workers_ray
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self.collective_rpc(“load_model”)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 493, in collective_rpc
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return ray.get(ray_worker_outputs, timeout=timeout)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/auto_init_hook.py”, line 22, in auto_init_wrapper
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return fn(*args, **kwargs)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/client_mode_hook.py”, line 104, in wrapper
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return func(*args, **kwargs)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 2972, in get
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] values, debugger_breakpoint = worker.get_objects(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 1031, in get_objects
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] raise value.as_instanceof_cause()
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] ray.exceptions.RayTaskError(RuntimeError): ray::RayWorkerWrapper.execute_method() (pid=561355, ip=10.168.1.19, actor_id=b88510a5e87f5580878dafb701000000, repr=<vllm.v1.executor.ray_utils.RayWorkerWrapper object at 0x7f91016e5ab0>)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 343, in execute_method
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] raise e
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 332, in execute_method
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return run_method(self, method, args, kwargs)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/serial_utils.py”, line 479, in run_method
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return func(*args, **kwargs)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py”, line 273, in load_model
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py”, line 3276, in load_model
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self.model = model_loader.load_model(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py”, line 49, in load_model
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] model = initialize_model(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/utils.py”, line 55, in initialize_model
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 1237, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] self.visual = Qwen2_5_VisionTransformer(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 692, in init
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] raise RuntimeError(
(EngineCore_DP2 pid=558006) ERROR 11-24 16:11:15 [core.py:842] RuntimeError: Qwen2.5-VL does not support AttentionBackendEnum.FLASHINFER backend now.
(EngineCore_DP2 pid=558006) Process EngineCore_DP2:
(EngineCore_DP2 pid=558006) Traceback (most recent call last):
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/multiprocessing/process.py”, line 314, in _bootstrap
(EngineCore_DP2 pid=558006) self.run()
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/multiprocessing/process.py”, line 108, in run
(EngineCore_DP2 pid=558006) self._target(*self._args, **self._kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 846, in run_engine_core
(EngineCore_DP2 pid=558006) raise e
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 829, in run_engine_core
(EngineCore_DP2 pid=558006) engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 1124, in init
(EngineCore_DP2 pid=558006) super().init(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 606, in init
(EngineCore_DP2 pid=558006) super().init(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 102, in init
(EngineCore_DP2 pid=558006) self.model_executor = executor_class(vllm_config)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/abstract.py”, line 101, in init
(EngineCore_DP2 pid=558006) self._init_executor()
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 97, in _init_executor
(EngineCore_DP2 pid=558006) self._init_workers_ray(placement_group)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 371, in _init_workers_ray
(EngineCore_DP2 pid=558006) self.collective_rpc(“load_model”)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 493, in collective_rpc
(EngineCore_DP2 pid=558006) return ray.get(ray_worker_outputs, timeout=timeout)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/auto_init_hook.py”, line 22, in auto_init_wrapper
(EngineCore_DP2 pid=558006) return fn(*args, **kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/client_mode_hook.py”, line 104, in wrapper
(EngineCore_DP2 pid=558006) return func(*args, **kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 2972, in get
(EngineCore_DP2 pid=558006) values, debugger_breakpoint = worker.get_objects(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 1031, in get_objects
(EngineCore_DP2 pid=558006) raise value.as_instanceof_cause()
(EngineCore_DP2 pid=558006) ray.exceptions.RayTaskError(RuntimeError): ray::RayWorkerWrapper.execute_method() (pid=561355, ip=10.168.1.19, actor_id=b88510a5e87f5580878dafb701000000, repr=<vllm.v1.executor.ray_utils.RayWorkerWrapper object at 0x7f91016e5ab0>)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 343, in execute_method
(EngineCore_DP2 pid=558006) raise e
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 332, in execute_method
(EngineCore_DP2 pid=558006) return run_method(self, method, args, kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/serial_utils.py”, line 479, in run_method
(EngineCore_DP2 pid=558006) return func(*args, **kwargs)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py”, line 273, in load_model
(EngineCore_DP2 pid=558006) self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py”, line 3276, in load_model
(EngineCore_DP2 pid=558006) self.model = model_loader.load_model(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py”, line 49, in load_model
(EngineCore_DP2 pid=558006) model = initialize_model(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/utils.py”, line 55, in initialize_model
(EngineCore_DP2 pid=558006) return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 1237, in init
(EngineCore_DP2 pid=558006) self.visual = Qwen2_5_VisionTransformer(
(EngineCore_DP2 pid=558006) File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 692, in init
(EngineCore_DP2 pid=558006) raise RuntimeError(
(EngineCore_DP2 pid=558006) RuntimeError: Qwen2.5-VL does not support AttentionBackendEnum.FLASHINFER backend now.
(EngineCore_DP2 pid=558006) INFO 11-24 16:11:15 [ray_executor.py:121] Shutting down Ray distributed executor. If you see error log from logging.cc regarding SIGTERM received, please ignore because this is the expected termination process in Ray.
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] Error executing method ‘load_model’. This might cause deadlock in distributed execution.
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] Traceback (most recent call last):
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 332, in execute_method
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] return run_method(self, method, args, kwargs)
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/serial_utils.py”, line 479, in run_method
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] return func(*args, **kwargs)
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py”, line 273, in load_model
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py”, line 3276, in load_model
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] self.model = model_loader.load_model(
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py”, line 49, in load_model
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] model = initialize_model(
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/utils.py”, line 55, in initialize_model
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 1237, in init
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] self.visual = Qwen2_5_VisionTransformer(
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 692, in init
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] raise RuntimeError(
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) ERROR 11-24 16:11:15 [worker_base.py:342] RuntimeError: Qwen2.5-VL does not support AttentionBackendEnum.FLASHINFER backend now.
(EngineCore_DP6 pid=558603) (RayWorkerWrapper pid=561740) Downloading Model from https://www.modelscope.cn to directory: /mnt/workspace/.cache/modelscope/models/Qwen/Qwen2.5-VL-3B-Instruct
(EngineCore_DP6 pid=558603) (RayWorkerWrapper pid=561740) 2025-11-24 16:11:15,559 - modelscope - INFO - Target directory already exists, skipping creation.
(EngineCore_DP6 pid=558603) (pid=561375) [2025-11-24 16:11:04,626 E 561375 590323] core_worker_process.cc:837: Failed to establish connection to the metrics exporter agent. Metrics will not be exported. Exporter agent status: RpcError: Running out of retries to initialize the metrics agent. rpc_code: 14 [repeated 7x across cluster]
(EngineCore_DP2 pid=558006) (RayWorkerWrapper pid=561355) [rank2]:[W1124 16:11:15.215572426 ProcessGroupNCCL.cpp:1524] Warning: WARNING: destroy_process_group() was not called before program exit, which can leak resources. For more info, please see Distributed communication package - torch.distributed — PyTorch 2.9 documentation (function operator())
(EngineCore_DP6 pid=558603) (RayWorkerWrapper pid=561740) The image processor of type Qwen2VLImageProcessor is now loaded as a fast processor by default, even if the model checkpoint was saved with a slow processor. This is a breaking change and may produce slightly different outputs. To continue using the slow processor, instantiate this class with use_fast=False. Note that this behavior will be extended to all models in a future release.
(EngineCore_DP1 pid=557850) (RayWorkerWrapper pid=561366) Downloading Model from https://www.modelscope.cn to directory: /mnt/workspace/.cache/modelscope/models/Qwen/Qwen2.5-VL-3B-Instruct
(EngineCore_DP3 pid=558153) (RayWorkerWrapper pid=561454) dsw-643752-b77cb64b8-b69sf:561454:591574 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP3 pid=558153) (RayWorkerWrapper pid=561454) dsw-643752-b77cb64b8-b69sf:561454:591502 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP1 pid=557850) (RayWorkerWrapper pid=561366) 2025-11-24 16:11:16,439 - modelscope - INFO - Target directory already exists, skipping creation.
(EngineCore_DP1 pid=557850) (pid=561443) [2025-11-24 16:11:04,676 E 561443 590620] core_worker_process.cc:837: Failed to establish connection to the metrics exporter agent. Metrics will not be exported. Exporter agent status: RpcError: Running out of retries to initialize the metrics agent. rpc_code: 14 [repeated 26x across cluster]
(EngineCore_DP1 pid=557850) (RayWorkerWrapper pid=561366) dsw-643752-b77cb64b8-b69sf:561366:591562 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP1 pid=557850) (RayWorkerWrapper pid=561366) dsw-643752-b77cb64b8-b69sf:561366:591510 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP0 pid=557578) (RayWorkerWrapper pid=561527) dsw-643752-b77cb64b8-b69sf:561527:591566 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP0 pid=557578) (RayWorkerWrapper pid=561527) dsw-643752-b77cb64b8-b69sf:561527:591496 [0] NCCL INFO [Service thread] Connection closed by localRank 2
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] Error executing method ‘load_model’. This might cause deadlock in distributed execution.
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] Traceback (most recent call last):
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 332, in execute_method
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] return run_method(self, method, args, kwargs)
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/serial_utils.py”, line 479, in run_method
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] return func(*args, **kwargs)
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py”, line 273, in load_model
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py”, line 3276, in load_model
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] self.model = model_loader.load_model(
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py”, line 49, in load_model
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] model = initialize_model(
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/utils.py”, line 55, in initialize_model
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 1237, in __init__
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] self.visual = Qwen2_5_VisionTransformer(
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 692, in init
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] raise RuntimeError(
(EngineCore_DP4 pid=558301) (RayWorkerWrapper pid=561265) ERROR 11-24 16:11:16 [worker_base.py:342] RuntimeError: Qwen2.5-VL does not support AttentionBackendEnum.FLASHINFER backend now.
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] EngineCore failed to start.
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] Traceback (most recent call last):
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 829, in run_engine_core
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] engine_core = DPEngineCoreProc(*args, **kwargs)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 1124, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] super().init(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 606, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] super().init(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/engine/core.py”, line 102, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self.model_executor = executor_class(vllm_config)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/abstract.py”, line 101, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self._init_executor()
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 97, in _init_executor
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self._init_workers_ray(placement_group)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 371, in _init_workers_ray
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self.collective_rpc(“load_model”)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/executor/ray_executor.py”, line 493, in collective_rpc
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return ray.get(ray_worker_outputs, timeout=timeout)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/auto_init_hook.py”, line 22, in auto_init_wrapper
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return fn(*args, **kwargs)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/client_mode_hook.py”, line 104, in wrapper
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return func(*args, **kwargs)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 2972, in get
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] values, debugger_breakpoint = worker.get_objects(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/ray/_private/worker.py”, line 1031, in get_objects
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] raise value.as_instanceof_cause()
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] ray.exceptions.RayTaskError(RuntimeError): ray::RayWorkerWrapper.execute_method() (pid=561265, ip=10.168.1.19, actor_id=fb144d0dbe310eb1fdb084af01000000, repr=<vllm.v1.executor.ray_utils.RayWorkerWrapper object at 0x7fc8e86f5ab0>)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 343, in execute_method
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] raise e
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/worker_base.py”, line 332, in execute_method
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return run_method(self, method, args, kwargs)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/serial_utils.py”, line 479, in run_method
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return func(*args, **kwargs)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_worker.py”, line 273, in load_model
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self.model_runner.load_model(eep_scale_up=eep_scale_up)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/v1/worker/gpu_model_runner.py”, line 3276, in load_model
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self.model = model_loader.load_model(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/base_loader.py”, line 49, in load_model
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] model = initialize_model(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/model_loader/utils.py”, line 55, in initialize_model
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] return model_class(vllm_config=vllm_config, prefix=prefix)
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 1237, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] self.visual = Qwen2_5_VisionTransformer(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] File “/mnt/data/cpfs/miniconda3/envs/flashinfer/lib/python3.10/site-packages/vllm/model_executor/models/qwen2_5_vl.py”, line 692, in init
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] raise RuntimeError(
(EngineCore_DP4 pid=558301) ERROR 11-24 16:11:16 [core.py:842] RuntimeError: Qwen2.5-VL does not support AttentionBackendEnum.FLASHINFER backend now.