(RayWorkerWrapper pid=28215, ip=192.168.13.141) WARNING 07-21 12:37:48 [config.py:1633] Possibly too large swap space. 16.00 GiB out of the 31.19 GiB total CPU memory is allocated for the swap space.
Loading safetensors checkpoint shards: 0% Completed | 0/5 [00:00<?, ?it/s]
Loading safetensors checkpoint shards: 40% Completed | 2/5 [00:00<00:00, 6.54it/s]
Loading safetensors checkpoint shards: 60% Completed | 3/5 [00:00<00:00, 5.11it/s]
Loading safetensors checkpoint shards: 80% Completed | 4/5 [00:00<00:00, 4.56it/s]
Loading safetensors checkpoint shards: 100% Completed | 5/5 [00:01<00:00, 4.29it/s]
Loading safetensors checkpoint shards: 100% Completed | 5/5 [00:01<00:00, 4.63it/s]
INFO 07-21 12:37:49 [default_loader.py:272] Loading weights took 1.11 seconds
INFO 07-21 12:37:50 [model_runner.py:1203] Model loading took 3.9972 GiB and 1.294743 seconds
(RayWorkerWrapper pid=28215, ip=192.168.13.141) INFO 07-21 12:37:51 [default_loader.py:272] Loading weights took 2.93 seconds
(pid=73437, ip=192.168.13.143) INFO 07-21 12:37:46 [__init__.py:244] Automatically detected platform cuda. [repeated 3x across cluster] (Ray deduplicates logs by default. Set RAY_DEDUP_LOGS=0 to disable log deduplication, or see https://docs.ray.io/en/master/ray-observability/user-guides/configure-logging.html#log-deduplication for more options.)
(RayWorkerWrapper pid=28215, ip=192.168.13.141) INFO 07-21 12:37:51 [model_runner.py:1203] Model loading took 3.9972 GiB and 3.104790 seconds
(RayWorkerWrapper pid=28215, ip=192.168.13.141) Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`.
You have video processor config saved in `preprocessor.json` file which is deprecated. Video processor configs should be saved in their own `video_preprocessor.json` file. You can rename the file or load and save the processor back which renames it automatically. Loading from `preprocessor.json` will be removed in v5.0.
(RayWorkerWrapper pid=28215, ip=192.168.13.141) You have video processor config saved in `preprocessor.json` file which is deprecated. Video processor configs should be saved in their own `video_preprocessor.json` file. You can rename the file or load and save the processor back which renames it automatically. Loading from `preprocessor.json` will be removed in v5.0.
(RayWorkerWrapper pid=73437, ip=192.168.13.143) /tmp/tmpjxa3f9r3/main.c:5:10: fatal error: Python.h: No such file or directory
(RayWorkerWrapper pid=73437, ip=192.168.13.143) 5 | #include <Python.h>
(RayWorkerWrapper pid=73437, ip=192.168.13.143) | ^~~~~~~~~~
(RayWorkerWrapper pid=73437, ip=192.168.13.143) compilation terminated.
(RayWorkerWrapper pid=73437, ip=192.168.13.143) The original cause of the RayTaskError (<class 'torch._inductor.exc.InductorError'>) isn't serializable: cannot pickle 'frame' object. Overwriting the cause to a RayError.
(RayWorkerWrapper pid=73437, ip=192.168.13.143) Using a slow image processor as `use_fast` is unset and a slow processor was saved with this model. `use_fast=True` will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with `use_fast=False`. [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) You have video processor config saved in `preprocessor.json` file which is deprecated. Video processor configs should be saved in their own `video_preprocessor.json` file. You can rename the file or load and save the processor back which renames it automatically. Loading from `preprocessor.json` will be removed in v5.0. [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] Error executing method 'determine_num_available_blocks'. This might cause deadlock in distributed execution.
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] Traceback (most recent call last):
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/worker/worker_base.py", line 614, in execute_method
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return run_method(self, method, args, kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/utils/__init__.py", line 2736, in run_method
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return func(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return func(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/worker/worker.py", line 256, in determine_num_available_blocks
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.model_runner.profile_run()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return func(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1300, in profile_run
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self._dummy_run(max_num_batched_tokens, max_num_seqs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1426, in _dummy_run
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.execute_model(model_input, kv_caches, intermediate_tensors)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return func(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/worker/model_runner.py", line 1844, in execute_model
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] hidden_or_intermediate_states = model_executable(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self._call_impl(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return forward_call(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1139, in forward
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] inputs_embeds = self.get_input_embeddings_v0(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1071, in get_input_embeddings_v0
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] inputs_embeds = self.get_input_embeddings(input_ids)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/qwen2_5_vl.py", line 1057, in get_input_embeddings
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] inputs_embeds = self.language_model.get_input_embeddings(input_ids)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/qwen2.py", line 469, in get_input_embeddings
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self.model.get_input_embeddings(input_ids)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/models/qwen2.py", line 335, in get_input_embeddings
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self.embed_tokens(input_ids)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1751, in _wrapped_call_impl
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self._call_impl(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/nn/modules/module.py", line 1762, in _call_impl
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return forward_call(*args, **kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/vllm/model_executor/layers/vocab_parallel_embedding.py", line 409, in forward
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] masked_input, input_mask = get_masked_input_and_mask(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_dynamo/eval_frame.py", line 663, in _fn
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] raise e.remove_dynamo_frames() from None # see TORCHDYNAMO_VERBOSE=1
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/compile_fx.py", line 760, in _compile_fx_inner
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] raise InductorError(e, currentframe()).with_traceback(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/compile_fx.py", line 745, in _compile_fx_inner
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] mb_compiled_graph = fx_codegen_and_compile(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/compile_fx.py", line 1295, in fx_codegen_and_compile
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return scheme.codegen_and_compile(gm, example_inputs, inputs_to_check, graph_kwargs)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/compile_fx.py", line 1197, in codegen_and_compile
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] compiled_fn = graph.compile_to_module().call
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/graph.py", line 2083, in compile_to_module
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self._compile_to_module()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/graph.py", line 2091, in _compile_to_module
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.codegen_with_cpp_wrapper() if self.cpp_wrapper else self.codegen()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/graph.py", line 2002, in codegen
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.scheduler.codegen()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/scheduler.py", line 4135, in codegen
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] else self._codegen(self.nodes)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/scheduler.py", line 4264, in _codegen
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.get_backend(device).codegen_node(node)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/codegen/cuda_combined_scheduling.py", line 104, in codegen_node
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self._triton_scheduling.codegen_node(node)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/codegen/simd.py", line 1320, in codegen_node
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return self.codegen_node_schedule(
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/codegen/simd.py", line 1365, in codegen_node_schedule
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] src_code = kernel.codegen_kernel()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/codegen/triton.py", line 3623, in codegen_kernel
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] **self.inductor_meta_common(),
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/_inductor/codegen/triton.py", line 3447, in inductor_meta_common
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] "backend_hash": torch.utils._triton.triton_hash_with_backend(),
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/utils/_triton.py", line 111, in triton_hash_with_backend
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] backend = triton_backend()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/torch/utils/_triton.py", line 103, in triton_backend
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] target = driver.active.get_current_target()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/runtime/driver.py", line 23, in __getattr__
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self._initialize_obj()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self._obj = self._init_fn()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/runtime/driver.py", line 9, in _create_driver
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] return actives[0]()
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/backends/nvidia/driver.py", line 535, in __init__
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] self.utils = CudaUtils() # TODO: make static
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/backends/nvidia/driver.py", line 89, in __init__
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/backends/nvidia/driver.py", line 66, in compile_module_from_src
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] so = _build(name, src_path, tmpdir, library_dirs(), include_dir, libraries)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/runtime/build.py", line 36, in _build
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] subprocess.check_call(cc_cmd, stdout=subprocess.DEVNULL)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] File "/usr/lib/python3.12/subprocess.py", line 413, in check_call
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] raise CalledProcessError(retcode, cmd)
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] torch._inductor.exc.InductorError: CalledProcessError: Command '['/usr/bin/gcc', '/tmp/tmpjxa3f9r3/main.c', '-O3', '-shared', '-fPIC', '-Wno-psabi', '-o', '/tmp/tmpjxa3f9r3/cuda_utils.cpython-312-x86_64-linux-gnu.so', '-lcuda', '-L/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/backends/nvidia/lib', '-L/lib/x86_64-linux-gnu', '-I/home/deepseek-fb/vllm_env/lib/python3.12/site-packages/triton/backends/nvidia/include', '-I/tmp/tmpjxa3f9r3', '-I/usr/include/python3.12']' returned non-zero exit status 1.
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622] Set TORCHDYNAMO_VERBOSE=1 for the internal stack trace (please do this especially if you're reporting a bug to PyTorch). For even more developer context, set TORCH_LOGS="+dynamo"
(RayWorkerWrapper pid=73437, ip=192.168.13.143) ERROR 07-21 12:38:07 [worker_base.py:622]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) INFO 07-21 12:37:47 [cuda.py:311] Cannot use FlashAttention-2 backend for Volta and Turing GPUs. [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) INFO 07-21 12:37:47 [cuda.py:360] Using XFormers backend. [repeated 2x across cluster]
(RayWorkerWrapper pid=64729, ip=192.168.13.142) INFO 07-21 12:37:48 [__init__.py:1152] Found nccl from library libnccl.so.2 [repeated 2x across cluster]
(RayWorkerWrapper pid=64729, ip=192.168.13.142) INFO 07-21 12:37:48 [pynccl.py:70] vLLM is using nccl==2.26.2 [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) WARNING 07-21 12:37:48 [custom_all_reduce.py:85] Custom allreduce is disabled because this process group spans across nodes. [repeated 2x across cluster]
(RayWorkerWrapper pid=28215, ip=192.168.13.141) INFO 07-21 12:37:48 [parallel_state.py:1076] rank 1 in world size 4 is assigned as DP rank 0, PP rank 0, TP rank 1, EP rank 1 [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) INFO 07-21 12:37:48 [model_runner.py:1171] Starting to load model /mnt/nfs/models/Qwen2.5-VL-7B-Instruct/... [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) WARNING 07-21 12:37:48 [config.py:1633] Possibly too large swap space. 16.00 GiB out of the 31.19 GiB total CPU memory is allocated for the swap space. [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) INFO 07-21 12:37:51 [default_loader.py:272] Loading weights took 2.99 seconds [repeated 2x across cluster]
(RayWorkerWrapper pid=73437, ip=192.168.13.143) INFO 07-21 12:37:51 [model_runner.py:1203] Model loading took 3.9972 GiB and 3.164046 seconds [repeated 2x across cluster]
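
Reading the traceback above: the worker on 192.168.13.143 fails during vLLM's memory profiling step (`determine_num_available_blocks`) because torch.compile / Inductor asks Triton to build its small `cuda_utils` C extension with gcc, and that compile aborts with `fatal error: Python.h: No such file or directory`. So that node appears to be missing the CPython development headers that the gcc command needs (the command in the log looks for them under `-I/usr/include/python3.12`); on Debian/Ubuntu systems these typically come from the `python3.12-dev` (or `python3-dev`) package. Below is a minimal diagnostic sketch, not part of the original log, that checks the same two prerequisites (gcc and `Python.h`) on a node; the header lookup mirrors the include path used by the failing gcc invocation, and the script name and wording are illustrative assumptions.

```python
# check_triton_build_prereqs.py (hypothetical helper): verify that a node has
# gcc and the CPython headers Triton needs to compile its cuda_utils module.
import os
import shutil
import sysconfig


def check_build_prereqs() -> bool:
    """Return True if gcc and Python.h are available on this machine."""
    ok = True

    # Triton shells out to a C compiler (gcc in the log's failing command).
    gcc = shutil.which("gcc")
    print(f"gcc: {gcc or 'NOT FOUND'}")
    ok &= gcc is not None

    # posix_prefix points at the base interpreter's include dir
    # (e.g. /usr/include/python3.12), matching the -I flag in the log.
    include_dir = sysconfig.get_paths(scheme="posix_prefix")["include"]
    python_h = os.path.join(include_dir, "Python.h")
    print(f"{python_h}: {'found' if os.path.exists(python_h) else 'MISSING'}")
    ok &= os.path.exists(python_h)

    return ok


if __name__ == "__main__":
    if not check_build_prereqs():
        raise SystemExit(
            "Missing Triton build prerequisites on this node; on Debian/Ubuntu "
            "the headers usually come from python3.12-dev (or python3-dev)."
        )
```

Running this once on each Ray node (or dispatching it through a Ray remote task) should show whether only 192.168.13.143 is missing the headers; after installing them, restarting the Ray workers should let the `profile_run` compile step succeed.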