vllm启动Qwen2.5-72B模型服务,能指定 --tool-call-parser hermes 参数配置吗?
我的启动命令如下:
export VLLM_ATTENTION_BACKEND=FLASHINFER
python3 -m vllm.entrypoints.openai.api_server --host 0.0.0.0 \
--port 8000 \
--model /models/Qwen2.5-72B \
--served-model-name Qwen2.5-72B --trust-remote-code \
--gpu-memory-utilization 0.85 \
-tp 4 \
--rope-scaling '{"rope_type":"yarn","factor":4.0,"original_max_position_embeddings":32768}' \
--max-model-len 60000 \
--tool-call-parser hermes --enable-auto-tool-choice
服务日志有如下报错:
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] Error trying to handle streaming tool call.
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] Traceback (most recent call last):
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 241, in extract_tool_calls_streaming
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] current_tool_call = partial_json_parser.loads(
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] File "/usr/local/lib/python3.12/dist-packages/partial_json_parser/core/api.py", line 15, in parse_json
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] return parser(ensure_json(json_string, allow_partial, use_fast_fix))
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] File "/usr/local/lib/python3.12/dist-packages/partial_json_parser/core/api.py", line 22, in ensure_json
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] head, tail = fix_fast(json_string, allow_partial)
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] File "/usr/local/lib/python3.12/dist-packages/partial_json_parser/core/myelin.py", line 50, in fix_fast
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] _i, _char = stack.pop()
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] ^^^^^^^^^^^
ERROR 07-15 02:17:44 [hermes_tool_parser.py:370] IndexError: pop from empty list
...
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] Error trying to handle streaming tool call.
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] Traceback (most recent call last):
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] File "/usr/local/lib/python3.12/dist-packages/vllm/entrypoints/openai/tool_parsers/hermes_tool_parser.py", line 294, in extract_tool_calls_streaming
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] self.prev_tool_call_arr[self.current_tool_id].get("arguments"))
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] ~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^
ERROR 07-15 05:01:51 [hermes_tool_parser.py:370] IndexError: list index out of range