昇腾920b是否支持通义千问2.5-vl

whwususu · September 28, 2025, 2:17am

vllm-ascend支持通义千问2.5-VL的推理吗

RunLLM · September 28, 2025, 2:17am

vllm-ascend（昇腾NPU版vLLM）已支持Qwen2.5-VL等多模态大模型的推理，但具体兼容性和性能表现需参考vllm-ascend官方仓库说明。若遇到推理速度慢等问题，建议在vllm-ascend仓库提交issue获取支持。根据社区反馈，Qwen2.5-VL-32B-Instruct在2张Ascend 910卡上可运行，但速度约8~9 tokens/s，属于正常范围。如需进一步优化或遇到兼容性问题，建议关注该仓库的更新和FAQ。

如需详细配置和使用方法，请参考vllm-ascend官方文档或在相关issue区提问。相关讨论已建议将问题反馈至vllm-ascend仓库。

Sources:

相关讨论

Hint: Mention @RunLLM in the post for followups.

yikun · October 2, 2025, 10:40am

Please refer to ^

Topic		Replies	Views
我能在Ascend310B芯片上通过vllm-ascend插件部署Qwen2.5-vl吗？ Ascend Support	3	187	October 15, 2025
How can vllm ascend support qwen3-vl-235b? Ascend Support	2	268	October 16, 2025
Vllm-ascend是否支持async推理？ Ascend Support	2	123	October 15, 2025
Can Support Qwen3-VL or Qwen2.5 VL 72B on Vllm-ascend 0.11.0? Ascend Support	2	248	November 20, 2025
Question about vllm-ascend performance on server with 8*910B3 Ascend Support	5	641	May 28, 2025

昇腾920b是否支持通义千问2.5-vl

Related topics