Two NVIDIA H20-3e GPUs
CUDA 12.6 (NVIDIA driver 560, Ubuntu 22.04)
vLLM version: 0.9.1 (latest)
Model: Qwen 2.5 VL 72B Instruct
When chatting through the vLLM interface, the model sometimes generates non-stop characters such as ’’ ’’ at the end of the output.
It does not happen every time, but the frequency is not low.
Has anyone met the same problem? How can I solve this strange problem?
Yes, this issue has been reported by other users running vLLM with Qwen 2.5 VL 72B on NVIDIA H20-3e cards: chat outputs sometimes contain abnormal or garbled characters and may stop prematurely. The problem is not unique to your setup and has been discussed in the community, but no definitive solution is documented in the official troubleshooting guide or FAQ as of now. Some users have noted that updating their CUDA/cuBLAS libraries (e.g., ensuring only one cuBLAS version >12.3 is present and installing nvidia-cublas-cu12==12.4.5.8) resolved similar issues on H20 cards, though the root cause is not fully explained and may relate to library or driver compatibility rather than vLLM itself.
If you haven't already, check your CUDA/cuBLAS versions and update them as suggested. Also, ensure you are not using safetensors with vLLM for this model, as it may cause issues. If the problem persists, consider searching or posting on the vLLM GitHub issues for further support, as this is a known but unresolved issue for some hardware/model combinations. Would you like more detailed troubleshooting steps or example commands for updating your environment?
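As a starting point, here is a sketch of the version-check and cuBLAS-pinning steps described above. This assumes your CUDA libraries are pip-managed wheels (the `nvidia-cublas-cu12` package); if your cuBLAS comes from the system CUDA toolkit instead, the pip steps will not apply.

```shell
# Show driver version and the CUDA version the driver supports
nvidia-smi

# Show the toolkit's compiler version (if the CUDA toolkit is installed)
nvcc --version

# List pip-managed cuBLAS wheels; more than one version here can cause conflicts
pip list | grep -i cublas

# Pin the cuBLAS wheel version that reportedly resolved similar issues on H20 cards
pip install nvidia-cublas-cu12==12.4.5.8

# Confirm only the pinned version remains installed
pip show nvidia-cublas-cu12
```

If `pip list` shows multiple cuBLAS wheels, uninstall the extras before pinning so only one version >12.3 remains.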
Thank you for the update. If vLLM 0.9.1 is working well with Qwen 2.5 VL 72B on H20-3e and you are no longer experiencing issues, that's great to hear. No further action is needed.
If you encounter any new issues or have questions about vLLM features or troubleshooting, feel free to ask.