| Topic | Replies | Views | Activity |
|---|---|---|---|
| About the Model Support category | 0 | 152 | March 20, 2025 |
| GLM 5.1 PP support | 1 | 41 | May 9, 2026 |
| How to extend the context length up to 1,010,000 tokens on Qwen3.5? | 2 | 114 | May 4, 2026 |
| The latest version of vllm is not compatible with local deployment of deepseek-v4(0.20) | 2 | 292 | April 29, 2026 |
| Issues with Voxtral models and omni | 3 | 98 | April 14, 2026 |
| Support for MiniMax-M2.5 | 1 | 75 | April 14, 2026 |
| On 8-card Ascend 910B with vLLM serving Qwen3.5-122B-A10B, the client freezes at 8% progress when running accuracy test, as the server stops receiving new requests after Running reqs and KV Cache fall to 0. | 1 | 77 | April 11, 2026 |
| Any project supported plan for minicpm-o-4.5? | 1 | 51 | March 26, 2026 |
| Trying to run Qwen3.5-397B-A17B-GPTQ-Int4 | 10 | 419 | March 13, 2026 |
| Suggestion to improve inferencing speed | 17 | 598 | March 11, 2026 |
| Critique my vLLM configuration for qwen3-coder-next | 3 | 183 | March 10, 2026 |
| Hosting Qwen 3.5 35B-A3B model | 1 | 1162 | February 25, 2026 |
| Mistral Small 3.2 finetune errors out: There is no module or parameter named 'language_model' in LlamaForCausalLM | 3 | 465 | February 18, 2026 |
| Mistral-small-3.2: Unable to locate consolidated.safetensors.index.json | 1 | 60 | February 8, 2026 |
| How to run Deep Seek OCR 2 in vllm | 1 | 1185 | January 27, 2026 |
| Vllm-omni cannot load z-image-turbo | 3 | 356 | December 27, 2025 |
| Llama 3.3 70B very slow | 5 | 770 | December 11, 2025 |
| Text to speech support with /v1/audio/speech route | 1 | 641 | November 28, 2025 |
| Using InternVL3 to perform OCR tasks yields worse results in vLLM than in LMDeploy | 2 | 75 | November 27, 2025 |
| Serving minimax-m2 | 3 | 446 | November 8, 2025 |
| Disabling reasoning of Qwen3-VL-8B-Thinking per request | 1 | 3372 | October 29, 2025 |
| Zerank - deploying using vllm | 3 | 240 | October 29, 2025 |
| Does vllm-ascend support deepseek-ocr? | 2 | 327 | October 21, 2025 |
| Custom edit on the embedding out | 2 | 60 | October 15, 2025 |
| Which ATTENTION BACKEND for gpt-oss in version 0.11.0? | 1 | 514 | October 4, 2025 |
| Loading non-consolidated Mistral safetensors | 3 | 427 | September 30, 2025 |
| Issue serving gemma3-27b-it | 1 | 569 | September 19, 2025 |
| Progress bar to browser | 0 | 49 | September 11, 2025 |
| Intermittent Service Downtime Issue with Magistral-Small-2506 Model on GPU VM | 1 | 256 | September 3, 2025 |
| GPT OSS in docker container | 1 | 329 | August 19, 2025 |