| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| About the Model Support category | 0 | 110 | March 20, 2025 |
| How to run Deep Seek OCR 2 in vllm | 1 | 419 | January 27, 2026 |
| Vllm-omni cannot load z-image-turbo | 3 | 135 | December 27, 2025 |
| Llama 3.3 70B very slow | 5 | 428 | December 11, 2025 |
| Suggestion to improve inferencing speed | 1 | 173 | December 4, 2025 |
| Text to speech support with /v1/audio/speech route | 1 | 283 | November 28, 2025 |
| Using InternVL3 to perform OCR tasks yields worse results in vLLM than in LMDeploy | 2 | 55 | November 27, 2025 |
| Serving minimax-m2 | 3 | 304 | November 8, 2025 |
| Disabling reasoning of Qwen3-VL-8B-Thinking per request | 1 | 1717 | October 29, 2025 |
| Zerank - deploying using vllm | 3 | 128 | October 29, 2025 |
| Mistral Small 3.2 finetune errors out: There is no module or parameter named 'language_model' in LlamaForCausalLM | 2 | 320 | October 23, 2025 |
| Does vllm-ascend support deepseek-ocr? | 2 | 306 | October 21, 2025 |
| Custom edit on the embedding out | 2 | 54 | October 15, 2025 |
| Which attention backend for gpt-oss in version 0.11.0? | 1 | 402 | October 4, 2025 |
| Loading non-consolidated Mistral safetensors | 3 | 262 | September 30, 2025 |
| Issue serving gemma3-27b-it | 1 | 378 | September 19, 2025 |
| Progress bar to browser | 0 | 32 | September 11, 2025 |
| Intermittent Service Downtime Issue with Magistral-Small-2506 Model on GPU VM | 1 | 183 | September 3, 2025 |
| GPT OSS in docker container | 1 | 281 | August 19, 2025 |
| Why does prefill use normal attention, while decode uses weight absorption in MLA? | 1 | 151 | August 5, 2025 |
| Using vLLM on a HF model architecture modified locally | 1 | 153 | July 7, 2025 |
| The vllm/vllm-openai version 0.9.1 is nearly 30% faster compared to lmsysorg/sglang:v0.4.7.post, but it stops running every two to three hours | 0 | 164 | June 23, 2025 |
| Gemma 3 prefix caching in case of multimodal prompts | 4 | 242 | May 22, 2025 |
| Will vLLM follow-up DeepSeek's inference system | 3 | 540 | May 13, 2025 |
| Add Support for GLM-4 series model | 1 | 160 | April 16, 2025 |
| Why does phi3 implementation in vLLM inherit from llama? | 1 | 48 | April 14, 2025 |
| Does the latest version support deepseek-v3 tool call | 0 | 114 | April 12, 2025 |