| Topic | Replies | Views | Activity |
|---|---|---|---|
| vllm/vllm-openai v0.9.1 is nearly 30% faster than lmsysorg/sglang:v0.4.7.post, but it stops running every two to three hours | 0 | 170 | June 23, 2025 |
| Gemma 3 prefix caching with multimodal prompts | 4 | 267 | May 22, 2025 |
| Will vLLM follow up on DeepSeek's inference system? | 3 | 565 | May 13, 2025 |
| Add support for GLM-4 series models | 1 | 181 | April 16, 2025 |
| Why does the phi3 implementation in vLLM inherit from llama? | 1 | 59 | April 14, 2025 |
| Does the latest version support deepseek-v3 tool calls? | 0 | 125 | April 12, 2025 |