vLLM Forums

Topic	Replies	Views	Activity
Gguf pypi release version 0.18.0 (27 feb) General	1	145	March 2, 2026
How to run GGUF with vLLM and ROCM General	4	459	March 1, 2026
Following Qwen3.5 Usage Guide on H20, but cannot host Qwen3.5-27B General	4	383	February 28, 2026
BranchContext: CoW filesystem isolation for multi-sample vLLM workflows General	1	26	February 27, 2026
How to serve two vLLM instances using docker? General	3	490	February 26, 2026
Thinking Token limit setting General	11	684	February 26, 2026
Hosting Qwen 3.5 35B-A3B model Model Support	1	1204	February 25, 2026
Vllm-openai DockerHub missing 0.16 tags General	2	117	February 25, 2026
Does vLLM-Ascend support reasoning-parser? General	1	49	February 25, 2026
What is the role of the additional process running on GPU 0 in DP+EP? General	3	49	February 25, 2026
True eager backend General	6	121	February 24, 2026
Mistral Small 3.2 finetune errors out: There is no module or parameter named 'language_model' in LlamaForCausalLM Model Support	3	484	February 18, 2026
Is diffusers.Pipeline's LoRA file supported? General	15	77	February 16, 2026
What is tail ITL in Disaggregated Prefilling? General	6	26	February 16, 2026
Significant speedup observed with long common prefix between v0.11.0 and v0.12.0 General	9	103	February 13, 2026
What does Skip_leading_tokens mean? General	41	89	February 13, 2026
Priority in batch api General	7	345	February 12, 2026
We're Live: OCI Deployment Guide for vLLM Production Stack General	1	83	February 12, 2026
Native FP8 WMMA Support for AMD RDNA4 (RX 9070 XT / R9700) in vLLM General	5	2116	February 12, 2026
Distributed Speculative Decoding using Ray Speculative Decoding	3	127	February 11, 2026
Is the vllm repository currently equipped with nightly testing General	3	52	February 10, 2026
Qwen3-TTS Base model issue General	1	70	February 9, 2026
Mistral-small-3.2: Unable to locate consolidated.safetensors.index.json Model Support	1	76	February 8, 2026
Pre-Built Docker Install General	2	101	February 8, 2026
Does vLLM support shieldgemma? General	1	40	February 6, 2026
vLLM v0.15.1 failing when deployed in AWS General	3	812	February 6, 2026
Error when deploying glm-4.7 (bf16) across three machines General	3	75	February 6, 2026
Memory usage issue encountered during inference General	2	83	February 6, 2026
Error during inference: RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {} General	1	578	February 5, 2026
Shared memory broadcast block not found (No available shared memory broadcast block found in 60 seconds. This typically happens when some processes are hanging or doing some time-consuming work) General	1	1214	February 5, 2026