| Topic | Replies | Views | Activity |
|---|---|---|---|
| About the Hardware Support category | 0 | 74 | March 20, 2025 |
| Need help compiling and running on Jetson Thor | 3 | 62 | October 19, 2025 |
| How can vllm-ascend support Qwen3-VL-235B? | 2 | 71 | October 16, 2025 |
| [Questions] Is there a plan to support the rerank and embedding models? | 2 | 46 | October 16, 2025 |
| Using vllm_ascend 0.9.1 reports "Failed to import vllm_ascend_C" | 1 | 30 | October 15, 2025 |
| Can I deploy Qwen2.5-VL on an Ascend 310B chip via the vllm-ascend plugin? | 3 | 42 | October 15, 2025 |
| Does vllm-ascend support async inference? | 2 | 29 | October 15, 2025 |
| MoE config on GH200 | 1 | 35 | October 13, 2025 |
| Support for RTX 6000 Blackwell 96GB card | 3 | 546 | October 8, 2025 |
| Does the Ascend 920B support Qwen2.5-VL? | 2 | 49 | October 2, 2025 |
| RTX Pro 6000 Tensor Parallelism CUBLAS_STATUS_ALLOC_FAILED | 3 | 121 | September 13, 2025 |
| vLLM on RTX 5090: Working GPU setup with torch 2.9.0 cu128 | 16 | 1756 | October 29, 2025 |
| When vLLM starts, the log hangs at the NCCL-related step and does not continue | 15 | 451 | August 27, 2025 |
| Is there any plan to organize the CUDA-only configuration? | 1 | 20 | August 15, 2025 |
| Unable to use vLLM 0.10.1-gptoss on GH200 (aarch64) — source for custom wheel not available? | 3 | 370 | August 15, 2025 |
| vLLM Benchmarking: Why Is GPUDirect RDMA Not Outperforming Standard RDMA in a Pipeline-Parallel Setup? | 1 | 135 | August 14, 2025 |
| Why does quickReduce not need to use system-scope release write operations to update flags? | 0 | 9 | August 13, 2025 |
| Model deployment issues after MindSpeed training completes | 7 | 129 | August 7, 2025 |
| Can vLLM be built for an old GPU (GT 630M)? It may use CUDA 9.1.85 | 1 | 65 | August 4, 2025 |
| How to deploy vllm-ascend in AutoDL's 910B instance? | 7 | 236 | August 2, 2025 |
| GPU Time Slicing | 0 | 76 | July 16, 2025 |
| How to modify the CUDA graph capture sizes via a vLLM plugin | 1 | 225 | July 1, 2025 |
| Can't use Ampere features | 1 | 89 | June 10, 2025 |
| KV Cache quantizing? | 3 | 381 | June 2, 2025 |
| Does vLLM support inference or service startup for small models on CPU? | 3 | 105 | May 30, 2025 |
| Struggling with my dual GPU setup and getting chat template errors | 2 | 86 | May 30, 2025 |
| How to get torch-npu >= 2.5.1.dev20250308 | 3 | 320 | May 28, 2025 |
| Question about vllm-ascend performance on a server with 8×910B3 | 5 | 236 | May 28, 2025 |
| Why is this not working? I corrected it but still | 1 | 501 | May 8, 2025 |
| Can anyone help me? Why is this not working? It used 😭 | 1 | 664 | May 8, 2025 |