| Topic | Replies | Views | Activity |
| --- | --- | --- | --- |
| About the Hardware Support category | 0 | 97 | March 20, 2025 |
| MoE config on GH200 | 9 | 182 | February 4, 2026 |
| Running NVFP4 Nemotron model on Win11/WSL RTX 5080 + 5070 Ti | 2 | 51 | February 2, 2026 |
| vllm-ascend: issue running quantized qwen2.5_7b | 1 | 12 | February 2, 2026 |
| vLLM on RTX5090: Working GPU setup with torch 2.9.0 cu128 | 18 | 4192 | January 13, 2026 |
| Support for RTX 6000 Blackwell 96GB card | 5 | 3329 | January 5, 2026 |
| Question about performance data collection in vllm-ascend | 1 | 31 | January 5, 2026 |
| How to apply FA4 on B200? | 3 | 139 | December 18, 2025 |
| Generation speed on NPU 310P3 | 3 | 91 | December 2, 2025 |
| RTX PRO 6000 users seek help, LLAMA 4 NVFP4 | 1 | 159 | November 25, 2025 |
| Do we support NPU 310 | 3 | 104 | November 21, 2025 |
| Model deployment issue after training with MindSpeed | 8 | 264 | November 20, 2025 |
| vllm_ascend 0.9.1 reports "Failed to import vllm_ascend_C" | 2 | 150 | November 20, 2025 |
| [Questions] Is there a plan to support the rerank model and embedding model | 3 | 313 | November 20, 2025 |
| Can vLLM Ascend display the model's computation graph? | 2 | 84 | November 20, 2025 |
| When will the next class be released? | 2 | 69 | November 20, 2025 |
| Can Support Qwen3-VL or Qwen2.5 VL 72B on Vllm-ascend 0.11.0? | 2 | 167 | November 20, 2025 |
| Does vllm-ascend support MinerU 2.5? | 2 | 85 | November 20, 2025 |
| RuntimeError: Int8 not supported on SM120. Use FP8 quantization instead, or run on older arch (SM < 100) | 3 | 75 | November 27, 2025 |
| Need help compiling and running on Jetson Thor | 4 | 503 | November 1, 2025 |
| How can vllm ascend support qwen3-vl-235b? | 2 | 195 | October 16, 2025 |
| Can I deploy Qwen2.5-VL on an Ascend 310B chip via the vllm-ascend plugin? | 3 | 121 | October 15, 2025 |
| Does vllm-ascend support async inference? | 2 | 103 | October 15, 2025 |
| Does Ascend 920B support Qwen2.5-VL? | 2 | 103 | October 2, 2025 |
| RTX Pro 6000 Tensor Parallelism CUBLAS_STATUS_ALLOC_FAILED | 3 | 308 | September 13, 2025 |
| vLLM startup log hangs at the NCCL-related section and does not continue | 15 | 1027 | August 27, 2025 |
| Is there any plan to organize the cuda-only configuration | 1 | 36 | August 15, 2025 |
| Unable to use vLLM 0.10.1-gptoss on GH200 (aarch64): source for custom wheel not available? | 3 | 482 | August 15, 2025 |
| vLLM Benchmarking: Why Is GPUDirect RDMA Not Outperforming Standard RDMA in a Pipeline-Parallel Setup? | 1 | 347 | August 14, 2025 |
| Why does quickReduce not need to use system-scope release write operations to update flags? | 0 | 26 | August 13, 2025 |