Features Quantization
| Topic | Replies | Views | Activity |
|---|---|---|---|
| About the Quantization category | 0 | 27 | March 20, 2025 |
| Can GPTQModel quantize GLM-5 from FP16 to INT8? | 9 | 52 | April 24, 2026 |
| GLM 5 / Kimi k2.5 on 4 x RTX 6000 Pro | 1 | 151 | March 22, 2026 |
| Why does the cuda-kernel Marlin run slowly? | 5 | 133 | January 9, 2026 |
| Asking about 6-bit quantization | 1 | 149 | November 11, 2025 |
| A bit of frustration with Quantization | 5 | 666 | October 14, 2025 |
| Support for Deploying 4-bit Fine-Tuned Model with LoRA on vLLM | 13 | 833 | July 30, 2025 |
| MoE quantization | 9 | 1248 | July 2, 2025 |
| Is there a detailed introduction to the two W8A8 quantization methods? | 1 | 194 | June 15, 2025 |
| GGUF quantized models inference support | 0 | 285 | March 25, 2025 |