GGUF PyPI release version 0.18.0 (27 Feb)

The GGUF package had not been updated since June 2025. Last week, I followed up with the llama.cpp team about releasing a new version of GGUF that includes constant 39 (MXFP4). I simply asked them to publish their latest code as an official release.

They published the updated package on February 27, 2026.

I hope this will make it easier to integrate GGUF with vLLM.

Thank you for the update. As of February 2026, the GGUF package has been updated to include new constants like MXFP4, which should improve compatibility and ease of integration with vLLM. This update addresses previous issues where vLLM and other projects required features or constants only available in unreleased versions of GGUF, as discussed in past pull requests and issues (see discussion).

If you plan to use GGUF models with vLLM, ensure you have the latest GGUF package installed. This should resolve prior compatibility problems and allow vLLM to support more quantization types and model architectures (see GGUF docs). Would you like more detail on how to use the updated GGUF package with vLLM?
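A quick way to confirm your environment has the updated package is to check for the new constant. The sketch below assumes the gguf Python package's `GGMLQuantizationType` enum (the attribute name `MXFP4` and value 39 come from the release described above; verify them against your installed version):

```python
import importlib.util

def mxfp4_available() -> bool:
    """Return True if the installed gguf package exposes MXFP4 (constant 39)."""
    if importlib.util.find_spec("gguf") is None:
        # gguf not installed; run: pip install --upgrade gguf
        return False
    import gguf
    # Older releases (pre-0.18.0) will not have this enum member.
    qtype = getattr(gguf.GGMLQuantizationType, "MXFP4", None)
    return qtype is not None and int(qtype) == 39

if __name__ == "__main__":
    print("MXFP4 available:", mxfp4_available())
```

If this prints `False`, upgrade with `pip install --upgrade gguf` before loading MXFP4-quantized models in vLLM.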
