Loading non-consolidated Mistral safetensors

RunLLM · September 30, 2025, 4:28pm

You can load a model from a local directory by passing the absolute path to the model folder (containing config.json, the model weights, etc.) as the model argument instead of a Hugging Face model ID. This works as long as the directory structure matches what vLLM expects (docs, docs).
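As a rough sketch of the above, the snippet below checks a local folder for the files mentioned (config.json and weights) and then hands its absolute path to vLLM. The helper names `missing_model_files` and `load_local_model` are made up for illustration, and the assumption that weights are stored as `*.safetensors` or `*.bin` files may not cover every layout vLLM accepts.

```python
import os
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return a list of problems found in a local model folder (empty if none).

    Hypothetical helper, not part of vLLM.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json not found")
    # Assumption: weights are stored as *.safetensors or *.bin files.
    if not any(root.glob("*.safetensors")) and not any(root.glob("*.bin")):
        problems.append("no *.safetensors or *.bin weight files found")
    return problems


def load_local_model(model_dir: str):
    """Validate the folder, then pass its absolute path as the model argument."""
    abs_dir = os.path.abspath(model_dir)
    problems = missing_model_files(abs_dir)
    if problems:
        raise FileNotFoundError("; ".join(problems))
    # Imported lazily so the check above can run without vLLM installed.
    from vllm import LLM
    return LLM(model=abs_dir)
```

Calling `load_local_model("/path/to/model")` fails early with a readable message if the folder is incomplete, rather than partway through engine startup.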

If your model is not in Hugging Face format, you must convert it to the expected format (e.g., using save_pretrained from transformers) before loading. Directly passing an in-memory model object is not supported (docs).
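Since an in-memory model cannot be passed directly, one workaround is to write it to disk first and give vLLM the resulting path. The sketch below assumes `model` and `tokenizer` are transformers objects that expose `save_pretrained`; `save_for_vllm` is a hypothetical helper name, and whether the exported folder loads in vLLM still depends on the architecture being supported.

```python
import os


def save_for_vllm(model, tokenizer, out_dir: str) -> str:
    """Write an in-memory model and tokenizer to disk in Hugging Face format.

    Returns the absolute path, which can then be used as the model argument.
    Hypothetical helper; assumes both objects provide save_pretrained().
    """
    abs_dir = os.path.abspath(out_dir)
    os.makedirs(abs_dir, exist_ok=True)
    # safe_serialization=True asks transformers to write *.safetensors weights.
    model.save_pretrained(abs_dir, safe_serialization=True)
    tokenizer.save_pretrained(abs_dir)
    return abs_dir
```

The returned path can be used wherever a model ID would go, for example `LLM(model=save_for_vllm(model, tokenizer, "./export"))`.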

Would you like more detail on directory structure or conversion steps?

