Loading non-consolidated Mistral safetensors

You can load a model from a local directory by passing the absolute path to the model folder (the one containing config.json, the model weights, etc.) as the model argument instead of a Hugging Face model ID. This works as long as the directory structure matches what vLLM expects.
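As a rough sketch of the local-path approach, the snippet below checks a folder for the files mentioned above before handing the path to vLLM. The helper names (missing_model_files, load_local_model) are hypothetical, and the check is an assumption about a typical layout (config.json plus one or more *.safetensors or *.bin files), not vLLM's own validation logic.

```python
from pathlib import Path


def missing_model_files(model_dir: str) -> list[str]:
    """Return a list of problems found in a local model directory (empty if none)."""
    root = Path(model_dir)
    if not root.is_dir():
        return [f"{root} is not a directory"]
    problems = []
    if not (root / "config.json").is_file():
        problems.append("config.json")
    # Weights may be sharded across several files, so accept any match.
    has_weights = any(root.glob("*.safetensors")) or any(root.glob("*.bin"))
    if not has_weights:
        problems.append("weight files (*.safetensors or *.bin)")
    return problems


def load_local_model(model_dir: str):
    """Load a model from an absolute local path instead of a Hub model ID."""
    problems = missing_model_files(model_dir)
    if problems:
        raise FileNotFoundError(f"Cannot load {model_dir}; missing: {problems}")
    # Imported lazily so the directory check above runs without vLLM installed.
    from vllm import LLM
    return LLM(model=str(Path(model_dir).resolve()))
```

Calling load_local_model("/abs/path/to/model") would then behave like passing that path as the model argument directly, with an earlier and more explicit error if the folder is incomplete.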

If your model is not in Hugging Face format, you must first convert it to the expected format (e.g., using save_pretrained from transformers) and then load the resulting directory. Passing an in-memory model object directly is not supported.
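A minimal sketch of the save_pretrained route follows. It assumes transformers can already load the source checkpoint through AutoModelForCausalLM, which may not hold for every non-consolidated Mistral layout. The function name export_to_hf_format and both paths are hypothetical.

```python
from pathlib import Path


def export_to_hf_format(src_dir: str, dst_dir: str) -> Path:
    """Re-save a checkpoint as a Hugging Face-style folder that can be loaded by path."""
    src = Path(src_dir)
    if not src.is_dir():
        raise FileNotFoundError(f"{src} is not a directory")
    # Imported lazily so the path check above runs without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the source folder is something transformers can read.
    model = AutoModelForCausalLM.from_pretrained(src)
    tokenizer = AutoTokenizer.from_pretrained(src)

    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    # safe_serialization=True writes .safetensors weights rather than pickle-based .bin.
    model.save_pretrained(dst, safe_serialization=True)
    tokenizer.save_pretrained(dst)
    return dst.resolve()
```

The returned absolute path is what you would then pass as the model argument.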

Would you like more detail on directory structure or conversion steps?
