How to host zeroentropy's zerank-small using vLLM?
To host zeroentropy/zerank-small with vLLM, you can use the OpenAI-compatible server by running the vllm serve command and specifying the model name or path. For example:
vllm serve zeroentropy/zerank-small
This will start the vLLM server and expose OpenAI-compatible endpoints for inference. You can then query the server using the OpenAI API format. For more details, see the Quickstart guide. If you have a local model directory, provide its path instead of the Hugging Face repo name.
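A minimal sketch of checking the running server from Python, assuming the default http://localhost:8000 address (adjust the URL if you passed --host or --port to vllm serve):

```python
# Minimal sketch: confirm the server is up and list the model it serves.
# Assumes the default http://localhost:8000; change the URL if you passed
# --host or --port to vllm serve.
import requests

resp = requests.get("http://localhost:8000/v1/models")
resp.raise_for_status()
for model in resp.json()["data"]:
    print(model["id"])
```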
Would you like more detail on querying the server or advanced configuration?
@RunLLM it's a reranker, are you sure? Is it even supported?
zerank-small is a reranker model, but it is not currently listed among the officially supported reranker/cross-encoder models in vLLM's documentation or support tables; only specific architectures such as BGE, Qwen2, Qwen3, JinaVL, and a few others are confirmed as supported rerankers. Note that vLLM support is keyed to a model's architecture (the architectures field in its config.json) rather than its repo name, so an unlisted model may still load if it uses one of the supported architectures.
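One rough way to check is to read the architectures field from the model's config.json and compare it against what your vLLM build registers. A sketch (assuming huggingface_hub is installed; the ModelRegistry.get_supported_archs() call may differ across vLLM versions, so treat that part as illustrative):

```python
# Rough sketch: inspect the model's declared architecture and compare it
# against the architectures registered by the local vLLM installation.
import json
from huggingface_hub import hf_hub_download

config_path = hf_hub_download("zeroentropy/zerank-small", "config.json")
with open(config_path) as f:
    architectures = json.load(f).get("architectures", [])
print("Model declares:", architectures)

try:
    # Available in recent vLLM versions; the exact API may vary.
    from vllm import ModelRegistry
    supported = set(ModelRegistry.get_supported_archs())
    print("Recognized by this vLLM build:", [a for a in architectures if a in supported])
except Exception:
    print("Compare the names above against vLLM's supported-models page.")
```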
Would you like more detail on how to check or request support for new models?