How to use prompt (prompt and prompt_name) for EmbeddingGemma using vllm

@RunLLM it’s recommended to include prompt and prompt_name as illustrated in https://ai.google.dev/gemma/docs/embeddinggemma/inference-embeddinggemma-with-sentence-transformers#using_prompts_with_embeddinggemma — how do I use them in vLLM?

To use prompt and prompt_name as in the EmbeddingGemma example, you must apply the same prompt template manually in vLLM, because vLLM does not natively support prompt_name or automatic prompt templating for embedding models. Format your input string according to the template Sentence Transformers uses for the desired task (e.g., “task: search result | query: …”) and pass the formatted string to vLLM’s embedding API or LLM.embed() method. There is no prompt_name parameter in vLLM’s embedding interfaces; manual prompt construction is required for consistency with Sentence Transformers’ outputs, as confirmed in this vLLM issue discussion.
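As a minimal sketch, prepending the template by hand looks like this. The template strings follow the ones shown in the EmbeddingGemma docs; the model name ("google/embeddinggemma-300m") and the set of prompt names here are assumptions, so check them against the model card and your Sentence Transformers config:

```python
# Sketch: manually applying EmbeddingGemma-style prompt templates,
# since vLLM has no prompt_name parameter for embeddings.
PROMPTS = {
    # prompt_name -> template prefix (assumed from the EmbeddingGemma docs)
    "query": "task: search result | query: ",
    "document": "title: none | text: ",
}

def apply_prompt(text: str, prompt_name: str) -> str:
    """Prepend the template, mirroring prompt_name in
    Sentence Transformers' model.encode(..., prompt_name=...)."""
    return PROMPTS[prompt_name] + text

if __name__ == "__main__":
    # vLLM import kept inside the entry point so the helper above
    # can be tried without vLLM installed.
    from vllm import LLM

    # task="embed" selects vLLM's pooling/embedding runner.
    llm = LLM(model="google/embeddinggemma-300m", task="embed")

    outputs = llm.embed([
        apply_prompt("Which planet is known as the Red Planet?", "query"),
    ])
    embedding = outputs[0].outputs.embedding  # list[float]
    print(len(embedding))
```

The same idea applies to the OpenAI-compatible embeddings endpoint: build the prefixed string client-side and send it as the input field.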

Would you like a step-by-step example of how to construct and use such prompts in vLLM?
