Hi,
Why could this appear? I tried to run multiple prompts at the same time.
Is this an application-side error?
ValueError: Conversation roles must alternate user/assistant/user/assistant/
This error occurs because vLLM enforces strict alternation between user and assistant roles in the conversation history: after an optional system message, the roles must alternate as user/assistant/user/assistant, etc. If you send two user or two assistant messages in a row, vLLM will raise a ValueError with this message. This is not an application bug, but a requirement of vLLM’s chat template logic for most instruction-tuned models, and is enforced both in offline and OpenAI-compatible serving modes. Some models (like Mistral) are especially strict about this alternation, while others (like Llama) may be more permissive, but vLLM’s default behavior is to require alternation for all models unless a custom chat template is provided or the model’s template allows otherwise.
To resolve this, ensure your message list alternates roles correctly, for example: [system, user, assistant, user, assistant, ...]. If you need more flexible role ordering, vLLM currently does not support it out of the box, but you can override the chat template with the --chat-template CLI option to relax this restriction, though model performance may degrade if the model was not trained for such input patterns. See further details in the linked issues and documentation for examples and workarounds. Would you like more detail or code examples?
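As a minimal illustration, here is a correctly alternating messages list, together with a small checker that mirrors the rule vLLM's chat templates enforce (the check_alternation helper is my own sketch for clarity, not vLLM code):

```python
# Sketch of the alternation rule: after an optional system message,
# roles must go user/assistant/user/assistant. The helper below is
# illustrative only; vLLM performs this check inside its chat logic.

def check_alternation(messages):
    """Return True if roles alternate user/assistant after an optional system message."""
    roles = [m["role"] for m in messages]
    if roles and roles[0] == "system":
        roles = roles[1:]
    expected = ["user", "assistant"]
    return all(r == expected[i % 2] for i, r in enumerate(roles))

ok = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is vLLM?"},
    {"role": "assistant", "content": "A fast LLM inference engine."},
    {"role": "user", "content": "How do I batch prompts?"},
]
bad = ok + [{"role": "user", "content": "Second user message in a row."}]

print(check_alternation(ok))   # True
print(check_alternation(bad))  # False -> vLLM would raise ValueError here
```

Sending the bad list above is exactly the situation that triggers the ValueError in the original question.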
@RunLLM please explain how to send multiple prompts using the LLM interface and the vllm serve interface. The user is trying to send multiple prompts but is inadvertently sending them in the same messages list.
To send multiple prompts with the LLM interface, pass a list of prompt strings to llm.generate(prompts=[...]) for batch inference. Each prompt should be a separate string in the list, not combined in a single conversation/messages list; this enables vLLM to process them in parallel and return outputs in the same order as the input prompts. For chat-style models, use llm.chat() with one conversation per call, not multiple conversations in a single messages list. See the API docs.
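A minimal batch-inference sketch with the offline LLM interface (the model name is only an example, and the actual call needs a GPU and a downloaded model, so it is left commented; the key point is the input shape, a list of separate strings):

```python
# Batch inference sketch: each prompt is its own string in a list,
# NOT turns of a single conversation.
prompts = [
    "Summarize the plot of Hamlet in one sentence.",
    "Write a haiku about GPUs.",
    "Explain KV caching briefly.",
]

def run_batch(prompts, model="facebook/opt-125m"):  # model name is an example
    from vllm import LLM, SamplingParams
    llm = LLM(model=model)
    params = SamplingParams(max_tokens=64)
    outputs = llm.generate(prompts, params)  # prompts are processed in parallel
    # outputs come back in the same order as the input prompts
    return [o.outputs[0].text for o in outputs]

# texts = run_batch(prompts)  # uncomment on a machine with a GPU and vLLM installed
```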
With the vllm serve (OpenAI-compatible) interface, the Completions API supports batched prompts by sending a JSON payload with a “prompt” field as a list of strings. However, the Chat Completions API does not support batching: you must send one conversation (messages list) per request. If you try to send multiple conversations in a single messages list, you’ll get an error. See vLLM Issue #16965 and Quickstart docs.
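To make the Completions-API batching concrete, here is a sketch of the request payload; the server URL and model name are placeholders for your deployment, and the actual HTTP call is left commented since it needs a running server:

```python
# Batched /v1/completions payload sketch: the "prompt" field may be a
# list of strings. /v1/chat/completions has no equivalent -- one
# conversation per request there.
import json

payload = {
    "model": "my-model",          # placeholder: the model your server loaded
    "prompt": [
        "Translate 'hello' to French:",
        "Translate 'hello' to Spanish:",
    ],
    "max_tokens": 32,
}
body = json.dumps(payload)

# With a server running you could post it, e.g.:
# import urllib.request
# req = urllib.request.Request(
#     "http://localhost:8000/v1/completions",  # placeholder URL
#     data=body.encode(), headers={"Content-Type": "application/json"})
# resp = json.load(urllib.request.urlopen(req))
```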
Would you like a code example for either interface?
Explain how to send multiple chat-style prompts using LLM and how to send parallel requests using the OpenAI API
To send multiple chat-style prompts using the LLM interface, call llm.chat() with a list of conversations, where each conversation is a list of messages (role/content dicts). For example, llm.chat([conversation1, conversation2, ...]) will process each conversation as a separate prompt and return a list of outputs in the same order. See the LLM.chat API docs.
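A sketch of that pattern (the model name is an example and the call is left commented, since llm.chat needs a GPU and a loaded model); note each conversation is its own list of role/content dicts:

```python
# Two independent conversations -- each is a separate list, never
# concatenated into one messages list.
conversation1 = [
    {"role": "user", "content": "Name one prime number."},
]
conversation2 = [
    {"role": "system", "content": "Answer in French."},
    {"role": "user", "content": "What color is the sky?"},
]

def run_chats(conversations, model="facebook/opt-125m"):  # model is an example
    from vllm import LLM
    llm = LLM(model=model)
    outputs = llm.chat(conversations)  # one output per conversation, same order
    return [o.outputs[0].text for o in outputs]

# texts = run_chats([conversation1, conversation2])  # needs a GPU + vLLM
```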
For the OpenAI-compatible API (vllm serve), the /v1/chat/completions endpoint only supports one conversation per request. To send multiple chat prompts in parallel, you must send multiple HTTP requests concurrently (e.g., using threads or asyncio). Each request should contain a single messages list. See vLLM Issue #16965 and the Quickstart docs.
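A sketch of the asyncio approach using the openai client (base_url, api_key, and model name are placeholders for a local vllm serve deployment; the final asyncio.run call is left commented since it needs a live server). Each request carries exactly one messages list:

```python
# Parallel chat requests sketch: N conversations -> N concurrent
# HTTP requests, each with a single messages list.
import asyncio

async def ask(client, model, messages):
    resp = await client.chat.completions.create(model=model, messages=messages)
    return resp.choices[0].message.content

async def run_parallel(conversations, base_url="http://localhost:8000/v1"):
    from openai import AsyncOpenAI
    client = AsyncOpenAI(base_url=base_url, api_key="EMPTY")  # placeholders
    tasks = [ask(client, "my-model", msgs) for msgs in conversations]
    return await asyncio.gather(*tasks)  # results arrive in input order

conversations = [
    [{"role": "user", "content": "Say hi."}],
    [{"role": "user", "content": "Say bye."}],
]
# answers = asyncio.run(run_parallel(conversations))  # needs a running server
```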
Would you like code examples for either method?