Does vLLM automatically inject the schema's information into the prompt?

I use Qwen3, and I don't find any part about structured outputs in the chat template file.
So how does the model know the structured-output information, such as the schema and field descriptions? Does vLLM inject it into the prompt after the chat template takes effect, or is it purely FSM-based decoding?

vLLM does not require the chat template to contain the structured-output schema or descriptions. When you request structured outputs (e.g., via a JSON schema, regex, grammar, or structural tags), vLLM enforces the format with constrained (FSM-based) decoding; it does not inject the schema or descriptions into the prompt after the chat template is applied. The schema is used internally to constrain the model's output tokens so the generated text matches the requested structure, regardless of the chat template's content. For best results you may still want to include format instructions in your prompt, but the enforcement itself is handled by the decoding backend, not by prompt injection or chat template modification. See the Structured Outputs documentation and example usage.
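To make the FSM idea concrete, here is a minimal, self-contained sketch of token masking. This is not vLLM's actual implementation (its backends compile the schema into an efficient grammar automaton over the real tokenizer vocabulary); the toy vocabulary, the `[0-9]+` constraint, and the function names are all illustrative assumptions.

```python
# Toy vocabulary of multi-character "tokens" (assumption: real tokenizers
# have tens of thousands of entries; the masking principle is the same).
VOCAB = ["12", "3", "a", "4b", "99", "hello"]

def is_valid_prefix(text: str) -> bool:
    """True if `text` could still grow into a string matching [0-9]+."""
    return text == "" or text.isdigit()

def allowed_tokens(generated: str) -> list[str]:
    """Mask the vocabulary: keep only tokens that preserve validity.
    This per-step filtering is what constrained decoding does -- note
    that the schema/pattern never enters the prompt."""
    return [t for t in VOCAB if is_valid_prefix(generated + t)]

print(allowed_tokens(""))  # only the digit tokens survive the mask
```

At every decoding step, the model's logits for disallowed tokens are set to negative infinity, so the sampled output cannot leave the grammar no matter what the prompt says.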

Would you like more detail on how FSM-based decoding works in vLLM?

Hint: Mention @RunLLM in the post for followups.

@RunLLM
Does that mean I don't have to write descriptions in the schema?
Can the model see only the field names to know what to output?

Correct: the model itself does not see the schema or the field descriptions unless you explicitly include them in the prompt. FSM/constrained decoding enforces the output structure (field names, types), but descriptions in the schema are never shown to the model and therefore do not guide its generation. The structure is enforced; the descriptions are ignored unless you add them to the prompt manually. See the Structured Outputs documentation and example code.

Would you like more detail or examples on how to include descriptions in prompts?
