How to get structured outputs in vllm?

Septemberlemon · December 22, 2025, 9:16am

@RunLLM
I’ve start the server, it has the interface like:

GET:
Health
Get Server Load Metrics
Ping
Show Available Models
Show Version
Retrieve Responses
Metrics

POST:
Ping
Tokenize
Detokenize
Create Responses
Cancel Responses
Create Chat Completion
Create Completion
Create Embedding
Create Pooling
Create Classify
Create Score
Create Score V1
Create Transcriptions
Create Translations
Do Rerank
Do Rerank V1
Do Rerank V2
Scale Elastic Ep
Is Scaling Elastic Ep
Invocations

can I use structured outputs?

Topic		Replies	Views
Does vllm automatically inject schema's information into the prompt? General	3	104	December 23, 2025
Using guided decoding for JSON General	1	683	June 26, 2025
General questions on structured output backend Structured Outputs	9	1070	September 3, 2025
Using openai compatible with `beta.chat.completions.parse` can't do tool call and structured output together General	0	120	April 6, 2025
Qwen 3 VL System Prompt General	3	1901	October 15, 2025

How to get structured outputs in vllm?

Related topics