How to get structured outputs in vllm?

Septemberlemon · December 22, 2025, 9:42am

@RunLLM
No, sorry, I said it the other way around.
I mean, if I use code like this:

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    text_format=CalendarEvent,
)

event = response.output_parsed

This goes through the POST /v1/responses endpoint.
And if I use code like this:

from openai import OpenAI
from pydantic import BaseModel

class Car(BaseModel):
    brand: str
    model: str

json_schema = Car.model_json_schema()
client = OpenAI(base_url="http://localhost:8000/v1", api_key="-")
completion = client.chat.completions.create(
    model="Qwen3-4B-Instruct-2507",
    messages=[{"role": "user", "content": "Generate a JSON with the brand and model of a famous car."}],
    extra_body={"structured_outputs": {"json": json_schema}},
)
print(completion.choices[0].message.content)

This goes through the POST /v1/chat/completions endpoint.

The former did not work very well.
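
To make the difference between the two calls concrete, here is a rough sketch of the JSON bodies I would expect each one to send. It uses only the standard library and sends nothing over the network. Assumptions: `CAR_SCHEMA` is a hand-written approximation of what `Car.model_json_schema()` returns; the `text.format` layout follows the OpenAI Responses API request shape, and whether a vLLM server enforces it the same way is exactly what I am unsure about; the helper names (`chat_completions_request`, `responses_request`, `check_reply`) are made up for illustration.

```python
import json

# Hand-written JSON Schema, roughly what Car.model_json_schema() produces (assumption).
CAR_SCHEMA = {
    "type": "object",
    "title": "Car",
    "properties": {
        "brand": {"type": "string", "title": "Brand"},
        "model": {"type": "string", "title": "Model"},
    },
    "required": ["brand", "model"],
}

PROMPT = "Generate a JSON with the brand and model of a famous car."
MODEL = "Qwen3-4B-Instruct-2507"


def chat_completions_request(schema: dict) -> tuple[str, dict]:
    """Body for the second snippet: extra_body keys are merged into the top level."""
    body = {
        "model": MODEL,
        "messages": [{"role": "user", "content": PROMPT}],
        "structured_outputs": {"json": schema},
    }
    return "/v1/chat/completions", body


def responses_request(schema: dict) -> tuple[str, dict]:
    """Body in the Responses API shape: the schema travels under text.format."""
    body = {
        "model": MODEL,
        "input": [{"role": "user", "content": PROMPT}],
        "text": {
            "format": {
                "type": "json_schema",
                "name": schema.get("title", "Output"),
                "schema": schema,
                "strict": True,
            }
        },
    }
    return "/v1/responses", body


def check_reply(raw: str, schema: dict) -> dict:
    """Minimal check: parse the reply and confirm the required keys are present."""
    data = json.loads(raw)
    missing = [key for key in schema["required"] if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


# Print both request bodies side by side for comparison.
for path, body in (chat_completions_request(CAR_SCHEMA), responses_request(CAR_SCHEMA)):
    print("POST", path)
    print(json.dumps(body, indent=2))
```

Running the same `check_reply` on the text returned by each endpoint would at least show whether the schema was respected in one case and not the other. It only checks for required keys, so it is not a full JSON Schema validation.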
