How to get structured outputs in vLLM?

@RunLLM
Sorry, I had it the other way around.
What I mean is, if I use code like this:

from openai import OpenAI
from pydantic import BaseModel

client = OpenAI()

class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

response = client.responses.parse(
    model="gpt-4o-2024-08-06",
    input=[
        {"role": "system", "content": "Extract the event information."},
        {
            "role": "user",
            "content": "Alice and Bob are going to a science fair on Friday.",
        },
    ],
    text_format=CalendarEvent,
)

event = response.output_parsed

it sends the request to POST /v1/responses,
whereas if I use code like this:

from openai import OpenAI
from pydantic import BaseModel

class Car(BaseModel):
    brand: str
    model: str

json_schema = Car.model_json_schema()
client = OpenAI(base_url="http://localhost:8000/v1", api_key="-")
completion = client.chat.completions.create(
    model="Qwen3-4B-Instruct-2507",
    messages=[{"role": "user", "content": "Generate a JSON with the brand and model of a famous car."}],
    extra_body={"structured_outputs": {"json": json_schema}},
)
print(completion.choices[0].message.content)

it sends the request to POST /v1/chat/completions.

The former (client.responses.parse via POST /v1/responses) did not work very well for me.
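
For comparison, with the second snippet the returned text can be checked against the same Pydantic model on the client side, which gives roughly what `output_parsed` provides in the first snippet. This is only a minimal sketch: `parse_car` is a hypothetical helper name, and `sample` is a made-up string standing in for `completion.choices[0].message.content`, not real server output. It assumes Pydantic v2.

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class Car(BaseModel):
    brand: str
    model: str


def parse_car(raw: str) -> Optional[Car]:
    """Validate raw JSON text against Car; return None if it does not conform."""
    try:
        # model_validate_json parses the JSON and checks the fields in one step.
        return Car.model_validate_json(raw)
    except ValidationError:
        # Raised for malformed JSON as well as for missing or mistyped fields.
        return None


# Hypothetical sample standing in for completion.choices[0].message.content.
sample = '{"brand": "Toyota", "model": "Corolla"}'
car = parse_car(sample)
print(car)
```

If `parse_car` returns None, the server did not constrain the output to the schema (or the generation was cut off), so this is a quick way to tell whether structured outputs actually took effect on a given endpoint.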