How to get thinking content in qwen3.5 thinking?
To get the "thinking" (reasoning) content from a Qwen3.5 thinking model served with vLLM, start the server with the --reasoning-parser qwen3 flag. With that parser enabled, the reasoning steps are returned in a separate field called reasoning (named reasoning_content in older vLLM releases), while the final answer stays in the content field of the chat completion response. In streaming mode, the reasoning arrives in the delta.reasoning field of each chunk instead. Example:
```python
from openai import OpenAI

# Point the OpenAI-compatible client at the local vLLM server.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

response = client.chat.completions.create(
    model="your-qwen3.5-model",  # replace with the served model name
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
)

# Reasoning steps and the final answer come back in separate fields.
reasoning = response.choices[0].message.reasoning
content = response.choices[0].message.content
print("reasoning:", reasoning)
print("content:", content)
```
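For streaming, you accumulate text from delta.reasoning and delta.content across chunks. A minimal sketch of that accumulation logic is below; the chunks are mocked objects mimicking the OpenAI-compatible streaming shape, since with a live server you would instead iterate over client.chat.completions.create(..., stream=True):

```python
from types import SimpleNamespace

def split_stream(chunks):
    """Accumulate reasoning text and answer text from streaming deltas."""
    reasoning_parts, content_parts = [], []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        # Each field is None (or absent) while the other kind of text streams.
        if getattr(delta, "reasoning", None):
            reasoning_parts.append(delta.reasoning)
        if getattr(delta, "content", None):
            content_parts.append(delta.content)
    return "".join(reasoning_parts), "".join(content_parts)

# Mocked chunks for illustration only.
def _chunk(reasoning=None, content=None):
    delta = SimpleNamespace(reasoning=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

chunks = [
    _chunk(reasoning="Compare 9.11 "),
    _chunk(reasoning="and 9.8. "),
    _chunk(content="9.8 is greater."),
]
reasoning, content = split_stream(chunks)
print("reasoning:", reasoning)
print("content:", content)
```

The same loop works unchanged on real chunks from the server, since the mocked objects expose the same `choices[0].delta` attributes.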
For more details, see the Reasoning Outputs page in the vLLM documentation.
Would you like more detail?
Hint: Mention @RunLLM in the post for followups.