How to get thinking content in qwen3.5 thinking?
To get the "thinking" (reasoning) content from a Qwen3.5 thinking model served with vLLM, start the server with the --reasoning-parser qwen3 flag. With that parser enabled, the reasoning steps are returned in a separate field called reasoning (named reasoning_content in older vLLM releases), while the final answer stays in the content field of the chat completion response. In streaming mode, the reasoning arrives in the delta.reasoning field of each chunk instead. Example:
```python
from openai import OpenAI

# Point the OpenAI-compatible client at the local vLLM server.
client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

response = client.chat.completions.create(
    model="your-qwen3.5-model",  # replace with the served model name
    messages=[{"role": "user", "content": "9.11 and 9.8, which is greater?"}],
)

# Reasoning steps and the final answer come back in separate fields.
reasoning = response.choices[0].message.reasoning
content = response.choices[0].message.content
print("reasoning:", reasoning)
print("content:", content)
```
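For streaming, you accumulate text from delta.reasoning and delta.content across chunks. A minimal sketch of that accumulation logic is below; the chunks are mocked objects mimicking the OpenAI-compatible streaming shape, since with a live server you would instead iterate over client.chat.completions.create(..., stream=True):

```python
from types import SimpleNamespace

def split_stream(chunks):
    """Accumulate reasoning text and answer text from streaming deltas."""
    reasoning_parts, content_parts = [], []
    for chunk in chunks:
        delta = chunk.choices[0].delta
        # Each field is None (or absent) while the other kind of text streams.
        if getattr(delta, "reasoning", None):
            reasoning_parts.append(delta.reasoning)
        if getattr(delta, "content", None):
            content_parts.append(delta.content)
    return "".join(reasoning_parts), "".join(content_parts)

# Mocked chunks for illustration only.
def _chunk(reasoning=None, content=None):
    delta = SimpleNamespace(reasoning=reasoning, content=content)
    return SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

chunks = [
    _chunk(reasoning="Compare 9.11 "),
    _chunk(reasoning="and 9.8. "),
    _chunk(content="9.8 is greater."),
]
reasoning, content = split_stream(chunks)
print("reasoning:", reasoning)
print("content:", content)
```

The same loop works unchanged on real chunks from the server, since the mocked objects expose the same `choices[0].delta` attributes.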
For more details, see the Reasoning Outputs page in the vLLM documentation.
Would you like more detail?
Hint: Mention @RunLLM in the post for followups.