Init DeepSeek-R1 using Offline Batched Inference

Hi all, I am following this guide to run vLLM using Python. However, as I understand it, DeepSeek models also need "--enable-reasoning --reasoning-parser deepseek_r1", as in this guide. I am just wondering how to enable reasoning with the Python API?

To enable reasoning for DeepSeek models with vLLM, you must start the vLLM server with the --reasoning-parser deepseek_r1 flag (the --enable-reasoning flag is now deprecated and no longer required). The Python client itself does not control reasoning; it simply connects to a server that was started with the appropriate reasoning parser. There is no client-side parameter to enable reasoning; this is a server-side configuration only.

For example, launch the server with:

vllm serve deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --reasoning-parser deepseek_r1

Then, use the OpenAI-compatible Python client as usual. For supported models, the server returns outputs with a reasoning_content field. See the Reasoning Outputs documentation and example code for details.
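As a minimal client-side sketch, assuming the server launched above is reachable at http://localhost:8000/v1 (the default port) and that any placeholder API key is accepted; note that reasoning_content is a vLLM extension to the OpenAI response schema:

from openai import OpenAI

# Point the OpenAI client at the local vLLM server started above.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",
    messages=[{"role": "user", "content": "How many r's are in the word strawberry?"}],
)

message = response.choices[0].message
# With --reasoning-parser deepseek_r1, the server separates the chain of
# thought (reasoning_content) from the final answer (content).
print("Reasoning:", message.reasoning_content)
print("Answer:", message.content)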

Would you like more detail or a full code example?


How can I do this with the native Python API of vLLM, not the OpenAI API?

There is no documented way to enable reasoning outputs with the native vLLM Python API (the LLM class or similar); reasoning support is only exposed through the OpenAI-compatible server interface, not the direct Python API. The native API does not accept parameters for reasoning or reasoning parsers, and all official examples and documentation use the OpenAI-compatible interface for this feature (see docs, see example).

If you need parsed reasoning outputs, you must use the OpenAI-compatible HTTP server and client; the native Python API does not support this as of the latest documentation. A rough manual workaround is sketched below.
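As an unofficial workaround (not a documented vLLM feature), you can generate text with the native LLM class and split the DeepSeek-R1 <think>...</think> tags yourself, which is roughly what the deepseek_r1 parser does on the server side. A minimal sketch, assuming the model emits a closing </think> tag:

from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B")
sampling_params = SamplingParams(temperature=0.6, max_tokens=1024)

outputs = llm.generate(["How many r's are in the word strawberry?"], sampling_params)
text = outputs[0].outputs[0].text

# DeepSeek-R1 models wrap their reasoning in <think>...</think> tags;
# split on the closing tag to separate reasoning from the final answer.
if "</think>" in text:
    reasoning_content, content = text.split("</think>", 1)
    reasoning_content = reasoning_content.replace("<think>", "").strip()
    content = content.strip()
else:
    reasoning_content, content = None, text.strip()

print("Reasoning:", reasoning_content)
print("Answer:", content)

This only mimics the server-side parser for this particular tag format; other reasoning models may use different delimiters.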

Would you like more detail or a workaround suggestion?
