Is the DeepSeek R1-0528 function call chat template supported for the BFCL multi-turn benchmark?

I tried to run the BFCL_v3_MultiTurn benchmark in FC mode with DeepSeek R1-0528.
However, my results fall well short of DeepSeek's official outcome, which reports 37.0% accuracy on BFCL_v3_MultiTurn.
Is the DeepSeek R1-0528 function call chat template supported for the BFCL multi-turn benchmark?

DeepSeek R1-0528 is supported for function call (FC) benchmarks in vLLM, and there is an official function call chat template for it. However, multiple users have reported that, while single-turn function call accuracy is high, the model fails most BFCL_v3_MultiTurn cases, with accuracy as low as 4–6%, far below DeepSeek's official results. This is a known issue and has been observed in both vLLM and community evaluations, even when using the recommended template and parser (see the related discussions and issue reports).
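For reference, an FC-mode evaluation of this kind typically starts by serving the model through vLLM's OpenAI-compatible server with tool calling enabled. The sketch below shows roughly what that launch looks like; the specific parser names and the chat-template path are assumptions and should be verified against the documentation of your installed vLLM version:

```shell
# Sketch: serve DeepSeek R1-0528 with function calling enabled in vLLM.
# The parser names (deepseek_v3 / deepseek_r1) and the template path are
# assumptions -- check them against your vLLM version before running.
vllm serve deepseek-ai/DeepSeek-R1-0528 \
  --enable-auto-tool-choice \
  --tool-call-parser deepseek_v3 \
  --reasoning-parser deepseek_r1 \
  --chat-template examples/tool_chat_template_deepseekr1.jinja \
  --tensor-parallel-size 8
```

Once the server is up, the BFCL harness can be pointed at the OpenAI-compatible endpoint for the FC-mode runs; the multi-turn accuracy gap described above has been reported even with a setup like this.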

In summary: DeepSeek R1-0528's function call template is supported for BFCL multi-turn benchmarks in vLLM, but multi-turn accuracy is currently poor and does not match DeepSeek's official outcomes. This appears to be a limitation of the current vLLM template/parser or of the model's multi-turn behavior, not a lack of support. Would you like a step-by-step breakdown of the setup or more detail on the known issues?
