When running inference with Qwen3-VL-235B-A22B-Instruct-FP8,

Jasper · December 22, 2025, 2:42pm

If I pass stop_token_ids=[], will a default value be used instead? And if I don't pass stop_token_ids explicitly, will that affect inference accuracy?

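The code is as follows:

    import multiprocessing as mp

    from vllm import SamplingParams


    class ModelWorker:

        def __init__(self, cfg, req_queue: mp.Queue, res_queue: mp.Queue, init_event: mp.Event) -> None:
            self.cfg = cfg
            self.req_queue = req_queue
            self.res_queue = res_queue
            # Sampling settings shared by every request this worker handles.
            self.sampling_params = SamplingParams(
                temperature=0.6,
                top_p=0.9,
                top_k=50,
                repetition_penalty=1.05,
                max_tokens=1024,
                stop_token_ids=[],  # empty list: no extra stop tokens
            )
            self.init_event = init_event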

RunLLM · December 22, 2025, 2:42pm

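With stop_token_ids=[], no extra stop tokens are passed. vLLM does not substitute a default list; generation ends only when the model emits its own eos_token_id or another stop condition (such as max_tokens or a stop string) is met. This does not affect inference accuracy, but the output may run longer or fail to terminate as early as you expect. If you need additional stop tokens, specify them explicitly. According to the vLLM documentation and source code, stop_token_ids is empty by default, so omitting it is equivalent to passing [].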
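To make the behavior above concrete, here is a minimal pure-Python sketch of a decoding loop that checks stop conditions. It is a toy simulation, not vLLM's actual implementation: the function `generate_until_stop`, the reason labels, and all token ids are hypothetical, and for simplicity the stop token itself is left out of the returned output.

```python
from typing import Iterable, List, Optional, Tuple


def generate_until_stop(
    token_stream: Iterable[int],
    eos_token_id: int,
    stop_token_ids: Optional[List[int]] = None,
    max_tokens: int = 1024,
) -> Tuple[List[int], str]:
    """Toy decoding loop: consume tokens until a stop condition fires."""
    # None and [] are treated identically: no extra stop tokens.
    extra_stops = set(stop_token_ids or [])
    output: List[int] = []
    for token in token_stream:
        if token == eos_token_id:
            return output, "eos"  # the model's own EOS always ends generation
        if token in extra_stops:
            return output, "stop_token"  # user-supplied stop token
        output.append(token)
        if len(output) >= max_tokens:
            return output, "length"  # token budget exhausted
    return output, "exhausted"


# Hypothetical token ids, for illustration only.
EOS = 2
stream = [11, 12, 99, 13, EOS, 14]

default_run = generate_until_stop(stream, EOS)                     # stop_token_ids omitted
empty_run = generate_until_stop(stream, EOS, stop_token_ids=[])    # explicit empty list
custom_run = generate_until_stop(stream, EOS, stop_token_ids=[99])  # extra stop token

print(default_run == empty_run)
print(custom_run)
```

In this sketch the omitted and empty-list cases produce identical output, and only an explicitly supplied stop token ends generation earlier. The tokens produced before the stop are the same in every case, which is the sense in which the setting changes output length rather than accuracy.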

Would you like a more detailed explanation of how stop_token_ids works and how it affects the generation process?

Sources:

vLLM official documentation: SamplingParams.stop_token_ids

