How to add a new sampler method to the current vLLM code

jivjots · March 13, 2026, 3:20am

The current vLLM has an autoregressive sampler, right?
How can I add a new sampler that works with the generate method?

RunLLM · March 13, 2026, 3:21am

Yes. vLLM generates autoregressively: at each decoding step the model produces logits, and a sampler picks the next token from them. To add a sampler that works with the generate method, implement a new Sampler class (subclassing torch.nn.Module) that follows the interface and processing steps described in the vLLM documentation: it takes the logits and the sampling metadata and returns the next tokens. See the Sampler API reference and the design docs for the required methods and integration points.
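To make the "logits plus sampling metadata in, next token out" contract concrete, here is a minimal framework-free sketch. It is not vLLM's Sampler: `ToySamplingMetadata` and `ToyTopKSampler` are hypothetical names invented for illustration, and it handles a single sequence with plain Python lists, whereas a real implementation would work on batched torch tensors inside the module's forward pass.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ToySamplingMetadata:
    """Hypothetical stand-in for per-request sampling settings."""
    temperature: float = 1.0
    top_k: int = 0              # 0 disables top-k filtering
    seed: Optional[int] = None  # fixed seed makes sampling reproducible


class ToyTopKSampler:
    """Illustrative sampler: logits and metadata in, next token id out."""

    def __call__(self, logits: List[float], meta: ToySamplingMetadata) -> int:
        # Temperature 0 is treated as greedy decoding (argmax).
        if meta.temperature == 0.0:
            return max(range(len(logits)), key=logits.__getitem__)

        # Keep the indices of the k largest logits (all of them if top_k is 0).
        k = meta.top_k if meta.top_k > 0 else len(logits)
        kept = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)[:k]

        # Temperature-scaled softmax weights over the kept tokens;
        # subtracting the maximum keeps exp() numerically stable.
        scaled = [logits[i] / meta.temperature for i in kept]
        peak = max(scaled)
        weights = [math.exp(s - peak) for s in scaled]

        # Draw one token id in proportion to its weight.
        rng = random.Random(meta.seed)
        return rng.choices(kept, weights=weights, k=1)[0]


if __name__ == "__main__":
    sampler = ToyTopKSampler()
    logits = [0.1, 2.0, -1.0, 1.5]
    print(sampler(logits, ToySamplingMetadata(temperature=0.0)))          # greedy
    print(sampler(logits, ToySamplingMetadata(temperature=0.7, top_k=2, seed=0)))
```

With `top_k=2` only token ids 1 and 3 can be returned for these logits. A new sampling method would replace the filtering and drawing steps while keeping the same inputs and output.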

You will also need to update the model’s sampling logic to use your new sampler, typically by modifying the relevant model runner or engine code so that it instantiates and calls your custom sampler during generation. Would you like a step-by-step guide or code example for this process?
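The integration idea above (a runner that owns a sampler and calls it at every step) can be pictured with a toy loop. This is only a sketch of the pattern: `ToyRunner`, its fake `_logits` method, and both sampler functions are hypothetical and do not correspond to vLLM's model runner or engine classes.

```python
from typing import Callable, List

# Hypothetical type: a sampler maps vocabulary logits to one token id.
Sampler = Callable[[List[float]], int]


def greedy_sampler(logits: List[float]) -> int:
    """Baseline sampler: pick the highest-scoring token."""
    return max(range(len(logits)), key=logits.__getitem__)


def second_best_sampler(logits: List[float]) -> int:
    """A deliberately different 'custom' sampler so the swap is observable."""
    order = sorted(range(len(logits)), key=logits.__getitem__, reverse=True)
    return order[1] if len(order) > 1 else order[0]


class ToyRunner:
    """Stand-in for a model runner that instantiates and calls a sampler."""

    def __init__(self, sampler: Sampler, vocab_size: int = 5) -> None:
        self.sampler = sampler
        self.vocab_size = vocab_size

    def _logits(self, tokens: List[int]) -> List[float]:
        # Fake model: scores (last + 1) highest and (last + 2) second highest.
        last = tokens[-1]
        logits = [0.0] * self.vocab_size
        logits[(last + 1) % self.vocab_size] = 2.0
        logits[(last + 2) % self.vocab_size] = 1.0
        return logits

    def generate(self, prompt: List[int], max_new_tokens: int) -> List[int]:
        # Autoregressive loop: each sampled token is fed back as context.
        tokens = list(prompt)
        for _ in range(max_new_tokens):
            tokens.append(self.sampler(self._logits(tokens)))
        return tokens


if __name__ == "__main__":
    print(ToyRunner(greedy_sampler).generate([0], 3))       # [0, 1, 2, 3]
    print(ToyRunner(second_best_sampler).generate([0], 3))  # [0, 2, 4, 1]
```

The only change between the two runs is which sampler the runner was constructed with, which is the kind of swap the real runner or engine code would need to support.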

