Description
I'm using vLLM to load a model with Medusa heads. My current implementation uses the following setup:
```python
from vllm import LLM

MODEL_NAME = "JackFram/llama-68m"
SPEC_MODEL = "abhigoyal/vllm-medusa-llama-68m-random"

# Target model plus a Medusa draft model that provides the speculative heads.
llm = LLM(
    model=MODEL_NAME,
    max_model_len=1024,
    speculative_config={
        "method": "medusa",
        "model": SPEC_MODEL,
        "num_speculative_tokens": 3,
    },
    tensor_parallel_size=1,
    seed=0,
)

outputs = llm.generate(
    prompts=["Hi! How are you doing?", "Hi! How are you doing?"],
    use_tqdm=True,
)
```
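For completeness, this is how I read the results back (standard `RequestOutput` fields, shown only to make the repro self-contained):

```python
# Each result carries the original prompt and its generated completions.
for out in outputs:
    print(f"Prompt: {out.prompt!r}")
    print(f"Generated: {out.outputs[0].text!r}")
```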
Question
I want to know how to specify the Medusa choice tree for the model. Could you provide guidance or examples on how to do this?
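For context, by "choice tree" I mean the tree structure used in the upstream Medusa repository, where each path is a tuple of per-head top-k indices. A small illustrative subset (this is the upstream format, not a vLLM parameter I have found):

```python
# Illustrative choice tree in the upstream Medusa repo's format:
# the i-th entry of each tuple selects a top-k candidate index at head i.
medusa_choices = [
    (0,), (1,), (2,),        # top-3 candidates from the first head
    (0, 0), (0, 1), (1, 0),  # continuations through the second head
    (0, 0, 0),               # one depth-3 path through the third head
]
```

I could not find a way to pass something like this through `speculative_config`.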
**Environment:**
- Python version: 3.11
- vLLM version: 0.9.1
- OS: Ubuntu 22.04