How to set CompilationConfig to SpeculativeConfig?

hei6775 · October 9, 2025, 9:33am

Hi everyone, I looked through SpeculativeConfig and found no information about setting the CompileConfig for the draft model. I need to modify the CompileConfig for the draft model. How can I do that?

RunLLM · October 9, 2025, 9:33am

There is currently no documented or supported way to set or modify the CompilationConfig (CompileConfig) specifically for the draft model through vLLM’s SpeculativeConfig. The draft-model options that SpeculativeConfig does expose include quantization, max_model_len, revision, and code_revision; none of them control compilation. This limitation is also noted as a TODO in the codebase, which indicates that user control over draft model compilation is not yet implemented or exposed in the public API or CLI options. See the discussion in the GitHub issue listed under Sources and the SpeculativeConfig docs.
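As a rough illustration of the point above, here is a minimal sketch of a speculative config built as a plain dict and restricted to the draft-model options named in this answer. The helper `build_speculative_config`, the model names, and the idea of passing the dict as `speculative_config=` to `vllm.LLM` are assumptions for illustration, not confirmed by this thread; check them against the vLLM version you run.

```python
# Draft-model options this answer names as available in SpeculativeConfig.
DRAFT_MODEL_OPTIONS = {"quantization", "max_model_len", "revision", "code_revision"}
# Assumed core keys for choosing the draft model and the speculation depth.
CORE_KEYS = {"model", "num_speculative_tokens"}


def build_speculative_config(**options):
    """Hypothetical helper: return a config dict, rejecting keys outside the known set."""
    unsupported = set(options) - DRAFT_MODEL_OPTIONS - CORE_KEYS
    if unsupported:
        raise ValueError(f"not a draft-model option: {sorted(unsupported)}")
    return dict(options)


spec_config = build_speculative_config(
    model="my-org/my-draft-model",  # placeholder draft model name
    num_speculative_tokens=4,
    max_model_len=4096,
)

# Assumed usage with vLLM installed (not executed here):
# from vllm import LLM
# llm = LLM(model="my-org/my-target-model", speculative_config=spec_config)
```

With this sketch, a call such as `build_speculative_config(model="m", compilation_config={})` raises `ValueError`, which mirrors the limitation described above: there is no draft-model compilation key to set.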

Would you like more detail on the current internal handling or possible workarounds?

Sources:

https://github.com/vllm-project/vllm/issues/13144

