How can I set up custom Triton ops in vLLM?

I attempted to replace the Qwen3 FFN with my own fused Triton kernel. To do this, I created a new file model_executor/custom_ops/qwen3_fused_ffn.py that defines both the kernel qwen3_swiglu_fused_kernel and its API wrapper qwen3_swiglu_fused, then modified the forward method of the Qwen3MLP class in model_executor/models/qwen3.py to call qwen3_swiglu_fused. However, when I launch vllm serve, the process hangs right after printing "Compiling a graph for general shape takes 59.56s".

How should I use my custom triton ops?

It seems likely that simply defining your custom Triton kernel and calling it in the model's forward method is not enough for vLLM to use the op correctly, especially under torch.compile and vLLM's custom op infrastructure. vLLM expects custom ops to be registered following PyTorch's guidelines: register a schema for the op, provide a meta ("fake") implementation so shapes can be inferred during tracing, and integrate the op with the compilation system. If you want the op to be used during graph compilation, also register it as a subclass of CustomOp and enable it in the CompilationConfig (see the custom_ops field). Without these steps, torch.compile may attempt to trace through the raw Triton launch, which can cause the long compile times or hangs you observed. See the contributing guide on adding or changing kernels and the custom op documentation for details.

Would you like a step-by-step breakdown of how to properly register and use a custom Triton op in vLLM, including code snippets?


Yes, please. I need a step-by-step breakdown of how to properly register and use a custom Triton op in vLLM, including code snippets.