Config file not found Qwen/Qwen3.6-35B-A3B

I'm trying to run Qwen/Qwen3.6-35B-A3B with the vllm/vllm-openai:cu130-nightly image (vllm 0.19.2) on 2x RTX 5090 and I'm getting this warning:
WARNING 04-19 09:31:42 [fused_moe.py:1090] Using default MoE config. Performance might be sub-optimal! Config file not found at /usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/fused_moe/configs/E=256,N=256,device_name=NVIDIA_GeForce_RTX_5090,dtype=fp8_w8a8.json

This is a warning, not an error: vLLM found no tuned fused-MoE kernel config for this combination (E=256, N=256, RTX 5090, fp8_w8a8), so it falls back to a generic default config, which may reduce performance. To resolve it, tune the MoE kernel for your hardware with the benchmark_moe.py script, then set the VLLM_TUNED_CONFIG_FOLDER environment variable to the directory containing the generated config file. Since you are running the Docker image, that directory has to be mounted into the container and the variable set inside it. vLLM should then load the tuned config, which may improve throughput and latency (Qwen3-Next Usage Guide, fused_moe.py source).
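To check whether a tuned file is actually in place before restarting the server, a small script along these lines may help. It is only a sketch: the filename pattern is copied from the warning above rather than from vLLM's code, and the helper names `expected_config_name` and `find_tuned_config` are made up for illustration, not part of vLLM. Other vLLM versions may use a different naming scheme.

```python
import os
from pathlib import Path


def expected_config_name(num_experts, n, device_name, dtype=None):
    """Build a config filename shaped like the one in the warning.

    Hypothetical helper: the pattern is inferred from the log line,
    not taken from vLLM itself.
    """
    device = device_name.replace(" ", "_")
    dtype_part = "" if dtype is None else f",dtype={dtype}"
    return f"E={num_experts},N={n},device_name={device}{dtype_part}.json"


def find_tuned_config(filename, env_var="VLLM_TUNED_CONFIG_FOLDER"):
    """Return the path to the tuned config if it exists in the folder
    named by the environment variable, otherwise None."""
    folder = os.environ.get(env_var)
    if not folder:
        return None
    candidate = Path(folder) / filename
    return candidate if candidate.is_file() else None


if __name__ == "__main__":
    # Values taken from the warning in the question.
    name = expected_config_name(256, 256, "NVIDIA GeForce RTX 5090", "fp8_w8a8")
    print("expected file:", name)
    found = find_tuned_config(name)
    print("found:", found if found else "nothing (default MoE config would be used)")
```

Run it inside the container, with the same environment the server uses; if it reports nothing found, the warning will most likely appear again on startup.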

Would you like detailed step-by-step instructions for tuning and deploying with the optimized config?
