Using vllm/vllm-openai:cu130-nightly with 2x RTX 5090, trying to run Qwen/Qwen3.6-35B-A3B on vLLM 0.19.2, I'm getting this:
WARNING 04-19 09:31:42 [fused_moe.py:1090] Using default MoE config. Performance might be sub-optimal! Config file not found at /usr/local/lib/python3.12/dist-packages/vllm/model_executor/layers/fused_moe/configs/E=256,N=256,device_name=NVIDIA_GeForce_RTX_5090,dtype=fp8_w8a8.json
This warning means vLLM is falling back to a generic (untuned) MoE kernel configuration for your RTX 5090, which can reduce throughput. To resolve it, tune the fused-MoE kernel for your hardware with the benchmark_moe.py script, then set the VLLM_TUNED_CONFIG_FOLDER environment variable to the directory containing the generated JSON config file. vLLM will then load the optimized config at startup, improving throughput and latency (Qwen3-Next Usage Guide, fused_moe.py source).
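A minimal sketch of that workflow, assuming a vLLM source checkout for the benchmark script; the script location, flag names, and output path may differ between vLLM versions, so check `python benchmark_moe.py --help` first:

```shell
# Clone vLLM to get the tuning script (it ships in the repo, not the wheel).
git clone https://github.com/vllm-project/vllm.git
cd vllm/benchmarks/kernels

# Tune the fused-MoE kernel for the shapes in the warning
# (fp8_w8a8 weights, tensor parallel size 2 across the two 5090s).
python benchmark_moe.py \
    --model Qwen/Qwen3.6-35B-A3B \
    --tp-size 2 \
    --dtype fp8_w8a8 \
    --tune

# The run should emit a JSON config named like the file from the warning:
#   E=256,N=256,device_name=NVIDIA_GeForce_RTX_5090,dtype=fp8_w8a8.json
# Point vLLM at the folder containing it, then serve as usual:
export VLLM_TUNED_CONFIG_FOLDER=/path/to/tuned/configs
vllm serve Qwen/Qwen3.6-35B-A3B --tensor-parallel-size 2
```

If the warning no longer appears on startup, the tuned config was picked up.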
Would you like detailed step-by-step instructions for tuning and deploying with the optimized config?