Why not add a self prefix?

shiwanghua · October 15, 2025, 8:03pm

vllm/model_executor/layers/fused_moe/layer.py

cfded8079


      
                      w1=layer.w13_weight,
                      w2=layer.w2_weight,
                      topk=top_k,
                      gating_output=router_logits,
                      global_num_experts=global_num_experts,
                      expert_map=expert_map,
                      renormalize=renormalize,
                  )
          
              if current_platform.is_tpu():
                  forward_native = forward_tpu
              elif current_platform.is_cpu():
                  forward_native = forward_cpu
              elif current_platform.is_xpu():
                  forward_native = forward_xpu
              else:
                  forward_native = forward_cuda
          
          
          def determine_expert_map(
              ep_size: int,

RunLLM · October 15, 2025, 8:03pm

The determine_expert_map function in vllm/model_executor/layers/fused_moe/layer.py calculates how many experts are assigned to each rank for expert parallelism (EP) and creates a mapping from global to local expert indices. Experts are distributed as evenly as possible across ranks, with any remainder assigned to the first few ranks. If ep_size is 1, all experts are local and no mapping is needed; otherwise, a tensor maps global expert indices to local ones, with -1 for experts not assigned to the current rank. The function returns a tuple: (local_num_experts, expert_map).
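The distribution described above can be sketched in plain Python. This is only an illustration of the behavior as stated in this post, not the vLLM implementation: the name sketch_expert_map is hypothetical, the parameter ep_rank is assumed, a Python list stands in for the tensor, and each rank is assumed to own a contiguous block of global expert indices.

```python
from typing import List, Optional, Tuple


def sketch_expert_map(
    ep_size: int, ep_rank: int, global_num_experts: int
) -> Tuple[int, Optional[List[int]]]:
    """Illustrative stand-in for determine_expert_map (not the vLLM code)."""
    if ep_size == 1:
        # Every expert is local, so no global-to-local map is needed.
        return global_num_experts, None

    base, remainder = divmod(global_num_experts, ep_size)
    # The first `remainder` ranks each take one extra expert.
    local_num_experts = base + (1 if ep_rank < remainder else 0)
    # Assumption: ranks own contiguous blocks, so this is the first
    # global expert index owned by `ep_rank`.
    start = ep_rank * base + min(ep_rank, remainder)

    # -1 marks experts that are not assigned to this rank.
    expert_map = [-1] * global_num_experts
    for local_idx in range(local_num_experts):
        expert_map[start + local_idx] = local_idx
    return local_num_experts, expert_map
```

With 10 experts over 4 ranks, ranks 0 and 1 would each hold 3 experts and ranks 2 and 3 would each hold 2. Under the contiguous-block assumption, sketch_expert_map(4, 1, 10) maps global experts 3, 4 and 5 to local indices 0, 1 and 2, and every other entry is -1.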

For more details, see the function docstring and implementation in vllm/model_executor/layers/fused_moe/layer.py and the API documentation at vllm.model_executor.layers.fused_moe.layer.determine_expert_map.

Would you like a step-by-step breakdown or code example?

Sources:

vllm/model_executor/layers/fused_moe/layer.py

