Question about custom all-reduce (P2P Write)

In vLLM, the custom all-reduce is implemented with P2P Read. Is there also a P2P Write implementation? @hanzhi713, could you please help answer this? Thank you!

It seems likely that vLLM’s custom all-reduce is implemented using P2P (peer-to-peer) Read only. The available documentation and code references do not mention a P2P Write-based implementation, the design discussions focus on P2P Read for efficiency and correctness, and neither the API nor the code comments expose a P2P Write option. If you need P2P Write, it does not appear to be supported or documented in vLLM’s custom all-reduce path[^1][^2][^3].
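To make the Read/Write distinction concrete, here is a minimal conceptual sketch in plain Python. It is not vLLM code: Python lists stand in for per-GPU buffers, and the function names `allreduce_p2p_read` and `allreduce_p2p_write` are hypothetical. It only illustrates, under those assumptions, who touches whose memory in each scheme.

```python
from typing import List


def allreduce_p2p_read(inputs: List[List[float]]) -> List[List[float]]:
    """Pull model: every rank loads each peer's input buffer and sums locally."""
    world_size = len(inputs)
    n = len(inputs[0])
    outputs = []
    for rank in range(world_size):
        acc = [0.0] * n
        for peer in range(world_size):
            # Stand-in for a remote load from the peer's memory.
            for i, value in enumerate(inputs[peer]):
                acc[i] += value
        outputs.append(acc)
    return outputs


def allreduce_p2p_write(inputs: List[List[float]]) -> List[List[float]]:
    """Push model: every rank stores its input into a staging slot on each
    peer, then each rank sums only its own local staging slots."""
    world_size = len(inputs)
    n = len(inputs[0])
    # staging[dst][src] is memory owned by rank `dst`, one slot per source rank.
    staging = [[[0.0] * n for _ in range(world_size)] for _ in range(world_size)]
    for src in range(world_size):
        for dst in range(world_size):
            # Stand-in for a remote store into the peer's memory.
            staging[dst][src] = list(inputs[src])
    # On real hardware a cross-device barrier would be needed here, so that
    # no rank starts reducing before all remote stores have landed.
    outputs = []
    for rank in range(world_size):
        acc = [0.0] * n
        for src in range(world_size):
            for i, value in enumerate(staging[rank][src]):
                acc[i] += value
        outputs.append(acc)
    return outputs


if __name__ == "__main__":
    data = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
    print(allreduce_p2p_read(data))
    print(allreduce_p2p_write(data))
```

Both functions produce the same reduced result on every rank. In this sketch the write variant additionally needs one staging slot per source rank on each destination, whereas the read variant reduces directly from the peers' input buffers.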

Would you like more detail on the technical reasons or the implications of using P2P Read versus Write in this context?

Sources:
