Deploying multi-node LLM with InfiniBand/RoCE

RunLLM · August 13, 2025, 8:42pm

It seems likely that you are correct: the RayService CRD must specify GPU resources (e.g., nvidia.com/gpu: 8) in the resources section for both head and worker pods to allocate GPUs (KubeRay docs). For RDMA, the values for NCCL_IB_HCA and NCCL_SOCKET_IFNAME depend on the network interfaces provisioned by the NVIDIA Network Operator, which is configured via the NicClusterPolicy custom resource.
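As a rough illustration of the GPU request described above, the following Python sketch builds the container entry that would sit under spec.rayClusterConfig.workerGroupSpecs[].template.spec.containers in a RayService and prints it as JSON. The container name, the image, and the gpu_container helper are hypothetical placeholders, not taken from the KubeRay docs; only the nvidia.com/gpu resource name comes from the discussion above.

```python
import json


def gpu_container(name, image, gpus):
    """Return a pod container entry that requests `gpus` NVIDIA GPUs.

    `name` and `image` are placeholders supplied by the caller.
    """
    return {
        "name": name,
        "image": image,
        "resources": {
            # Requests and limits are set to the same value for the GPU resource.
            "limits": {"nvidia.com/gpu": gpus},
            "requests": {"nvidia.com/gpu": gpus},
        },
    }


if __name__ == "__main__":
    # Hypothetical container name and image, for illustration only.
    worker = gpu_container("ray-worker", "example.com/ray-vllm:tag", 8)
    print(json.dumps(worker, indent=2))
```

Kubernetes expects requests and limits to match for extended resources such as nvidia.com/gpu, which is why the sketch sets both to the same value.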

NCCL_IB_HCA: Set this to the device name of your InfiniBand/RoCE NIC, such as mlx5_0 or mlx5_1. You can list these by running ibv_devices or ibdev2netdev inside a pod with RDMA access. If the NICs are selected through an SR-IOV policy (SriovNetworkNodePolicy), its spec.nicSelector section (look for rootDevices) identifies which physical devices are in use.
NCCL_SOCKET_IFNAME: Set this to the IP network interface name (e.g., ib0 or ens2f0) that NCCL should use for its bootstrap and socket traffic; with RDMA this is usually the interface backed by the RDMA device. You can find it by running ip a or ibdev2netdev inside the pod, or by checking the pfNames field of the nicSelector.

Example: If your SriovNetworkNodePolicy has

nicSelector:
  pfNames: ["ens2f0"]
  rootDevices: ["0000:3b:00.0"]

then NCCL_SOCKET_IFNAME=ens2f0 and NCCL_IB_HCA=mlx5_0 (if mlx5_0 maps to ens2f0).

You can confirm the mapping with:

ibdev2netdev

which outputs lines like:

mlx5_0 port 1 ==> ens2f0 (Up)

So, set:

NCCL_IB_HCA=mlx5_0
NCCL_SOCKET_IFNAME=ens2f0
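The mapping step above can also be automated. The Python sketch below parses ibdev2netdev-style output and derives the two variables for a chosen interface. It assumes each line follows the "device port N ==> ifname (State)" shape shown above; the second sample line (mlx5_1 / ens2f1) and the function names are made up for illustration.

```python
import re

# Matches lines such as: "mlx5_0 port 1 ==> ens2f0 (Up)"
LINE_RE = re.compile(r"^(\S+) port (\d+) ==> (\S+) \((\w+)\)$")


def parse_ibdev2netdev(output):
    """Return a list of (hca, port, ifname, state) tuples from ibdev2netdev text."""
    rows = []
    for line in output.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            hca, port, ifname, state = match.groups()
            rows.append((hca, int(port), ifname, state))
    return rows


def nccl_env_for(output, ifname):
    """Build the two NCCL variables for the RDMA device behind `ifname`."""
    for hca, _port, name, state in parse_ibdev2netdev(output):
        # Only accept a device whose link is reported as Up.
        if name == ifname and state == "Up":
            return {"NCCL_IB_HCA": hca, "NCCL_SOCKET_IFNAME": name}
    raise LookupError(f"no RDMA device in state Up is mapped to {ifname}")


if __name__ == "__main__":
    # First line is the example from this thread; the second is invented.
    sample = "mlx5_0 port 1 ==> ens2f0 (Up)\nmlx5_1 port 1 ==> ens2f1 (Down)"
    for key, value in nccl_env_for(sample, "ens2f0").items():
        print(f"{key}={value}")
```

Raising an error when the interface is missing or down is a deliberate choice here: a silently wrong NCCL_IB_HCA tends to surface much later as NCCL falling back to plain sockets or failing at startup.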

For more, see the NVIDIA Network Operator documentation and KubeRay integration. Would you like a full RayService CRD example with GPU and RDMA resources?

Sources:

Example with RDMA env vars
