I would like to ask whether the vllm-omni team has plans to support MiniCPM-o 4.5. It is the first full-duplex open-source multimodal model, but I haven't seen anything on the vllm-omni roadmap about supporting it yet.
vLLM (mainline) already supports MiniCPM-o 4.5 for inference, including vision and audio input, but not audio output (TTS) yet. However, the vllm-omni documentation and roadmap do not explicitly mention planned support for MiniCPM-o 4.5: vllm-omni currently focuses on omni-modality models such as Qwen-Omni, but it aims to adapt quickly to new models in coordination with model training teams. Future support for MiniCPM-o 4.5 in vllm-omni is therefore possible, but it has not been officially announced or scheduled. See the vllm-omni architecture, FAQ, and RFC discussion.
Would you like more detail on current vLLM support for MiniCPM-o 4.5, or on how to request or vote for vllm-omni support?
Sources:
- MiniCPM-o 4.5 support in vLLM
- MiniCPM-o 4.5 audio output feature request
- vllm-omni architecture
- FAQ
- RFC discussion
Hint: Mention @RunLLM in the post for follow-ups.