Can we work with the Wan2.2 model with vLLM?

I was planning to work with the Wan2.2 model in vLLM, but some open-source discussions say it isn’t supported. Can you help me with how to do this?

To check whether a model like “Wan2.2” is supported by vLLM, first try loading it directly with the LLM class. If vLLM loads the model and returns text for a generative prompt, it is supported; otherwise, you may need to implement support for it yourself. The easiest way is to run:

from vllm import LLM

# Loading fails here with an "unsupported architecture" error if vLLM has no implementation.
llm = LLM(model="path_or_name_of_Wan2.2")
# generate() returns a list of RequestOutput objects, one per prompt.
outputs = llm.generate("Hello, my name is")
for out in outputs:
    print(out.outputs[0].text)

If this fails with an error about an unsupported architecture, you will need to follow the vLLM model integration guide, which involves adapting the model code to vLLM’s requirements (e.g., adding a prefix argument to the constructor, implementing get_input_embeddings, and modifying the forward method). See the step-by-step guide at Implementing a Basic Model and Supported Models.
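For orientation, an adapted model class roughly follows the shape below. This is only a sketch based on the integration guide, not Wan2.2-specific code: the class name is hypothetical, the exact constructor and forward signatures (vllm_config, prefix, positions) vary between vLLM versions, and the actual layers are omitted.

import torch
from torch import nn
from typing import Optional

class MyModelForCausalLM(nn.Module):
    # Hypothetical skeleton; real vLLM models also implement compute_logits,
    # load_weights, and register the architecture so the LLM class can find it.
    def __init__(self, *, vllm_config, prefix: str = ""):
        super().__init__()
        self.config = vllm_config.model_config.hf_config
        # Build layers here, passing `prefix` down so weight names match the checkpoint.

    def get_input_embeddings(self, input_ids: torch.Tensor) -> torch.Tensor:
        return self.embed_tokens(input_ids)

    def forward(
        self,
        input_ids: torch.Tensor,
        positions: torch.Tensor,
        inputs_embeds: Optional[torch.Tensor] = None,
    ) -> torch.Tensor:
        # vLLM drives decoding step by step, so this must return hidden states
        # for the scheduled tokens rather than running a full HF-style forward pass.
        ...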

Would you like more detail on the integration process?

But the Wan2.2 model is a text-to-video model …

vLLM does not support text-to-video (diffusion-based) models like Wan2.2; its optimizations are designed for autoregressive language models, not diffusion or video generation models. There are currently no plans to support such video generation models in vLLM, so you should use other libraries (e.g., Hugging Face Diffusers) for this purpose. See the Wan2.2 model issue and the video generation support discussion.
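If you want to run Wan2.2 outside vLLM, Diffusers exposes a WanPipeline for it. The snippet below is a minimal sketch: the checkpoint name "Wan-AI/Wan2.2-T2V-A14B-Diffusers" and the generation settings (num_frames, fps) are assumptions, so check the model card for the exact repo id and recommended parameters.

import torch
from diffusers import WanPipeline
from diffusers.utils import export_to_video

# Repo id is an assumption; see the Wan2.2 model card for the exact Diffusers checkpoint.
pipe = WanPipeline.from_pretrained(
    "Wan-AI/Wan2.2-T2V-A14B-Diffusers", torch_dtype=torch.bfloat16
)
pipe.to("cuda")

result = pipe(
    prompt="A cat walking through a garden",
    num_frames=81,  # frame count is an illustrative value, not an official default
)
export_to_video(result.frames[0], "wan22_output.mp4", fps=16)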

Would you like more detail on why vLLM does not support these models?
