We are currently accelerating low-level operators in the preprocessing modules of specific models (such as Qwen2-VL). However, we do not intend to touch the Transformer part, because later on we may also optimize models like InternVL, which do not have preprocessing modules on the Transformer side.
Additionally, we would prefer not to make intrusive modifications to your codebase.
Is there a feasible way to register only a preprocessing module, without interfering with the rest of the architecture?
Yes, you can register only a custom preprocessing module for a model in vLLM without modifying the Transformer/model code. vLLM supports registering processor-related classes (such as a custom multimodal processor) independently via the MULTIMODAL_REGISTRY. You can define your own processor class and register it using the @MULTIMODAL_REGISTRY.register_processor decorator, specifying your processor, processing info, and dummy inputs builder. This approach allows you to accelerate or override preprocessing for specific models (e.g., Qwen2-VL, InternVL) without changing the core model or Transformer code, and without intrusive codebase modifications. See the official documentation for details and examples of this pattern for various models, including custom processors for models that lack a HuggingFace processor class on the Hub, like DeepSeek-VL2 and Qwen-VL[^1][^2].
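To make the shape of this pattern concrete, here is a minimal, self-contained sketch. It does not import vLLM: `ProcessorRegistry`, `ProcessorEntry`, `FastProcessingInfo`, `FastDummyInputsBuilder`, `FastImageProcessor`, and `ExistingModel` are all hypothetical stand-ins, and the real base classes and constructor signatures in vLLM will differ. The only thing it mirrors is the call shape described above (a processor plus `info` and `dummy_inputs`, attached to a model class through a decorator), under the assumption that the decorator targets a model class. Whether re-registering an in-tree model class from outside the codebase is supported by vLLM is not confirmed here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ProcessorEntry:
    """The three pieces registered together for one model class."""
    processor: type
    info: type
    dummy_inputs: type


class ProcessorRegistry:
    """Hypothetical stand-in; NOT vLLM's MULTIMODAL_REGISTRY implementation."""

    def __init__(self) -> None:
        self._entries: Dict[type, ProcessorEntry] = {}

    def register_processor(self, processor, *, info, dummy_inputs):
        # Returns a decorator that records the entry and hands the model
        # class back unchanged, so the model code itself is not modified.
        def wrapper(model_cls):
            self._entries[model_cls] = ProcessorEntry(processor, info, dummy_inputs)
            return model_cls
        return wrapper

    def create_processor(self, model_cls):
        entry = self._entries[model_cls]
        return entry.processor(entry.info())


class FastProcessingInfo:
    """Hypothetical processing info: static facts the processor needs."""
    max_pixel_value = 255.0


class FastDummyInputsBuilder:
    """Hypothetical dummy-input builder (e.g. for profiling runs)."""
    def build(self) -> List[float]:
        return [0.0] * 4


class FastImageProcessor:
    """Hypothetical accelerated preprocessor: scales pixels to [0, 1]."""
    def __init__(self, info: FastProcessingInfo) -> None:
        self.info = info

    def __call__(self, pixels: List[float]) -> List[float]:
        return [p / self.info.max_pixel_value for p in pixels]


class ExistingModel:
    """Stands in for a model class that we do not want to edit."""


REGISTRY = ProcessorRegistry()

# Apply the decorator as a plain function call, so the source of
# ExistingModel stays untouched.
REGISTRY.register_processor(
    FastImageProcessor,
    info=FastProcessingInfo,
    dummy_inputs=FastDummyInputsBuilder,
)(ExistingModel)

processor = REGISTRY.create_processor(ExistingModel)
print(processor([0, 255]))
```

Calling the decorator as a function, rather than writing `@...` above the class definition, is what keeps the change out of the model's source file in this sketch.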
Would you like a step-by-step example of how to implement and register such a processor?
We have been developing the concept of plugins for various components, especially for models and hardware. I think that something like this could work well for pre-processors too.
That’s a great idea. vLLM’s plugin system is already designed to support extensibility for models and hardware, and the same approach can be extended to preprocessing modules. The recommended pattern is to use Python entry points to register custom components, and for multimodal/model preprocessing, vLLM already supports registering processor classes via the MULTIMODAL_REGISTRY (see docs). A plugin interface for preprocessors could follow the same entry point mechanism as models and platforms, allowing out-of-tree packages to register new preprocessors without modifying the core codebase (plugin system design).
Would you like a concrete RFC template or example code for such a plugin interface?