Registering a Standalone Preprocessing Module for Model Acceleration

yeshangli · July 23, 2025, 4:23pm

Hello, I’d like to ask for some advice.

We are currently accelerating the low-level operators in the preprocessing modules of certain models (such as Qwen2-VL). However, we do not intend to touch the Transformer part, because later on we may also optimize models like InternVL, which do not have preprocessing modules on the Transformer side.

Additionally, we would prefer not to make intrusive modifications to your codebase.

Is there a feasible way to register only a preprocessing module, without interfering with the rest of the architecture?

RunLLM · July 23, 2025, 4:23pm

Yes, you can register only a custom preprocessing module for a model in vLLM without modifying the Transformer/model code. vLLM supports registering processor-related classes (such as a custom multimodal processor) independently via the MULTIMODAL_REGISTRY. You can define your own processor class and register it using the @MULTIMODAL_REGISTRY.register_processor decorator, specifying your processor, processing info, and dummy inputs builder. This approach allows you to accelerate or override preprocessing for specific models (e.g., Qwen2-VL, InternVL) without changing the core model or Transformer code, and without intrusive codebase modifications. See the official documentation for details and examples of this pattern for various models, including custom processors for models that lack a HuggingFace processor class on the Hub, like DeepSeek-VL2 and Qwen-VL[^1][^2].

Would you like a step-by-step example of how to implement and register such a processor?
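
To make the registration pattern described above concrete, here is a minimal, self-contained sketch. It does not import vLLM: `ToyRegistry` is a hypothetical stand-in that only mirrors the shape of `@MULTIMODAL_REGISTRY.register_processor` (a processor plus `info` and `dummy_inputs`, attached to a model class), and every class name below is invented for illustration. Treat it as a picture of the idea under those assumptions, not as vLLM's implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Type


@dataclass(frozen=True)
class ProcessorFactories:
    """The three pieces named in the post, bundled per model class."""
    processor: Callable
    info: Callable
    dummy_inputs: Callable


class ToyRegistry:
    """Hypothetical registry keyed by model class (NOT vLLM's code)."""

    def __init__(self) -> None:
        self._factories: Dict[Type, ProcessorFactories] = {}

    def register_processor(self, processor, *, info, dummy_inputs):
        # Class decorator: records the factories and returns the class unchanged.
        def wrapper(model_cls):
            self._factories[model_cls] = ProcessorFactories(
                processor, info, dummy_inputs)
            return model_cls
        return wrapper

    def create_processor(self, model_cls):
        # Walk the MRO so a subclass with no entry falls back to its parent's.
        for cls in model_cls.__mro__:
            if cls in self._factories:
                factories = self._factories[cls]
                return factories.processor(factories.info())
        raise KeyError(model_cls.__name__)


TOY_REGISTRY = ToyRegistry()


class ToyInfo:
    """Placeholder for processing info (limits, config lookups, ...)."""


class ToyDummyInputs:
    """Placeholder for a dummy-inputs builder."""


class BaselineProcessor:
    backend = "reference"

    def __init__(self, info):
        self.info = info

    def __call__(self, pixels):
        return [p - 128 for p in pixels]  # trivial stand-in for preprocessing


class AcceleratedProcessor(BaselineProcessor):
    backend = "custom-kernel"  # where an optimized operator would be called

    def __call__(self, pixels):
        return [p - 128 for p in pixels]  # same result, different backend


@TOY_REGISTRY.register_processor(
    BaselineProcessor, info=ToyInfo, dummy_inputs=ToyDummyInputs)
class ToyVLModel:
    """Stands in for the upstream model class; left unmodified."""


@TOY_REGISTRY.register_processor(
    AcceleratedProcessor, info=ToyInfo, dummy_inputs=ToyDummyInputs)
class AcceleratedToyVLModel(ToyVLModel):
    """Out-of-tree subclass: same model code, different preprocessing."""


if __name__ == "__main__":
    proc = TOY_REGISTRY.create_processor(AcceleratedToyVLModel)
    print(proc.backend, proc([128, 255]))
```

The point of the sketch is that the processor is looked up by model class, so swapping the preprocessing means changing a registration rather than editing the Transformer code. Whether a subclass or some other registration route is the right fit for a real vLLM deployment is an open question that this thread does not settle.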

robertshaw · July 23, 2025, 4:30pm

We have been developing the concept of plugins for various components, especially for models and hardware. I think something like this could work well for preprocessors too.

We would welcome a design proposal/RFC.

RunLLM · July 23, 2025, 4:30pm

That’s a great idea—vLLM’s plugin system is already designed to support extensibility for models and hardware, and the same approach can be extended to preprocessing modules. The recommended pattern is to use Python entry points to register custom components, and for multimodal/model preprocessing, vLLM already supports registering processor classes via the MULTIMODAL_REGISTRY (see docs). A plugin interface for preprocessors could follow the same entry point mechanism as models and platforms, allowing out-of-tree packages to register new preprocessors without modifying the core codebase (plugin system design).

Would you like a concrete RFC template or example code for such a plugin interface?
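
As a rough illustration of the entry point mechanism mentioned above, the sketch below discovers callables through Python's standard `importlib.metadata` API (Python 3.10+ for the `group=` keyword). The group name `vllm.preprocessor_plugins` and the function `discover_preprocessors` are assumptions made up for this example; no such group is confirmed to exist in vLLM, and an actual RFC might choose a different design.

```python
from importlib.metadata import EntryPoint, entry_points

# Hypothetical group name, used only for illustration.
PREPROCESSOR_GROUP = "vllm.preprocessor_plugins"


def discover_preprocessors(group=PREPROCESSOR_GROUP, candidates=None):
    """Load every entry point in `group` and return {name: loaded object}.

    `candidates` lets callers pass EntryPoint objects directly, which is
    handy for testing without installing a package.
    """
    eps = entry_points(group=group) if candidates is None else candidates
    loaded = {}
    for ep in eps:
        # ep.load() imports the module and resolves the named attribute.
        loaded[ep.name] = ep.load()
    return loaded


if __name__ == "__main__":
    # An out-of-tree package would normally declare this in its packaging
    # metadata; here a stdlib function stands in for a preprocessor.
    demo = EntryPoint(name="demo", value="string:capwords",
                      group=PREPROCESSOR_GROUP)
    plugins = discover_preprocessors(candidates=[demo])
    print(plugins["demo"]("fast resize"))
```

In a real package the entry point would be declared in the project's packaging metadata rather than constructed by hand, and the loaded callable would typically perform the registration itself when invoked.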

Sources:

vLLM’s Plugin System
