How to modify the CUDA graph capture sizes via a vLLM plugin

Version: vLLM 0.8.5
I’m working on a vLLM plugin for a specific hardware backend. I want to modify cudagraph_capture_sizes in platform.py:check_and_update_config()

It only works if I uncomment the red line. Is that the correct way to modify cudagraph_capture_sizes in a plugin?

Directly setting compilation_config.cudagraph_capture_sizes inside check_and_update_config() is not the recommended way to modify cudagraph capture sizes in a vLLM plugin. The correct approach is to use the init_with_cudagraph_sizes() method, which is designed to update and deduplicate the sizes properly, as shown in your code. Setting the attribute directly may bypass internal checks and deduplication logic, leading to unexpected behavior.

According to the official API and implementation, you should call compilation_config.init_with_cudagraph_sizes(batch_sizes) to update the cudagraph capture sizes safely, rather than assigning to the attribute directly. This keeps the configuration consistent with vLLM’s internal logic and is more likely to stay compatible with future versions (source, source).
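
To make the call pattern concrete, here is a minimal sketch of a platform hook that routes the sizes through init_with_cudagraph_sizes() instead of assigning the attribute. It runs without vLLM installed: StubCompilationConfig and StubVllmConfig are simplified stand-ins, MyPlatform and the example batch sizes are hypothetical, and the deduplicate-and-sort behavior of the stub is an assumption based on the description above. The real method in vLLM 0.8.5 may do more, so check it against your installed version.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class StubCompilationConfig:
    """Simplified stand-in for vLLM's CompilationConfig (not the real class)."""

    cudagraph_capture_sizes: Optional[list[int]] = None
    max_capture_size: int = 0

    def init_with_cudagraph_sizes(self, sizes: list[int]) -> None:
        # Assumed behavior: deduplicate, sort in descending order,
        # and refresh state derived from the sizes.
        self.cudagraph_capture_sizes = sorted(set(sizes), reverse=True)
        self.max_capture_size = (
            self.cudagraph_capture_sizes[0] if self.cudagraph_capture_sizes else 0
        )


@dataclass
class StubVllmConfig:
    """Stand-in for the config object passed to the platform hook."""

    compilation_config: StubCompilationConfig = field(
        default_factory=StubCompilationConfig
    )


class MyPlatform:
    """Hypothetical plugin platform; a real one would subclass vLLM's Platform."""

    @classmethod
    def check_and_update_config(cls, vllm_config: StubVllmConfig) -> None:
        # Example sizes for a hypothetical backend; note the duplicate 8.
        batch_sizes = [1, 2, 4, 8, 8, 16]
        # Go through the method rather than setting the attribute directly,
        # so derived fields stay in sync with the size list.
        vllm_config.compilation_config.init_with_cudagraph_sizes(batch_sizes)


config = StubVllmConfig()
MyPlatform.check_and_update_config(config)
print(config.compilation_config.cudagraph_capture_sizes)
print(config.compilation_config.max_capture_size)
```

The point of the sketch is that derived fields (here only max_capture_size) are recomputed together with the size list, which a bare attribute assignment would skip.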

Would you like a step-by-step breakdown of how init_with_cudagraph_sizes() works and why direct assignment is discouraged?
