Is there any plan to organize the CUDA-only configuration?

Hi vLLM community,

I noticed some CUDA-specific logic that runs on all hardware. I added a hot fix ([[Core] Make cudagraph check cuda platform only · PR #23005 · vllm-project/vllm](https://github.com/vllm-project/vllm/pull/23005)) that guards the corresponding logic behind a `current_platform.is_cuda_alike()` check.
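For context, the guard follows roughly this pattern. This is a minimal self-contained sketch: `is_cuda_alike()` mirrors the real helper on vLLM's `current_platform` object, but the `Platform` classes and the `check_cudagraph_support` function here are simplified stand-ins, not vLLM's actual code.

```python
# Sketch of gating CUDA-only logic behind a platform check.
# `is_cuda_alike()` mirrors vLLM's helper (True for CUDA and
# CUDA-compatible platforms such as ROCm); everything else here
# is an illustrative mock.

class Platform:
    def is_cuda_alike(self) -> bool:
        return False


class CudaPlatform(Platform):
    def is_cuda_alike(self) -> bool:
        return True


class TpuPlatform(Platform):
    pass


def check_cudagraph_support(platform: Platform, use_cudagraph: bool) -> bool:
    # CUDA-graph checks should only run on CUDA-like hardware;
    # other platforms simply skip the check entirely.
    if not platform.is_cuda_alike():
        return False
    return use_cudagraph


print(check_cudagraph_support(CudaPlatform(), True))  # True
print(check_cudagraph_support(TpuPlatform(), True))   # False
```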

I'm wondering whether there is any ongoing design work to separate hardware-agnostic configuration from hardware-specific configuration?

I'm aware of related code such as `current_platform.check_and_update_config(self)`, which is implemented per backend, but it doesn't seem to cover everything.


Yes, vLLM is actively working on separating hardware-agnostic and hardware-specific configuration. The main approach is the platform abstraction layer, where each hardware backend implements its own logic in the `platforms` directory (e.g., `current_platform.check_and_update_config`). This allows hardware-specific configuration and checks to be encapsulated per backend, reducing scattered if-else logic and making the codebase more maintainable and extensible.

However, the transition is ongoing, and not all configuration logic has been fully migrated — some CUDA-specific logic may still run on other hardware until the migration is complete. Your approach of guarding CUDA logic with `current_platform.is_cuda_alike()` is consistent with this direction and is recommended until the abstraction is fully in place. See the platform abstraction RFC and the platforms interface code for details.

Would you like a step-by-step breakdown of how to contribute to or extend this abstraction?
