Does my vLLM Docker install have a problem with the 5090? The startup log says xFormers is not available:
(EngineCore_0 pid=110) INFO 09-13 16:09:20 [topk_topp_sampler.py:50] Using FlashInfer for top-p & top-k sampling.
(EngineCore_0 pid=110) Using a slow image processor as use_fast is unset and a slow processor was saved with this model. use_fast=True will be the default behavior in v4.52, even if the model was saved with a slow processor. This will result in minor differences in outputs. You'll still be able to use a slow processor with use_fast=False.
(EngineCore_0 pid=110) INFO 09-13 16:09:27 [gpu_model_runner.py:1953] Starting to load model google/gemma-3-12b-it…
(EngineCore_0 pid=110) INFO 09-13 16:09:27 [gpu_model_runner.py:1985] Loading model from scratch…
(EngineCore_0 pid=110) INFO 09-13 16:09:27 [cuda.py:345] Using FlexAttention backend for head_size=72 on V1 engine.
(EngineCore_0 pid=110) WARNING 09-13 16:09:27 [layer.py:52] Xformers is not available, falling back.
(EngineCore_0 pid=110) INFO 09-13 16:09:27 [cuda.py:328] Using Flash Attention backend on V1 engine.
(EngineCore_0 pid=110) INFO 09-13 16:09:28 [weight_utils.py:296] Using model weights format ['*.safetensors']