In the Gemma 3 documentation, it is mentioned that you can omit the vision stack by running the gemma-3-27b-it model with Gemma3ForCausalLM instead of Gemma3ForConditionalGeneration. Is there a way to do this with vLLM? I tried overriding the model's config.json, but I'm running into issues with missing configuration parameters:
AttributeError: 'Gemma3Config' object has no attribute 'num_hidden_layers'
This may be related to this GitHub issue: AttributeError: 'Gemma3Config' object has no attribute 'vocab_size' · Issue #36683 · huggingface/transformers · GitHub
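For context, my understanding is that the multimodal Gemma3Config nests the text settings (num_hidden_layers, vocab_size, ...) under a text_config key, while Gemma3ForCausalLM expects them at the top level, which would explain the AttributeError. Here is a minimal sketch of the config.json rewrite I attempted; the exact set of keys to promote or drop is an assumption on my part, not something I found documented:

```python
def to_text_only(config: dict) -> dict:
    """Derive a text-only config from a multimodal Gemma 3 config.json dict.

    Assumption: Gemma3ForCausalLM reads the fields that the multimodal
    config nests under "text_config" (num_hidden_layers, vocab_size, ...)
    from the top level instead.
    """
    config = dict(config)  # don't mutate the caller's dict
    text_cfg = config.pop("text_config", {})
    merged = {**config, **text_cfg}  # promote nested text settings
    merged["architectures"] = ["Gemma3ForCausalLM"]
    merged["model_type"] = "gemma3_text"  # model_type used by Gemma3TextConfig
    # Drop vision-only sections, if present (key names assumed).
    merged.pop("vision_config", None)
    merged.pop("mm_tokens_per_image", None)
    return merged
```

I then wrote the result back over config.json with json.dump before pointing vLLM at the model directory, but I still hit the error above.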