Deployment parameters for Qwen3.5-4B?

@hustxiayang The deployment parameters depend on the metric you’re targeting. Are you focusing on TTFT, end-to-end latency or something else?

@hustxiayang What is your hardware configuration? Is it an H100 or something else?

How did you actually get Qwen3.5-4B to work at all? It does not work for me because of the qwen_35_text architecture and a few more params in the config. It is the same error as here: Can not deploy SFT Qwen3.5-9B model · Issue #44541 · huggingface/transformers · GitHub