By default, the `forward` method of `Qwen2ForCausalLM` (and other decoder-only LLMs) takes several pre-defined parameters as inputs:
```python
def forward(
    self,
    input_ids: torch.Tensor,
    positions: torch.Tensor,
    kv_caches: List[torch.Tensor],
    attn_metadata: AttentionMetadata,
    intermediate_tensors: Optional[IntermediateTensors] = None,
    inputs_embeds: Optional[torch.Tensor] = None,
) -> Union[torch.Tensor, IntermediateTensors]:
```
I want to introduce some custom parameters carried over from past generation turns that affect the behavior of the current forward pass (e.g. an external embedding that gets injected into the model's forward computation). But I can't find an appropriate way to implement this. Can anyone help?
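For context, here is a minimal, framework-agnostic sketch of the pattern I have in mind (the classes and method names below are toy placeholders I made up, not vLLM API): the model keeps a slot for an externally supplied embedding, the slot is set between generation turns, and the stored tensor is added to the input embeddings during the next forward pass.

```python
import numpy as np

class ToyDecoder:
    """Toy stand-in for a decoder-only LM; illustrates the injection
    pattern only and is NOT based on any real vLLM class."""

    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size
        self.external_embedding = None  # populated between turns

    def set_external_embedding(self, emb: np.ndarray) -> None:
        # Hypothetical hook: called after one generation turn finishes,
        # before the next forward pass starts.
        self.external_embedding = emb

    def forward(self, inputs_embeds: np.ndarray) -> np.ndarray:
        # Inject the embedding stored from the previous turn, if any.
        if self.external_embedding is not None:
            inputs_embeds = inputs_embeds + self.external_embedding
        # A real model would now run the transformer layers; here we
        # just return the (possibly modified) embeddings.
        return inputs_embeds

model = ToyDecoder(hidden_size=4)
x = np.ones((1, 4))
out1 = model.forward(x)                              # no injection yet
model.set_external_embedding(np.full((1, 4), 0.5))   # state from past turn
out2 = model.forward(x)                              # embedding injected
```

The open question is where such a `set_external_embedding`-style hook could live in vLLM's actual execution flow, given that the engine owns the model instance and the forward signature is fixed.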