How to pass custom parameters between `Qwen2ForCausalLM` forward calls

By default, the `forward` method of `Qwen2ForCausalLM` (or any other decoder-only LLM) takes a fixed set of predefined parameters as inputs:

def forward(
    self,
    input_ids: torch.Tensor,
    positions: torch.Tensor,
    kv_caches: List[torch.Tensor],
    attn_metadata: AttentionMetadata,
    intermediate_tensors: Optional[IntermediateTensors] = None,
    inputs_embeds: Optional[torch.Tensor] = None,
) -> Union[torch.Tensor, IntermediateTensors]:
    ...

I want to introduce some custom parameters, received from past generation turns, that affect the behavior of the current forward call (e.g. an external embedding that is injected into the model's forward pass). But I can’t find an appropriate way to implement this. Can anyone help?
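
To make the question concrete, here is a toy sketch of the behavior I'm after. It is plain Python and does not use the real model or engine API: `ToyDecoder`, `extra_embeds`, `carry`, and `step` are all hypothetical names, and the "embedding" is just one float per token.

```python
from typing import Dict, List, Optional


class ToyDecoder:
    """Stand-in for a decoder-only model (hypothetical, not Qwen2ForCausalLM)."""

    def embed(self, input_ids: List[int]) -> List[float]:
        # Toy "embedding": one float per token id.
        return [float(t) for t in input_ids]

    def forward(
        self,
        input_ids: List[int],
        extra_embeds: Optional[List[float]] = None,  # hypothetical custom parameter
    ) -> List[float]:
        hidden = self.embed(input_ids)
        if extra_embeds is not None:
            # Inject the external embedding by element-wise addition.
            hidden = [h + e for h, e in zip(hidden, extra_embeds)]
        return hidden


# Hypothetical per-request store that carries state between forward calls.
carry: Dict[str, List[float]] = {}


def step(model: ToyDecoder, request_id: str, input_ids: List[int]) -> List[float]:
    # Look up whatever the previous turn of this request left behind.
    out = model.forward(input_ids, extra_embeds=carry.get(request_id))
    carry[request_id] = out  # saved for the next turn of the same request
    return out


model = ToyDecoder()
first = step(model, "req-0", [1, 2])   # nothing carried over yet
second = step(model, "req-0", [3, 4])  # uses the output of the first call
```

In other words, I need two things: somewhere to keep per-request state between turns, and a way to get that state into `forward` alongside the predefined arguments.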