I am currently adding support for a custom guide model to vLLM, which requires me to keep track of which input_ids belong to which request_id. I assume something similar happens for the positions, so I tried to trace how they are set and updated.
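To make the bookkeeping I mean concrete, here is a minimal sketch. It assumes the batch is flattened so that each request's newly scheduled tokens sit next to each other, and that a token's position is its offset plus the number of tokens the request has already processed. The function and parameter names are hypothetical and are not taken from the vLLM code base.

```python
from itertools import accumulate


def map_tokens_to_requests(request_ids, num_scheduled_tokens, num_computed_tokens):
    """Return (owners, positions) for a flattened batch.

    owners[i] is the request id that flattened token i belongs to;
    positions[i] is that token's position inside its own request.
    All names are hypothetical, for illustration only.
    """
    owners = []
    positions = []
    for req_id, n_new, n_done in zip(
        request_ids, num_scheduled_tokens, num_computed_tokens
    ):
        for offset in range(n_new):
            owners.append(req_id)
            # Position continues from the tokens already processed.
            positions.append(n_done + offset)
    return owners, positions


def request_slices(num_scheduled_tokens):
    """Return (start, end) index pairs of each request in the flat batch."""
    ends = list(accumulate(num_scheduled_tokens))
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


owners, positions = map_tokens_to_requests(
    request_ids=["req-a", "req-b"],
    num_scheduled_tokens=[2, 3],
    num_computed_tokens=[5, 0],
)
print(owners)
print(positions)
print(request_slices([2, 3]))
```

With a mapping like this, the slice for a given request can be used to index both the flattened input_ids and the flattened positions.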
positions seems to be a slice of self.positions, which is updated from self.positions_cpu. But I can't figure out how self.positions_cpu itself is updated. I searched the entire code base for the name, and it appears only rarely. So I assume it is updated in place somewhere, but I have no idea where.
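One pattern that would explain why a string search finds no direct write is that the buffer is modified through a second name sharing its memory. For example, torch.Tensor.numpy() on a CPU tensor returns an array that shares storage with the tensor. I have not confirmed that this is what vLLM does. The sketch below uses only the standard library to show the aliasing idea, and every name in it is hypothetical.

```python
from array import array

# Stand-in for a preallocated CPU buffer (hypothetical name).
positions_cpu = array("q", [0] * 8)

# A second handle on the same memory; writes through it are
# visible via positions_cpu even though that name is never assigned.
positions_view = memoryview(positions_cpu)


def write_positions(view, values):
    """Write values into the shared buffer and return how many were written."""
    for i, value in enumerate(values):
        view[i] = value
    return len(values)


n = write_positions(positions_view, [5, 6, 0, 1, 2])

# The original buffer reflects the update made through the view.
print(list(positions_cpu[:n]))
```

If something like this is going on, searching for the alias rather than for positions_cpu would be the way to find the update.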
Can someone explain how this is handled?
Thank you for your help in advance!