Is it possible to initialize an AsyncLLMEngine inside the LLM object?

The background is that we use vLLM as the rollout backend in veRL for RL training. During the rollout stage, each trajectory needs to make several function calls. Most current implementations use a multi-round form, where each round contains exactly one function call. This causes large latency when some trajectories finish earlier and have to wait for the others before the next round of generation.

My question is whether we could initialize the AsyncLLMEngine inside the LLM object so that we can use the async method _add_request to decouple the external agent workflow management from the internal generation process. Or is there another recommended workflow for this?
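To make the latency concern concrete, here is a minimal asyncio sketch of the pattern being asked for: each trajectory drives its own generate/tool-call loop independently, so a fast trajectory never waits at a round barrier. The `generate` function below is a hypothetical stand-in (a plain sleep) for an async engine call; the delays and trajectory counts are invented for illustration.

```python
import asyncio

# Hypothetical stand-in for one model call; real code would await an
# async engine instead of sleeping. Delays differ per trajectory to
# mimic rollouts that finish at different times.
async def generate(prompt: str, delay: float) -> str:
    await asyncio.sleep(delay)
    return prompt + " <tool_call>"

async def run_trajectory(tid: int, rounds: int, delay: float) -> str:
    """Each trajectory loops generate -> tool call on its own, so it
    never blocks on other trajectories between rounds."""
    text = f"traj-{tid}:"
    for _ in range(rounds):
        text = await generate(text, delay)
        text += " <tool_result>"  # apply the function-call result
    return text

async def main() -> list[str]:
    # All trajectories run concurrently; total wall time is roughly one
    # slowest trajectory, not rounds * (slowest round) as in lock-step
    # multi-round batching.
    return await asyncio.gather(*(
        run_trajectory(i, rounds=3, delay=0.01 * (i + 1)) for i in range(4)
    ))

results = asyncio.run(main())
```

The same shape applies if `generate` is backed by a real async engine: the agent logic stays in the per-trajectory loop, and the engine batches requests internally.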

In V1 we have the AsyncLLM class you could use: vllm/vllm/v1/engine/async_llm.py at main · vllm-project/vllm · GitHub. Does that suit your needs?
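For illustration, a sketch of how per-trajectory rollouts could consume an `AsyncLLM`-style streaming interface, where `generate(prompt, sampling_params, request_id)` is an async generator yielding partial outputs. A stub engine stands in for the real one so the sketch is self-contained; the stub's behavior and the helper names are assumptions, not the actual vLLM implementation.

```python
import asyncio

class StubAsyncLLM:
    """Minimal stand-in mimicking the shape of an async engine whose
    generate(prompt, sampling_params, request_id) call is an async
    generator streaming outputs until the request finishes."""
    async def generate(self, prompt, sampling_params, request_id):
        await asyncio.sleep(0)  # a real engine streams tokens here
        yield f"{prompt} [out:{request_id}]"

async def rollout(engine, prompt: str, request_id: str) -> str:
    # Consume the stream; keep only the final output, as a rollout would.
    final = prompt
    async for out in engine.generate(prompt, sampling_params=None,
                                     request_id=request_id):
        final = out
    return final

async def main() -> list[str]:
    engine = StubAsyncLLM()
    # Independent requests are added and awaited concurrently: the agent
    # loop drives each trajectory while the engine is free to batch
    # across them internally -- the decoupling the question asks about.
    return await asyncio.gather(*(
        rollout(engine, f"p{i}", request_id=str(i)) for i in range(3)
    ))

outs = asyncio.run(main())
```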


Hi Hmellor, I think this is good.

One follow-up question: does the current AsyncLLM support similar usage to this, where we can access a handle to the model executor so that we can update parameters? I remember that for AsyncLLMEngine the model executor is launched in the background, so we cannot access it directly.

Thank you!

You can access the engine at AsyncLLM.engine_core (which will be an AsyncMPClient(MPClient) instance). I’m not sure how the model is accessed from these client classes, though.

We are discussing with verl how to support agent / multi-turn / tool-calling in RL; please follow the discussion in [Question] Is vLLMRollout.generate_sequences the right place to implement tool calling? · Issue #176 · volcengine/verl · GitHub.
