I used to use LangChain as the client, but it does not support reading the stop_reason field, so I had to give it up.
What should I use now, the OpenAI Python library?
Are there any recommended clients for vLLM?
The recommended client for vLLM is the official OpenAI Python client, as vLLM’s OpenAI-compatible server is designed to work seamlessly with it and supports all relevant response fields, including stop_reason and finish_reason. You can use the OpenAI client by setting the api_key and base_url to your vLLM server; see the official example. LangChain is also supported via its OpenAI-compatible integration, but as you noted, it may not expose all fields like stop_reason directly in its API (discussion).
If you need full access to all OpenAI response fields, including stop_reason, the OpenAI Python client is the most reliable choice. For benchmarking or advanced usage, you can also use HTTP clients like requests or aiohttp, as shown in vLLM’s benchmarking scripts. Would you like more detail or code examples?
Sources:
- OpenAI Completion Client — vLLM
- Langchain Support
- Performance: 5x slower throughput with openAI client/server than native one
Thanks to its wider compatibility, I’d suggest using the OpenAI Python SDK. vLLM already ships an OpenAI-compatible server, which also helps.