I have been trying out ASR models like Voxtral and Whisper, but changing the prompt parameter does not seem to have any effect on the transcription. OpenAI provides a prompting guide for the Whisper model (the Whisper prompting guide). Is this supported in vLLM, and does Voxtral support the same?
For Whisper, vLLM supports the prompt parameter: the prompt is included in the decoder prompt for Whisper models, so you can use it to provide prior context or steer transcription style, as described in the OpenAI Whisper prompting guide and in vLLM's own examples and code.
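As a concrete starting point, here is a minimal sketch of passing prompt through vLLM's OpenAI-compatible /v1/audio/transcriptions endpoint. The server URL, model name (openai/whisper-large-v3), and file name are assumptions; adjust them to your setup.

```python
# Minimal sketch, assuming a vLLM server started with something like
#   vllm serve openai/whisper-large-v3
# and a local audio file sample.wav (both names are assumptions).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

with open("sample.wav", "rb") as audio:
    result = client.audio.transcriptions.create(
        model="openai/whisper-large-v3",  # must match the served model name
        file=audio,
        # Prior context / style steering, per the OpenAI Whisper prompting
        # guide: spell out rare names and demonstrate the punctuation style
        # you want the transcript to follow.
        prompt="Glossary: vLLM, Voxtral, ASR. Hello, welcome to the lecture.",
        language="en",
    )

print(result.text)
```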
For Voxtral, the prompt parameter is also accepted, but its effect may differ from Whisper's, since the architectures and training data differ. There is no explicit documentation stating that Voxtral's prompt handling in vLLM is identical to Whisper's, so results may vary and you should test prompt effectiveness for your use case (see the sketch below).
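One way to run that test is a simple A/B comparison: transcribe the same file with and without a prompt and check whether the output changes. This is a hedged sketch, assuming Voxtral is served locally (the model name mistralai/Voxtral-Mini-3B-2507, server flags, and sample.wav are all assumptions for illustration).

```python
# Hedged sketch of an A/B test for prompt effectiveness with Voxtral.
# Assumes a server started with something like
#   vllm serve mistralai/Voxtral-Mini-3B-2507
# (exact model name and serving flags depend on your setup).
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
MODEL = "mistralai/Voxtral-Mini-3B-2507"  # assumption: your served model name

def transcribe(prompt: str | None = None) -> str:
    with open("sample.wav", "rb") as audio:
        kwargs = {"model": MODEL, "file": audio, "temperature": 0.0}
        if prompt is not None:
            kwargs["prompt"] = prompt
        return client.audio.transcriptions.create(**kwargs).text

baseline = transcribe()
prompted = transcribe("Glossary: vLLM, Voxtral, ASR. Use full punctuation.")

print("baseline:", baseline)
print("prompted:", prompted)
print("prompt had an effect:", baseline != prompted)
```

Setting temperature to 0 makes the comparison more meaningful: with greedy decoding, any difference between the two outputs is attributable to the prompt rather than sampling noise.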