How to set a default sampling param like "truncate_prompt_tokens" for all vLLM embedding requests

How could I set the sampling param “truncate_prompt_tokens” to a default value like 16000?
https://docs.vllm.ai/en/v0.6.4/dev/sampling_params.html

Is there any way to set it for all requests?

I want to truncate all incoming embedding calls by default instead of failing.

There is currently no global/default setting for truncate_prompt_tokens that applies to all requests in vLLM; it must be set per request. You can pass truncate_prompt_tokens (e.g., 16000) in each embedding or pooling request so that over-long prompts are truncated instead of rejected.

A global default is discussed but not implemented. The recommended approach is to set truncate_prompt_tokens in each request, or via PoolerConfig if/when that becomes supported server-side. See the discussion in PR 24235 and issue 13489.
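Per request, the parameter can be included in the body sent to vLLM's OpenAI-compatible /v1/embeddings endpoint. A minimal sketch, assuming a locally served model (the base URL and model name below are placeholders, not from this thread):

```python
# Sketch: build a /v1/embeddings request body with per-request
# truncation. truncate_prompt_tokens is a vLLM extension to the
# OpenAI embeddings schema; the model name here is a placeholder.
payload = {
    "model": "BAAI/bge-m3",               # placeholder model name
    "input": "a very long document ...",
    # Truncate the prompt to at most 16000 tokens instead of failing:
    "truncate_prompt_tokens": 16000,
}

# Sending it with the requests library (assumes a local vLLM server):
# import requests
# resp = requests.post("http://localhost:8000/v1/embeddings", json=payload)
# embedding = resp.json()["data"][0]["embedding"]
```

With the official openai Python client, the same field can be passed per call via extra_body, e.g. client.embeddings.create(model=..., input=..., extra_body={"truncate_prompt_tokens": 16000}); you would still need to add it to every call yourself.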

Would you like more detail or code examples for setting this parameter per request?
