Is structured output compatible with automatic prefix caching?

g-eoj · April 14, 2025, 6:50pm

Imagine a scenario where the prompt tokens are always the same, but the guided_regex value passed via extra_body={"guided_regex": ...} can change with each new request.

robertshaw · April 14, 2025, 6:56pm

Prefix caching just makes prompt processing (prefill) faster by reusing the KV cache of prompt prefixes it has already seen. It has no impact on which tokens are generated.

Therefore it is compatible with structured generation, which only affects the generation phase.
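
To make the separation concrete, here is a toy sketch in Python. It is not vLLM's implementation: `ToyPrefixCache`, `constrained_decode`, and `BLOCK_SIZE` are hypothetical names made up for illustration, and real guided decoding masks logits token by token rather than filtering whole strings. The only point it shows is that the cache lookup depends on the prompt tokens, while the regex is consulted only when output is chosen.

```python
import re

BLOCK_SIZE = 4  # tokens per cached block (hypothetical value for this sketch)


class ToyPrefixCache:
    """Toy stand-in for a prefix cache: keyed by prompt-token prefixes only."""

    def __init__(self):
        self.blocks = {}
        self.hits = 0
        self.misses = 0

    def prefill(self, prompt_tokens):
        # Walk the prompt one full block at a time; each block is keyed by
        # the entire prefix that ends with it. The regex is never part of the key.
        for end in range(BLOCK_SIZE, len(prompt_tokens) + 1, BLOCK_SIZE):
            key = tuple(prompt_tokens[:end])
            if key in self.blocks:
                self.hits += 1
            else:
                self.misses += 1
                self.blocks[key] = f"kv[{end - BLOCK_SIZE}:{end}]"


def constrained_decode(candidates, pattern):
    """Toy stand-in for guided decoding: first candidate fully matching the regex."""
    for text in candidates:
        if re.fullmatch(pattern, text):
            return text
    return None


cache = ToyPrefixCache()
prompt = list(range(8))            # 8 prompt tokens, i.e. two full blocks
candidates = ["hello", "42", "yes"]

# Request 1: same prompt, regex that accepts digits.
cache.prefill(prompt)
first = constrained_decode(candidates, r"\d+")

# Request 2: same prompt, different regex. Prefill is served from the cache.
cache.prefill(prompt)
second = constrained_decode(candidates, r"yes|no")

print(cache.misses, cache.hits, first, second)
```

In this sketch the second request reuses both cached blocks even though its regex differs, and each request still gets output that satisfies its own constraint.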
