DeepSeek-OCR-2 is known to crash on H20 GPUs with recent vLLM versions, including 0.20.0 and later, during engine startup or CUDA graph profiling. This is a recurring issue, and users have reported that both the latest and nightly vLLM Docker images fail to serve DeepSeek-OCR-2 on H20, with logs showing worker processes dying unexpectedly and no clear root cause in the traceback. There is currently no confirmed workaround or fix in the official documentation or issues for vLLM 0.23.0 on H20 for this model. See the discussion and logs in this issue.
Would you like more detail on possible troubleshooting steps or alternative configurations?