GPT-OSS in a Docker container

Hi there. I'm trying to run the gpt-oss-120b model in a Docker container, as described in GPT OSS - vLLM Recipes.

After running the command:

docker run --gpus all \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:gptoss \
  --model openai/gpt-oss-20b

I get error:

2025-08-19T09:36:03.399538Z WARN Reqwest(reqwest::Error { kind: Request, url: "https://cas-server.xethub.hf.co/reconstruction/c55846686d68918bb333fea55bb0c66fba7ff4aa4b2c3f34145a521ddf4050c6", source: hyper_util::client::legacy::Error(Connect, ConnectFailed(ConnectError("dns error", Custom { kind: Uncategorized, error: "failed to lookup address information: Name or service not known" }))) }). Retrying…

I don't know what I'm doing wrong. Running wget against the same URL returns:

401 Unauthorized
Username/Password Authentication Failed.

Your log shows two separate symptoms. The "dns error … Name or service not known" means the container cannot resolve the Hugging Face hostname at all; that is a network/DNS problem (host resolver configuration, a proxy, or Docker's DNS settings), not an authentication one. The 401 Unauthorized from wget, by contrast, points to missing or invalid Hugging Face credentials. The official vLLM Docker instructions recommend mounting your Hugging Face cache into the container and setting the HUGGING_FACE_HUB_TOKEN environment variable, which is required for private or gated models; without a valid token the download fails with 401 errors. See the official vLLM Docker docs and the GPT-OSS recipe.
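Before changing the docker command, it can help to tell the two failure modes apart on the host. This is an illustrative check (the CAS hostname is taken from your error log; the token variable name is the one vLLM's docs use):

```shell
# 1) DNS: can this machine resolve the host from the error log?
#    If this fails on the host too, it is a network/DNS problem, not a token problem.
getent hosts cas-server.xethub.hf.co || echo "DNS lookup failed"

# 2) Auth: is a Hugging Face token available to pass into the container?
if [ -z "${HUGGING_FACE_HUB_TOKEN:-}" ]; then
  echo "HUGGING_FACE_HUB_TOKEN is not set"
else
  echo "HUGGING_FACE_HUB_TOKEN is set"
fi
```

If the host resolves the name but the container does not, the problem is inside Docker's DNS configuration rather than your credentials.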

Example command (replace <your_token> with your Hugging Face token):

docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=<your_token>" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:gptoss \
  --model openai/gpt-oss-20b
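If the DNS failure persists inside the container even though the host resolves Hugging Face hostnames fine, a common workaround is to give the container an explicit resolver with Docker's `--dns` flag. This is a sketch, assuming a public resolver (8.8.8.8) is reachable from your network:

```shell
docker run --gpus all \
  --dns 8.8.8.8 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HUGGING_FACE_HUB_TOKEN=<your_token>" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:gptoss \
  --model openai/gpt-oss-20b
```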

Would you like more detail on troubleshooting network or authentication issues?
