Pre-Built Docker Install

I’m trying to use vLLM on a local server with two NVIDIA RTX 3060 GPUs (12 GB each).

I’m trying the pre-built Docker image:

docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=$HF_TOKEN" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen3-0.6B

root@sys-ng:~# docker run --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e HF_TOKEN=$HF_TOKEN \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai \
  --tensor-parallel-size 2 \
  --model Qwen/Qwen3-0.6B
Unable to find image 'vllm/vllm-openai:latest' locally
latest: Pulling from vllm/vllm-openai
66587c81b81a: Pull complete
f29b1d4013a9: Pull complete
340d44d2921c: Pull complete
59a4bcbddda3: Pull complete
6e8af4fd0a07: Pull complete
5fde6ec96d5f: Pull complete
d5c41c3b66f6: Pull complete
e392f915ed79: Pull complete
8638325b23df: Pull complete
d8b0d5c5f036: Pull complete
bde526ae4fd3: Pull complete
5016c10f4af8: Pull complete
a19e04924597: Pull complete
30f3a69bd2f3: Downloading [===============> ] 1.561GB/5.129GB
9b85752ae3df: Download complete
c8fd43168366: Download complete
0f6f885fe2ee: Download complete
2bbb7ba18787: Download complete
71974a01930c: Download complete
4d2e275b7825: Downloading [==========================================> ] 437.3MB/508.9MB
8e4934fbdb92: Downloading
af4d82598236: Waiting
09016baf7b99: Waiting
974f972651dd: Waiting
c2f79f080520: Waiting
a57979a1f045: Waiting
a4cd7924e1f2: Waiting
docker: unauthorized: authentication required.
See 'docker run --help'.

I have logged in to Hugging Face and Docker as well:

root@sys-ng:~# hf auth login
User is already logged in.
root@eve-ng:~#
root@eve-ng:~#
root@eve-ng:~# docker login
Authenticating with existing credentials…
WARNING! Your password will be stored unencrypted in /root/.docker/config.json.
Configure a credential helper to remove this warning. See

docker login | Docker Docs

Login Succeeded

The error "docker: unauthorized: authentication required" means your Docker client cannot pull the vllm/vllm-openai:latest image from Docker Hub due to an authentication issue; it is unrelated to Hugging Face login, HF_TOKEN, or vLLM itself. Make sure you are logged in to Docker Hub with a valid account and have network access to Docker Hub: run docker login, enter your Docker Hub credentials, then re-run the docker run command. If the problem persists, check for typos in the image name or for Docker Hub download rate limits (see the Docker docs).
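One way to isolate the problem is to handle registry authentication separately from the vLLM run. A sketch using standard Docker CLI commands (run interactively, since docker login prompts for credentials):

```shell
# Clear possibly-stale Docker Hub credentials, log in again,
# then pull the image on its own so any auth error surfaces here,
# before vLLM or HF_TOKEN are involved at all:
docker logout
docker login
docker pull vllm/vllm-openai:latest
```

If the bare docker pull succeeds, the original error was an auth/credential problem; if it fails the same way, the issue is between your Docker client and Docker Hub (credentials, proxy, or rate limits), not the vLLM command line.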

Also, make sure your command uses standard double quotes (") and double dashes (--), not curly quotes or en-dashes: forum and web pages often convert them automatically, and the shell passes the mangled characters through to Docker, which then fails to parse them as options. For example:

docker run --runtime nvidia --gpus all \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  -e "HF_TOKEN=$HF_TOKEN" \
  -p 8000:8000 \
  --ipc=host \
  vllm/vllm-openai:latest \
  --tensor-parallel-size 2 \
  --model Qwen/Qwen3-0.6B

According to vLLM Docker deployment docs, this is the correct approach.
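To catch mangled characters before running a pasted command, one option is to scan it for bytes outside printable ASCII with grep in the C locale. A minimal sketch (the line value here is just an example of a bad paste):

```shell
# A command pasted from a rendered web page may contain en-dashes ("–")
# where Docker expects "--", or curly quotes where the shell expects
# straight ones. Flag any non-ASCII bytes in a line before running it:
line='–ipc=host'   # en-dash from a web paste, not "--ipc=host"
if printf '%s' "$line" | LC_ALL=C grep -q '[^ -~]'; then
  echo "non-ASCII character found: retype the dashes/quotes"
fi
```

A clean line such as `--ipc=host` produces no output, so this check is easy to run over a whole saved command script.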

Would you like more detail on troubleshooting Docker authentication or vLLM multi-GPU setup?


To clarify: I’ve never been able to get the container up, and I’m not sure what’s going on.