We're Live: OCI Deployment Guide for vLLM Production Stack

Hey team!

Wanted to let you know — the OCI deployment guide is live in production-stack!

I contributed a complete, end-to-end tested deployment path for running vLLM on Oracle Cloud Infrastructure (OKE with NVIDIA GPUs). This gives the community another enterprise cloud option alongside the existing AWS/GCP guides.

The Contribution

PR & Guide: https://github.com/vllm-project/production-stack/tree/main/deployment_on_cloud/oci

Includes:

  • Full OKE cluster setup with GPU node pools (A10/A100/H100)
  • Block storage configuration for model caching
  • Production-ready Helm values (see the sketch after this list)
  • Cleanup scripts
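
To give a flavor of the OCI-specific bits, here's a minimal sketch (mine, not lifted from the guide) of the two pieces that differ most from the AWS/GCP paths: a model cache backed by OCI Block Volume via the `oci-bv` storage class, and a vLLM pod pinned to a GPU node pool by shape. The shape name, model ID, and cache size are placeholder examples; the guide wires the same settings through the chart's Helm values rather than raw manifests.

```yaml
# Illustrative sketch only, not copied from the guide:
# a block-volume-backed model cache plus a GPU-pinned vLLM pod on OKE.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: vllm-model-cache
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: oci-bv          # OKE's Block Volume CSI storage class
  resources:
    requests:
      storage: 100Gi                # placeholder; size for your model weights
---
apiVersion: v1
kind: Pod
metadata:
  name: vllm-server
spec:
  nodeSelector:
    # Standard instance-type label; OKE sets it to the node shape.
    node.kubernetes.io/instance-type: VM.GPU.A10.1
  containers:
    - name: vllm
      image: vllm/vllm-openai:latest
      args: ["--model", "meta-llama/Llama-3.1-8B-Instruct"]  # any HF model ID
      resources:
        limits:
          nvidia.com/gpu: 1          # request one GPU from the device plugin
      volumeMounts:
        - name: model-cache
          mountPath: /root/.cache/huggingface   # vLLM's default HF cache path
  volumes:
    - name: model-cache
      persistentVolumeClaim:
        claimName: vllm-model-cache
```

Requesting `nvidia.com/gpu` alone is enough to land on a GPU node; the nodeSelector just makes the target pool explicit when you run mixed shapes.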

Spread the Word

I wrote up a LinkedIn post to help get visibility for vLLM in the enterprise space. Please give it a like 🙂 or, even better, repost with your thoughts! Feel free to share it around; I'd love to help drive more adoption.

Thanks for building such a great project. Happy to keep contributing.

— Federico

