If I plan to deploy vLLM with Docker in a production environment serving 8 concurrent users, how much RAM and how many CPU cores do I need, given that I'm not planning to use CPU offloading? Assume the model is GLM-4.5 running on 16× H200 GPUs.
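For context, here's a minimal sketch of the kind of launch command I have in mind. The Hugging Face repo id `zai-org/GLM-4.5`, the TP=8/PP=2 split across the 16 GPUs, and the `--cpus`/`--memory` values are my assumptions and placeholders, not recommendations:

```bash
# Sketch of the planned launch (placeholder values, not a recommendation).
# Assumptions: HF repo id zai-org/GLM-4.5; tensor-parallel=8 x pipeline-parallel=2
# to span the 16 H200s; no CPU offloading flags set.
docker run --runtime nvidia --gpus all \
  --ipc=host \
  -p 8000:8000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --cpus=32 \
  --memory=256g \
  vllm/vllm-openai:latest \
  --model zai-org/GLM-4.5 \
  --tensor-parallel-size 8 \
  --pipeline-parallel-size 2 \
  --gpu-memory-utilization 0.90
```

The `--cpus` and `--memory` limits above are exactly the knobs I don't know how to size for this setup, which is what I'm asking about.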