You’re absolutely right: in vLLM (and many other LLM frameworks), “online” and “offline” refer to the interface used, not internet connectivity. “Online” means sending requests to an API server, while “offline” means calling the engine directly from Python. This is a common source of confusion, and many users run into it. The vLLM docs now clarify this distinction.
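To make the two modes concrete, here is a minimal sketch rather than a definitive recipe. It assumes vLLM is installed for the offline path and, for the online path, that a server (for example one started with `vllm serve`) is listening on `http://localhost:8000`. The model name `facebook/opt-125m`, the URL, and the helper function names are illustrative assumptions, not something from your setup.

```python
import json
import urllib.request


def generate_offline(prompts, model="facebook/opt-125m"):
    """Offline mode: run inference in-process through the Python API."""
    # Imported lazily so this file still loads where vLLM is not installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model)  # model name is an assumed example
    outputs = llm.generate(prompts, SamplingParams(max_tokens=32))
    return [out.outputs[0].text for out in outputs]


def build_completion_request(prompt, model="facebook/opt-125m",
                             base_url="http://localhost:8000"):
    """Online mode: build an HTTP request for an already running server."""
    # base_url is an assumption; point it at wherever your server listens.
    payload = {"model": model, "prompt": prompt, "max_tokens": 32}
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate_online(prompt, **kwargs):
    """Send the request to the server and return the generated text."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Both paths reach the same engine; the only difference is whether you call it in-process (`generate_offline`) or over HTTP (`generate_online`).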
If you have more questions or want clarification on any other vLLM concepts, feel free to ask!