Hi, I found the vLLM engine args on the “Engine Arguments” page of the vLLM docs, along with a brief description of what each arg does.
Is there a tidy list of engine args with their ramifications, consequences, side effects and such?
Or would the answer be more like “You need to learn all the features to get the full picture of what each arg does”? Which, yeah, I totally get and agree with.
Just wondering if there’s a “level 2 support (lol)” engine args page that can help folks like me with 16 GB of VRAM (or some other local issue) be productive as consumers of vLLM, even when we are also our own everything-involved support and engineering team.
Also, it’s just very satisfying to figure out a full command line that gets a model up and serving despite giving it fewer resources than it wants, even if we’re not entirely sure (yet) how it worked.
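In case it helps anyone in the same boat, here’s a rough sketch of the kind of thing I mean. It’s not an official recipe: it just assembles a `vllm serve` command from the memory-related flags I understand best (`--max-model-len`, `--gpu-memory-utilization`, `--max-num-seqs`, `--enforce-eager`) and does a back-of-the-envelope worst-case KV cache estimate. The model name `your-org/your-model`, the helper names, the default values and the layer/head numbers are all placeholders I made up for illustration, so check your own model’s config before trusting any of it.

```python
import shlex


def build_serve_command(model, max_model_len=4096, gpu_memory_utilization=0.90,
                        max_num_seqs=8, enforce_eager=True):
    """Assemble a `vllm serve` argv list for a memory-constrained GPU.

    The defaults here are illustrative guesses, not recommendations.
    """
    args = [
        "vllm", "serve", model,
        # Caps the context length; a lower cap means less KV cache is needed per sequence.
        "--max-model-len", str(max_model_len),
        # Fraction of GPU memory vLLM is allowed to claim for weights, activations and KV cache.
        "--gpu-memory-utilization", str(gpu_memory_utilization),
        # Upper bound on sequences batched together; fewer means less KV cache pressure.
        "--max-num-seqs", str(max_num_seqs),
    ]
    if enforce_eager:
        # Skips CUDA graph capture: saves some memory, usually costs some speed.
        args.append("--enforce-eager")
    return args


def kv_cache_gib(num_layers, num_kv_heads, head_dim, max_model_len,
                 max_num_seqs, bytes_per_elem=2):
    """Worst-case KV cache size in GiB if every sequence reaches max_model_len.

    The leading factor of 2 covers keys and values; bytes_per_elem=2 assumes
    a 16-bit cache dtype.
    """
    per_token_bytes = 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem
    return per_token_bytes * max_model_len * max_num_seqs / 2**30


if __name__ == "__main__":
    cmd = build_serve_command("your-org/your-model")  # placeholder model name
    print(shlex.join(cmd))
    # Illustrative model shape only; read the real values from the model's config.json.
    print(f"{kv_cache_gib(32, 8, 128, 4096, 8):.2f} GiB worst-case KV cache")
```

The estimate is deliberately pessimistic (every sequence at full length at once), and it ignores the weights themselves, so treat it as a way to see which knob moves the number rather than as a prediction of what vLLM will actually allocate.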
EDIT: vLLM’s “Ask AI” has been very helpful in this area