Hi forum,
With vLLM 0.9.0 plus some fixes (issue #14452; sorry, I can't attach more than 2 links), I was able to run my internal w4a16-quantized model. But another issue is blocking me: when I try to use an adapter with this model, I hit the error below:
python3: /project/lib/Dialect/TritonGPU/Transforms/AccelerateMatmul.cpp:42: int mlir::triton::gpu::(anonymous namespace)::getMMAVersionSafe(int, DotOp): Assertion `false && "computeCapability not supported"' failed.
Has anybody hit the same issue? I believe it's a Triton issue: I found some fixes applied in releases after pytorch_triton-3.3.0 (which vLLM 0.9.0 uses), and it seems the relevant PR landed in 3.3.1. So, to better support Blackwell, updating Triton apparently would be required. I'd appreciate any comments on this. (I'm giving it a try now...)
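For context, the assertion in AccelerateMatmul.cpp fires when Triton's MMA-version lookup doesn't recognize the GPU's compute capability. Here's a minimal Python sketch of that kind of check; the mapping below is purely hypothetical for illustration, not Triton's actual table:

```python
def get_mma_version(compute_capability: int) -> int:
    """Sketch of a capability -> MMA-version lookup.

    The mapping here is a placeholder, NOT Triton's real table. The point
    is the failure mode: an older release that doesn't know a newer
    capability (e.g. Blackwell) hits the "not supported" assertion
    instead of returning a version.
    """
    known = {80: 2, 90: 3}  # hypothetical entries for illustration only
    if compute_capability not in known:
        raise AssertionError("computeCapability not supported")
    return known[compute_capability]
```

If that's what's happening, a Triton build that adds Blackwell's capability to the table (as the 3.3.1 fixes appear to do) should make the assertion go away.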