Python-only build does not work in Docker

I use the Python-only build to install vLLM in editable mode after the dev stage of the official vLLM Dockerfile.

RUN git clone https://github.com/vllm-project/vllm.git /workspace/vllm
WORKDIR /workspace/vllm
RUN git checkout d4d309409f2396e68e4b5a67ede194913502388b
RUN VLLM_USE_PRECOMPILED=1 uv pip install --system -e . --verbose 

The Docker image builds without error. However, when I try to run some unit tests in the container, I get these error messages.

 root@666152c8f0da:/workspace/vllm# python3 -m pytest -v -s tests/v1/core/test_async_scheduler.py
INFO 08-31 18:37:57 [__init__.py:253] Automatically detected platform cuda.
ImportError while loading conftest '/workspace/vllm/tests/conftest.py'.
tests/conftest.py:20: in <module>
    from tests.models.utils import (TokensTextLogprobs,
tests/models/utils.py:11: in <module>
    from vllm.config import ModelConfig, TaskOption
vllm/config.py:35: in <module>
    from vllm.model_executor.layers.quantization import QuantizationMethods
vllm/model_executor/__init__.py:4: in <module>
    from vllm.model_executor.parameter import (BasevLLMParameter,
vllm/model_executor/parameter.py:10: in <module>
    from vllm.distributed import get_tensor_model_parallel_rank
vllm/distributed/__init__.py:4: in <module>
    from .communication_op import *
vllm/distributed/communication_op.py:9: in <module>
    from .parallel_state import get_tp_group
vllm/distributed/parallel_state.py:150: in <module>
    from vllm.platforms import current_platform
vllm/platforms/__init__.py:285: in __getattr__
    _current_platform = resolve_obj_by_qualname(
vllm/utils/__init__.py:2510: in resolve_obj_by_qualname
    module = importlib.import_module(module_name)
vllm/platforms/cuda.py:18: in <module>
    import vllm._C  # noqa
E   ImportError: /workspace/vllm/vllm/_C.abi3.so: undefined symbol: _ZN3c104cuda9SetDeviceEab

It looks like the wheels are not installed.

Your error (ImportError: /workspace/vllm/vllm/_C.abi3.so: undefined symbol: _ZN3c104cuda9SetDeviceEab) indicates a missing or incompatible compiled library, not a missing wheel. This typically happens when you use the Python-only build (VLLM_USE_PRECOMPILED=1 uv pip install -e .) but your environment requires a full build with C++/CUDA compilation, or when the precompiled wheel is incompatible with your CUDA/PyTorch versions. The Python-only build cannot be used if you need to run C++/CUDA code, such as for unit tests or kernel development; see the warning in the docs and similar issues in the community discussions (docs, issue example).
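For a quick sanity check (a sketch; exact paths depend on your image), you can verify whether the PyTorch installed in the container actually exports the symbol the precompiled extension expects:

python3 -c "import torch; print(torch.__version__, torch.version.cuda)"          # torch and CUDA versions actually installed
TORCH_LIB=$(python3 -c "import torch, os; print(os.path.join(os.path.dirname(torch.__file__), 'lib'))")
nm -D "$TORCH_LIB/libc10_cuda.so" | grep -i setdevice                            # does this torch export the symbol?
c++filt _ZN3c104cuda9SetDeviceEab                                                # demangles to a c10::cuda::SetDevice(...) overload

If the symbol is missing from your torch's libc10_cuda.so, the precompiled extension was built against a different PyTorch than the one installed in the image.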

To fix this, build vLLM from source with compilation enabled. Remove VLLM_USE_PRECOMPILED=1 and run:

uv pip install -e .

or, if using pip:

pip install -e .

This will compile the necessary C++/CUDA extensions. If you need to limit memory usage during the build, set MAX_JOBS=4 or similar (docs). Would you like a step-by-step breakdown?
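For example, a minimal sketch of the corresponding Dockerfile stage (same checkout as above; MAX_JOBS=4 is only an illustrative value, and the CUDA toolkit must be available in the build stage):

RUN git clone https://github.com/vllm-project/vllm.git /workspace/vllm
WORKDIR /workspace/vllm
RUN git checkout d4d309409f2396e68e4b5a67ede194913502388b
# no VLLM_USE_PRECOMPILED here, so the C++/CUDA extensions are compiled; MAX_JOBS caps build parallelism
RUN MAX_JOBS=4 uv pip install --system -e . --verbose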


Same problem. I tried:

pip install -e .

I tried both the Python-only build and C++/CUDA compilation in a customized vLLM with CUDA 12.6 and torch 2.7.0:

cd /vllm-internal
export SETUPTOOLS_SCM_PRETEND_VERSION_FOR_VLLM="0.9.1"  # because it's an internal version with no git tag
pip install -e . 

If you still get the undefined symbol error after running pip install -e . (full build), it usually means there is a mismatch between your vLLM, CUDA, and PyTorch versions, or the build is not picking up the correct CUDA/PyTorch libraries. This can also happen if you have multiple PyTorch versions or conflicting CUDA libraries in your environment (issue reference, docs).

Make sure:

  1. The CUDA version used to build vLLM matches the one used to build PyTorch.
  2. You are using the same Python environment for both PyTorch and vLLM.
  3. No old vLLM .so files are left in your environment (clean up build/ and *.so files).
  4. If using a custom PyTorch, run python use_existing_torch.py before building vLLM (docs).
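A rough sketch of those checks inside the container (the commands are illustrative and assume you are in the vLLM source directory):

python3 -c "import torch; print(torch.__version__, torch.version.cuda)"   # 1. CUDA that PyTorch was built with
nvcc --version                                                             # 1. CUDA toolkit that will build vLLM
pip show torch vllm | grep -E "^(Name|Location)"                           # 2. both live in the same environment
git clean -xdf && rm -rf build/                                            # 3. removes untracked files, including stale *.so
python use_existing_torch.py                                               # 4. only when building against a custom PyTorch
pip install -e . --no-build-isolation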

Would you like a step-by-step troubleshooting guide?


Following your guide, I solved this. Thanks!