OK, I'll paste one sample here:
I tried to build a 4-bit quantized model from the small original model 'Apertus8b'. I used the files published on Hugging Face under 'swiss-ai/Apertus-8B-Instruct-2509', and the following code to build an AutoAWQ quantization image I can run on my Rocky Linux machine (the quantized model is then meant to be served with vLLM):
# Dockerfile.build_awq_container
FROM docker.io/library/python:3.10-slim
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
# Install AutoAWQ and dependencies
RUN pip install --upgrade pip \
&& pip install autoawq torch transformers accelerate huggingface_hub
WORKDIR /app
COPY build_autoawq_model.py /app/build_model.py
CMD ["python", "/app/build_model.py"]
Then the Python script that does the build:
# build_autoawq_model.py
import os
from awq import AutoAWQForCausalLM
from transformers import AutoTokenizer
model_id = os.environ["MODEL_ID"]
output_dir = os.environ["OUTPUT_DIR"]
print(f"Downloading and quantizing model: {model_id}", flush=True)
# Load model and tokenizer
model = AutoAWQForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Quantize to 4 bit: AutoAWQ expects a quant_config dict (not wbits/group_size kwargs)
quant_config = {"zero_point": True, "q_group_size": 128, "w_bit": 4, "version": "GEMM"}
model.quantize(tokenizer, quant_config=quant_config)
# Save quantized model plus tokenizer and config
model.save_quantized(output_dir)
tokenizer.save_pretrained(output_dir)
print(f"✅ Model quantized and saved to {output_dir}", flush=True)
And the shell script that starts it:
#!/usr/bin/env bash
# build_apertus8b_awq_image.sh
set -euo pipefail
MODEL_ID="swiss-ai/Apertus-8B-Instruct-2509"
MODEL_NAME="apertus8b"
BUILD_CONTAINER_NAME="autoawq-builder-${MODEL_NAME}"
BUILD_IMAGE_NAME="autoawq-builder:${MODEL_NAME}"
BUILD_LOG_DIR="buildlogs"
OUTPUT_DIR="${MODEL_NAME}_AWQ_CACHE"
mkdir -p "$BUILD_LOG_DIR"
mkdir -p "$OUTPUT_DIR"
NOW=$(date '+%Y%m%d-%H%M%S')
LOG_FILE="$BUILD_LOG_DIR/build_${MODEL_NAME}_autoawq_$NOW.log"
# Build the image
podman build -f Dockerfile.build_awq_container -t "$BUILD_IMAGE_NAME" .
# Run the container
stdbuf -oL -eL podman run \
--rm \
--name "$BUILD_CONTAINER_NAME" \
--volume "$(pwd)/$OUTPUT_DIR:/output:Z" \
--env MODEL_ID="$MODEL_ID" \
--env OUTPUT_DIR="/output" \
"$BUILD_IMAGE_NAME" \
2>&1 | tee "$LOG_FILE"
This is the result I got:
Successfully tagged localhost/autoawq-builder:apertus8b
0b418bbdbbdea06b85ca491094091f55ff6a25be8330218785342d96131986af
/usr/local/lib/python3.10/site-packages/awq/__init__.py:21: DeprecationWarning:
I have left this message as the final dev message to help you transition.
Important Notice:
- AutoAWQ is officially deprecated and will no longer be maintained.
- The last tested configuration used Torch 2.6.0 and Transformers 4.51.3.
- If future versions of Transformers break AutoAWQ compatibility, please report the issue to the Transformers project.
Alternative:
- AutoAWQ has been adopted by the vLLM Project: https://github.com/vllm-project/llm-compressor
For further inquiries, feel free to reach out:
- X: https://x.com/casper_hansen_
- LinkedIn: https://www.linkedin.com/in/casper-hansen-804005170/
warnings.warn(_FINAL_DEV_MESSAGE, category=DeprecationWarning, stacklevel=1)
Traceback (most recent call last):
File "/app/build_model.py", line 2, in <module>
from awq import AutoAWQForCausalLM
File "/usr/local/lib/python3.10/site-packages/awq/__init__.py", line 24, in <module>
from awq.models.auto import AutoAWQForCausalLM
File "/usr/local/lib/python3.10/site-packages/awq/models/__init__.py", line 1, in <module>
from .mpt import MptAWQForCausalLM
File "/usr/local/lib/python3.10/site-packages/awq/models/mpt.py", line 1, in <module>
from .base import BaseAWQForCausalLM
File "/usr/local/lib/python3.10/site-packages/awq/models/base.py", line 49, in <module>
from awq.quantize.quantizer import AwqQuantizer
File "/usr/local/lib/python3.10/site-packages/awq/quantize/quantizer.py", line 11, in <module>
from awq.quantize.scale import apply_scale, apply_clip
File "/usr/local/lib/python3.10/site-packages/awq/quantize/scale.py", line 12, in <module>
from transformers.activations import NewGELUActivation, PytorchGELUTanh, GELUActivation
ImportError: cannot import name 'PytorchGELUTanh' from 'transformers.activations' (/usr/local/lib/python3.10/site-packages/transformers/activations.py)
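If I read the deprecation notice correctly, the ImportError comes from the transformers version that gets installed today: PytorchGELUTanh apparently still existed in the "last tested" Transformers 4.51.3 but has since been removed from transformers.activations, so a freshly built image pulls a release AutoAWQ can no longer import from. One thing that might get past the import (I have not verified this) would be pinning the pip install in the Dockerfile to that last tested configuration:
# Dockerfile.build_awq_container -- hypothetical pinned install, untested
RUN pip install --upgrade pip \
 && pip install "torch==2.6.0" "transformers==4.51.3" autoawq accelerate huggingface_hub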
So I tried exactly the same thing with LLMCompressor. I tried several variations, but nothing worked. The errors were always of the same type:
- library not found
- naming conflicts
- version conflicts