Error

#1
by Hardcore7651 - opened

Tried on an A100 and 3x A5000s in text-generation-webui on RunPod; both times I got this:
ImportError: cannot import name 'ExLlamaV2Cache_Q4' from 'exllamav2' (/usr/local/lib/python3.10/dist-packages/exllamav2/__init__.py)

Owner

Hi. This appears to be an issue with the exllamav2 version you're running: https://github.com/TheBlokeAI/dockerLLM/issues/17

To be fair, I wasn't able to test anything above the 3.5 quants on my home dual-3090 setup. I'm running through them all now on a RunPod A100 80GB system to make sure they're good. In the meantime, try updating your exllamav2 with pip as per the instructions in the GitHub issue.
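A quick way to check whether your installed exllamav2 is the problem before reinstalling: compare its version against the release that introduced `ExLlamaV2Cache_Q4`. This is just a sketch; the `0.0.15` threshold below is my assumption, so verify it against the exllamav2 changelog before relying on it.

```python
# Sketch: check whether the installed exllamav2 is new enough to export
# ExLlamaV2Cache_Q4. The minimum version below is an ASSUMPTION; verify it
# against the exllamav2 release notes.
from importlib.metadata import PackageNotFoundError, version

ASSUMED_MIN = (0, 0, 15)  # hypothetical first version with ExLlamaV2Cache_Q4

def has_q4_cache(pkg: str = "exllamav2") -> bool:
    """Return True if `pkg` is installed at or above ASSUMED_MIN."""
    try:
        raw = version(pkg).split("+")[0]  # drop local tags like "+cu121"
        installed = tuple(int(p) for p in raw.split(".")[:3])
    except PackageNotFoundError:
        return False  # not installed at all
    return installed >= ASSUMED_MIN

if __name__ == "__main__":
    print("Q4 cache expected:", has_q4_cache())
```

If this prints `False`, upgrading exllamav2 with pip (as in the GitHub issue) should resolve the ImportError.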

I was able to run 3.0 through 5.0 quants on a Runpod A100 80GB instance using the below commands, so they should all be working fine.

cd /workspace

python -m pip install --upgrade pip

pip uninstall torch torchaudio torchvision -y

# ExllamaV2
git clone https://github.com/turboderp/exllamav2
cd exllamav2
pip install -r requirements.txt

pip install hf_transfer huggingface_hub[hf_transfer]
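Note that per the huggingface_hub documentation, installing hf_transfer alone isn't enough: the accelerated download path is opt-in and only used when the corresponding environment variable is set.

```shell
# hf_transfer is opt-in: huggingface_hub only uses it when this is set.
export HF_HUB_ENABLE_HF_TRANSFER=1
```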

# Test Inference on Llama 7B
huggingface-cli download --local-dir-use-symlinks=False --revision 5.0bpw --local-dir turboderp_Llama2-7B-exl2_5.0bpw turboderp/Llama2-7B-exl2

python test_inference.py -m turboderp_Llama2-7B-exl2_5.0bpw -p "Once upon a time,"

rm -r turboderp_Llama2-7B-exl2_5.0bpw

# Download and inference on Midnight quants
## 3.0
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_3.0bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_3.0bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_3.0bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_3.0bpw

## 3.5
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_3.5bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_3.5bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_3.5bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_3.5bpw

## 3.75
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_3.75bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_3.75bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_3.75bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_3.75bpw

## 4.0
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_4.0bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_4.0bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_4.0bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_4.0bpw

## 4.25
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_4.25bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_4.25bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_4.25bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_4.25bpw

## 4.5
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_4.5bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_4.5bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_4.5bpw -p "Once upon a time,"

rm -r Dracones_Midnight-Miqu-103B-v1.0_exl2_4.5bpw

## 5.0
huggingface-cli download --local-dir-use-symlinks=False --local-dir Dracones_Midnight-Miqu-103B-v1.0_exl2_5.0bpw Dracones/Midnight-Miqu-103B-v1.0_exl2_5.0bpw

python test_inference.py -m Dracones_Midnight-Miqu-103B-v1.0_exl2_5.0bpw -p "Once upon a time,"
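The per-quant download/test/cleanup steps above are identical apart from the bpw value, so they can be collapsed into a loop. Shown here as a dry run that only prints each command (remove the `echo`s to actually execute them):

```shell
# Dry-run sketch: print the download/test/cleanup commands for every quant
# size covered above. Drop the `echo`s to run them for real.
MODEL="Dracones/Midnight-Miqu-103B-v1.0_exl2"
for BPW in 3.0 3.5 3.75 4.0 4.25 4.5 5.0; do
  DIR="Dracones_Midnight-Miqu-103B-v1.0_exl2_${BPW}bpw"
  echo huggingface-cli download --local-dir-use-symlinks=False --local-dir "$DIR" "${MODEL}_${BPW}bpw"
  echo python test_inference.py -m "$DIR" -p "Once upon a time,"
  echo rm -r "$DIR"
done
```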
Owner

(attached screenshot: Screenshot_20240308_124944.jpg)

Dracones changed discussion status to closed
