iproskurina committed Update README.md (commit 3070dff, parent f713e7b)

README.md CHANGED
@@ -38,13 +38,25 @@ The grouping size used for quantization is equal to 128.
 
 ### Install the necessary packages
 
+Requires: Transformers 4.33.0 or later, Optimum 1.12.0 or later, and AutoGPTQ 0.4.2 or later.
+
+```shell
+pip3 install --upgrade transformers optimum
+# If using PyTorch 2.1 + CUDA 12.x:
+pip3 install --upgrade auto-gptq
+# or, if using PyTorch 2.1 + CUDA 11.x:
+pip3 install --upgrade auto-gptq --extra-index-url https://huggingface.github.io/autogptq-index/whl/cu118/
+```
+
+If you are using PyTorch 2.0, you will need to install AutoGPTQ from source. Likewise if you have problems with the pre-built wheels, you should try building from source:
+
 ```shell
-
-git clone https://github.com/
+pip3 uninstall -y auto-gptq
+git clone https://github.com/PanQiWei/AutoGPTQ
 cd AutoGPTQ
-
+git checkout v0.5.1
+pip3 install .
 ```
-Recommended transformers version: 4.35.2.
 
 ### You can then use the following code
 
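The new text sets a version floor (Transformers 4.33.0, Optimum 1.12.0, AutoGPTQ 0.4.2) rather than the single recommended version the old text gave. A minimal sketch of checking an installed version against such a floor, using only dotted numeric version strings and the standard library (the helper names here are illustrative, not from the README):

```python
# Compare dotted version strings against the minimums stated in the
# README diff. Assumes plain numeric versions like "4.35.2".

MINIMUMS = {
    "transformers": "4.33.0",
    "optimum": "1.12.0",
    "auto-gptq": "0.4.2",
}

def version_tuple(v: str) -> tuple:
    """Turn '4.35.2' into (4, 35, 2) for lexicographic comparison."""
    return tuple(int(part) for part in v.split("."))

def meets_minimum(package: str, installed: str) -> bool:
    """True if the installed version satisfies the README's floor."""
    return version_tuple(installed) >= version_tuple(MINIMUMS[package])

# The old README recommended transformers 4.35.2, which also
# satisfies the new 4.33.0 floor.
print(meets_minimum("transformers", "4.35.2"))  # True
```

Note that tuple comparison handles multi-digit components correctly (e.g. 4.33.0 > 4.9.0), which naive string comparison would get wrong.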