Create README.md
README.md
ADDED
@@ -0,0 +1,11 @@
---
metrics: null
---

Meta AI's [LLaMA](https://arxiv.org/abs/2302.13971), quantized to 4 bits with the [GPTQ](https://arxiv.org/abs/2210.17323v2) algorithm (v2).

GPTQ implementation: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/49efe0b67db4b40eac2ae963819ebc055da64074

Conversion process:

```sh
CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-7b c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors ./q4/llama7b-4bit-ts-ao-g128-v2.safetensors
```
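
To give a rough feel for what `--wbits 4` and `--groupsize 128` control, here is a minimal pure-Python sketch of group-wise 4-bit quantization. It is an illustration under stated assumptions, not the GPTQ algorithm and not code from GPTQ-for-LLaMa: it uses plain round-to-nearest, whereas GPTQ also compensates rounding error using second-order information. The function names and the synthetic `row` are hypothetical.

```python
# Illustrative only: round-to-nearest group-wise quantization.
# This is NOT GPTQ; all names here are hypothetical.

def quantize_group(weights, wbits=4):
    """Map one group of floats to unsigned integer codes in [0, 2**wbits - 1]."""
    qmax = 2 ** wbits - 1                # 15 for 4 bits
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / qmax or 1.0      # fall back to 1.0 for a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, lo):
    """Recover approximate floats from the codes and the group's scale/offset."""
    return [c * scale + lo for c in codes]

def quantize_row(row, wbits=4, groupsize=128):
    """Split a weight row into groups; each group keeps its own scale and offset."""
    return [
        quantize_group(row[start:start + groupsize], wbits)
        for start in range(0, len(row), groupsize)
    ]

# Synthetic weights in [-0.5, 0.5], two groups of 128 values each.
row = [((i * 37) % 101 - 50) / 100.0 for i in range(256)]
groups = quantize_row(row)
restored = [w for g in groups for w in dequantize_group(*g)]
max_err = max(abs(a - b) for a, b in zip(row, restored))
```

In this sketch, a smaller group size means more scale/offset pairs to store but a tighter range per group, so the per-weight rounding error stays within half of that group's `scale`.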