sardukar committed on
Commit 84094fd
1 Parent(s): b4f84b3

Update README.md

Files changed (1):
  1. README.md +11 -1
README.md CHANGED
@@ -4,6 +4,16 @@ metrics: null
 
 Quantized Meta AI's [LLaMA](https://arxiv.org/abs/2302.13971) in 4bit with the help of [GPTQ](https://arxiv.org/abs/2210.17323v2) algorithm v2.
 
+- [**llama13b-4bit-ts-ao-g128-v2.safetensors**](https://huggingface.co/sardukar/llama13b-4bit-v2/blob/main/llama13b-4bit-ts-ao-g128-v2.safetensors)
+GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/49efe0b67db4b40eac2ae963819ebc055da64074
+
+Conversion process:
+```sh
+CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-13b c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors ./q4/llama13b-4bit-ts-ao-g128-v2.safetensors
+```
+
+
+- [llama13b-4bit-v2.safetensors](https://huggingface.co/sardukar/llama13b-4bit-v2/blob/main/llama13b-4bit-v2.safetensors)
 GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/841feedde876785bc8022ca48fd9c3ff626587e2
 
 **Note:** This model will fail to load with current GPTQ-for-LLaMa implementation
@@ -11,4 +21,4 @@ GPTQ implementation - https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/841feedde876785bc8022ca48fd9c3ff626587e2
 Conversion process
 ```sh
 CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-13b c4 --wbits 4 --true-sequential --act-order --save_safetensors ./q4/llama13b-4bit-v2.safetensors
-```
+```
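Both conversion commands write their output via `--save_safetensors`. A safetensors file begins with an 8-byte little-endian header length followed by that many bytes of JSON describing every stored tensor, so the tensor names, dtypes, and shapes of a quantized checkpoint can be inspected without loading the model. A minimal sketch, assuming only the safetensors container format — `read_safetensors_header` is a hypothetical helper, not part of GPTQ-for-LLaMa:

```python
import json
import struct

def read_safetensors_header(path):
    """Return the JSON header of a .safetensors file.

    Layout: the first 8 bytes are a little-endian u64 giving the
    header length N; the next N bytes are a JSON object mapping
    tensor names to {"dtype", "shape", "data_offsets"} entries.
    The raw tensor data follows the header.
    """
    with open(path, "rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))
```

For example, `read_safetensors_header("./q4/llama13b-4bit-v2.safetensors")` would list the quantized tensors produced by the conversion above.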