sardukar/llama7b-4bit-v2

sardukar committed on Apr 6, 2023

Commit c024d47 • 1 Parent(s): 8063d28

Create README.md

Files changed (1)

README.md +11 -0

README.md ADDED

	@@ -0,0 +1,11 @@

+---
+metrics: null
+---
+Meta AI's [LLaMA](https://arxiv.org/abs/2302.13971) 7B, quantized to 4 bits with the [GPTQ](https://arxiv.org/abs/2210.17323v2) algorithm (v2).
+GPTQ implementation: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/49efe0b67db4b40eac2ae963819ebc055da64074
+Conversion process:
+```sh
+CUDA_VISIBLE_DEVICES=0 python llama.py ./llama-7b c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors ./q4/llama7b-4bit-ts-ao-g128-v2.safetensors
+```
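
The conversion command above can also be written with its settings pulled out into variables, which makes it easier to see how each flag relates to the output filename. This is only a sketch and not part of the commit. The variable names (`MODEL_DIR`, `DATASET`, `WBITS`, `GROUPSIZE`, `OUT_DIR`, `RUN`) are hypothetical, and reading the filename suffix as `ts` = `--true-sequential`, `ao` = `--act-order`, `g128` = `--groupsize 128` is an assumption inferred from the filename itself. The script only prints the command unless `RUN=1` is set.

```sh
# Hypothetical wrapper around the conversion command from the README.
# All variable names are illustrative; values are copied from that command.
MODEL_DIR=./llama-7b   # directory holding the unquantized LLaMA 7B weights
DATASET=c4             # calibration dataset argument passed to llama.py
WBITS=4                # target weight precision in bits
GROUPSIZE=128          # value passed to --groupsize
OUT_DIR=./q4           # output directory for the quantized checkpoint

# Assumed naming convention: ts = --true-sequential, ao = --act-order,
# g128 = --groupsize 128.
OUT_FILE="${OUT_DIR}/llama7b-${WBITS}bit-ts-ao-g${GROUPSIZE}-v2.safetensors"

CMD="python llama.py ${MODEL_DIR} ${DATASET} --wbits ${WBITS} --true-sequential --act-order --groupsize ${GROUPSIZE} --save_safetensors ${OUT_FILE}"

# Dry run by default: print the command. Set RUN=1 to execute it on GPU 0.
if [ "${RUN:-0}" = "1" ]; then
    mkdir -p "${OUT_DIR}"
    CUDA_VISIBLE_DEVICES=0 ${CMD}
else
    echo "CUDA_VISIBLE_DEVICES=0 ${CMD}"
fi
```

With the defaults shown, the printed command matches the one in the README, so changing a single variable (for example `GROUPSIZE`) updates both the flag and the output filename together.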