anon8231489123 committed on
Commit 14d6f0a
1 Parent(s): 7fc3159

Update README.md

Files changed (1):
  1. README.md +11 -1
README.md CHANGED
@@ -1,4 +1,14 @@
- (untested)
+ Update: Okay, there are two different models now: one generated with the Triton branch and one with CUDA. Use the CUDA one for now unless the Triton branch becomes widely used.
+ CUDA info (use this one):
+ Command:
+ CUDA_VISIBLE_DEVICES=0 python llama.py ./models/chavinlo-gpt4-x-alpaca \
+   --wbits 4 \
+   --true-sequential \
+   --groupsize 128 \
+   --save gpt-x-alpaca-13b-native-4bit-128g-cuda.pt
+
+
+ Previous info:
  GPTQ 4bit quantization of: https://huggingface.co/chavinlo/gpt4-x-alpaca
  Note: This was quantized with this branch of GPTQ-for-LLaMa: https://github.com/qwopqwop200/GPTQ-for-LLaMa/tree/triton
  Because of this, it appears to be incompatible with Oobabooga at the moment. Stay tuned?
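
For readers unfamiliar with the flags in the command above: `--wbits 4` stores each weight as a 4-bit integer, and `--groupsize 128` gives every run of 128 weights in a row its own scale and zero-point. The sketch below is a minimal illustration of that storage scheme only; it uses plain NumPy, hypothetical function names, and simple round-to-nearest rather than GPTQ's error-compensating updates, so it is not the actual quantizer llama.py runs.

```python
# Illustrative sketch of group-wise 4-bit quantization (not the GPTQ algorithm).
import numpy as np

def quantize_groups(w: np.ndarray, wbits: int = 4, groupsize: int = 128):
    """Asymmetric per-group quantization of a 2-D weight matrix."""
    qmax = 2 ** wbits - 1                        # 15 for 4-bit codes
    rows, cols = w.shape
    assert cols % groupsize == 0, "pad rows to a multiple of groupsize"
    g = w.reshape(rows, cols // groupsize, groupsize)
    lo = g.min(axis=-1, keepdims=True)           # per-group offset (zero-point)
    hi = g.max(axis=-1, keepdims=True)
    scale = np.where(hi > lo, (hi - lo) / qmax, 1.0)  # one scale per 128-weight group
    q = np.clip(np.round((g - lo) / scale), 0, qmax).astype(np.uint8)
    return q, scale, lo

def dequantize(q, scale, lo):
    """Reconstruct float weights from 4-bit codes plus per-group scale/offset."""
    rows = q.shape[0]
    return (q.astype(np.float32) * scale + lo).reshape(rows, -1)

w = np.random.randn(4, 256).astype(np.float32)   # toy weight matrix
q, scale, lo = quantize_groups(w)                # q holds integer codes 0..15
w_hat = dequantize(q, scale, lo)
print("max abs reconstruction error:", np.abs(w - w_hat).max())
```

Smaller group sizes track the weight distribution more closely at the cost of storing more scales and offsets; 128 is a common middle ground, hence the `128g` in the checkpoint name.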