askmyteapot committed on
Commit
bd9ae91
·
1 Parent(s): 17ff8ed

Create README.md

  1. README.md +19 -0
README.md ADDED
@@ -0,0 +1,19 @@
+ ## This is a 4-bit quant of https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b
+
+
+
+ ## My secret sauce:
+ * Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ fork
+ * Using C4 as the calibration dataset
+ * Act-order, True-sequential, percdamp 0.1
+ (<i>the default percdamp is 0.01</i>)
+ * No groupsize
+ * Runs with CUDA; does not need Triton.
+ * Quantization completed on a Google Colab instance with the 'Premium GPU' and 'High Memory' options.
+
+ ## Benchmark results
+
+ |<b>Model</b>|<b>C4</b>|<b>WikiText2</b>|<b>PTB</b>|
+ |:---:|---|---|---|
+ |MetaIX's FP16|6.98400259|4.607768536|9.414786339|
+ |This Quant|7.292364597|4.954069614|9.754593849|
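
The benchmark table gives raw scores only, so a quick way to read it is to compute how far the quant moves from the FP16 baseline on each dataset. The sketch below is not part of the commit. It assumes the figures are perplexities (lower is better), which the README does not state, and the names `fp16`, `quant`, and `relative_increase` are made up for illustration.

```python
# Figures copied from the benchmark table above. Treating them as perplexities
# (lower is better) is an assumption; the README does not name the metric.
fp16 = {"C4": 6.98400259, "WikiText2": 4.607768536, "PTB": 9.414786339}
quant = {"C4": 7.292364597, "WikiText2": 4.954069614, "PTB": 9.754593849}


def relative_increase(baseline: float, value: float) -> float:
    """Return how much `value` exceeds `baseline`, as a percentage."""
    return (value - baseline) / baseline * 100.0


# Percentage increase of the quantized score over FP16, per dataset.
increases = {name: relative_increase(fp16[name], quant[name]) for name in fp16}

for name, pct in increases.items():
    print(f"{name}: +{pct:.2f}%")
```

On these numbers the quant scores roughly 4.4% higher than FP16 on C4, 7.5% on WikiText2, and 3.6% on PTB.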