askmyteapot/GPT4-X-Alpasta-30b-4bit

askmyteapot committed on May 17, 2023

Commit bd9ae91 · 1 Parent(s): 17ff8ed

Create README.md

Files changed (1)

README.md ADDED +19 -0

	@@ -0,0 +1,19 @@

+## This is a 4bit quant of https://huggingface.co/MetaIX/GPT4-X-Alpasta-30b
+## My secret sauce:
+  * Using commit <a href="https://github.com/0cc4m/GPTQ-for-LLaMa/tree/3c16fd9c7946ebe85df8d951cb742adbc1966ec7">3c16fd9</a> of 0cc4m's GPTQ-for-LLaMa fork
+  * Using C4 as the calibration dataset
+  * Act-order, True-sequential, percdamp 0.1
+     (<i>the default percdamp is 0.01</i>)
+  * No groupsize
+  * Runs with CUDA; does not need Triton.
+  * Quantization completed on a Google Colab instance with the 'Premium GPU' and 'High Memory' options.
+## Benchmark results
+|<b>Model</b>|<b>C4</b>|<b>WikiText2</b>|<b>PTB</b>|
+|:---:|---|---|---|
+|MetaIX's FP16|6.98400259|4.607768536|9.414786339|
+|This Quant|7.292364597|4.954069614|9.754593849|
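
To put the benchmark table in perspective, the relative gap between the FP16 model and this quant can be computed per dataset. The sketch below is not part of the committed README. It assumes the figures are perplexities (lower is better), which is what GPTQ-style evaluations on C4, WikiText2 and PTB usually report, although the README does not say so explicitly. The dictionary and function names are illustrative.

```python
# Values copied verbatim from the benchmark table in the README.
# Assumption: these are perplexities, so lower is better.
fp16 = {"C4": 6.98400259, "WikiText2": 4.607768536, "PTB": 9.414786339}
quant = {"C4": 7.292364597, "WikiText2": 4.954069614, "PTB": 9.754593849}


def relative_increase(base, new):
    """Return the percentage increase of `new` over `base` for each dataset."""
    return {name: (new[name] - base[name]) / base[name] * 100.0 for name in base}


increase = relative_increase(fp16, quant)

for name, pct in increase.items():
    print(f"{name}: {fp16[name]:.3f} -> {quant[name]:.3f} (+{pct:.2f}%)")
```

On these numbers the quant is within a few percent of FP16 on C4 and PTB, and the gap is largest on WikiText2.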