These quants were made with exllamav2 version 0.0.18.

If you have problems loading these models, please update Text Generation WebUI to the latest version.
## Perplexity Scoring

Below are the perplexity scores for the EXL2 models. A lower score is better.

_TODO:_ Coming soon
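
Perplexity is the exponential of the mean negative log-likelihood per token, which is why a lower score is better. As a quick illustration of that relationship (the per-token losses below are made up for the example, not taken from these models):

```shell
#!/bin/bash
# Hypothetical per-token negative log-likelihoods (illustrative values only)
LOSSES="1.2 0.8 1.5 1.1"

# Perplexity = exp(mean NLL); smaller losses give a smaller perplexity
echo "$LOSSES" | awk '{ for (i = 1; i <= NF; i++) sum += $i; printf "%.4f\n", exp(sum / NF) }'
# prints 3.1582
```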

### Perplexity Script

This was the script used for perplexity testing.

```bash
#!/bin/bash

source ~/miniconda3/etc/profile.d/conda.sh
conda activate exllamav2

# Set the model name and the bit precisions to test
MODEL_NAME="CodeQwen1.5-7B-Chat"
BIT_PRECISIONS=(8.0 7.0 6.0 5.0 4.0 3.5 3.0 2.75 2.5)

# Print the markdown table header
echo "| Quant Level | Perplexity Score |"
echo "|-------------|------------------|"

for BIT_PRECISION in "${BIT_PRECISIONS[@]}"
do
  MODEL_DIR="models/${MODEL_NAME}_exl2_${BIT_PRECISION}bpw"
  if [ -d "$MODEL_DIR" ]; then
    # Evaluate on wikitext-2 and parse the perplexity from the output
    output=$(python test_inference.py -m "$MODEL_DIR" -gs 17,24 -ed data/wikitext/wikitext-2-v1.parquet)
    score=$(echo "$output" | grep -oP 'Evaluation perplexity: \K[\d.]+')
    echo "| $BIT_PRECISION | $score |"
  fi
done
```
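
The `grep -oP` line in the script isolates the numeric score from the evaluator's log output; `\K` discards everything matched before it. A standalone check of that extraction, using a fabricated log line (real output depends on the model):

```shell
#!/bin/bash
# Fabricated evaluator output line (the real text comes from test_inference.py)
output="-- Evaluation perplexity: 4.5678"

# \K drops the matched prefix, so only the number is printed
score=$(echo "$output" | grep -oP 'Evaluation perplexity: \K[\d.]+')
echo "$score"   # prints 4.5678
```

Note that `-P` (Perl-compatible regular expressions) requires GNU grep.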
## Quant Details