|
---
license: cc-by-sa-4.0
datasets:
- mc4
- cc100
- oscar
- togethercomputer/RedPajama-Data-1T
language:
- ja
- en
base_model: meta-llama/Llama-2-70b-hf
pipeline_tag: text-generation
tags:
- llama
- llama-2
---
|
|
|
This repository contains the exl2 quants of [karakuri-ai/karakuri-lm-70b-v0.1](https://huggingface.co/karakuri-ai/karakuri-lm-70b-v0.1), calibrated using [the default dataset built by exllamav2](https://github.com/turboderp/exllamav2/blob/3b0f5230e9619fc0ddf1f302785049d10a65fe26/conversion/tokenize.py#L55). |
|
|
|
The quants are compatible with [exllamav2](https://github.com/turboderp/exllamav2) version 0.0.11 and later. For loading and serving the model, [tabbyAPI](https://github.com/theroyallab/tabbyAPI) is recommended.
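If you prefer to load a quant directly through the exllamav2 Python API instead of tabbyAPI, a minimal sketch might look like the following. The model directory is a placeholder for wherever a quant has been downloaded locally (see the note on branches below), and the sampling settings are arbitrary examples.

```python
from exllamav2 import ExLlamaV2, ExLlamaV2Config, ExLlamaV2Cache, ExLlamaV2Tokenizer
from exllamav2.generator import ExLlamaV2BaseGenerator, ExLlamaV2Sampler

# Placeholder path: point this at a locally downloaded quant.
model_dir = "./karakuri-lm-70b-v0.1-exl2-2.65bpw-h6"

config = ExLlamaV2Config()
config.model_dir = model_dir
config.prepare()

model = ExLlamaV2(config)
cache = ExLlamaV2Cache(model, lazy=True)  # allocate the cache while the model loads
model.load_autosplit(cache)               # split layers across available GPUs

tokenizer = ExLlamaV2Tokenizer(config)
generator = ExLlamaV2BaseGenerator(model, cache, tokenizer)

# Example sampling settings; tune to taste.
settings = ExLlamaV2Sampler.Settings()
settings.temperature = 0.8
settings.top_p = 0.9

output = generator.generate_simple("こんにちは、", settings, num_tokens=128)
print(output)
```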
|
|
|
The measurement file is stored in the `main` branch, and all quants are stored in their respective branches.
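Because each quant lives in its own branch, you can download just the one you need by passing the branch name as the revision. A minimal sketch using `huggingface_hub` (the local directory name is an arbitrary example):

```python
from huggingface_hub import snapshot_download

# Download only the 2.65bpw-h6 quant from its branch.
snapshot_download(
    repo_id="MatrixC7/karakuri-lm-70b-v0.1-exl2",
    revision="2.65bpw-h6",
    local_dir="./karakuri-lm-70b-v0.1-exl2-2.65bpw-h6",
)
```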
|
|
|
The table below lists the calibration perplexity and [wikitext-103-v1](https://huggingface.co/datasets/wikitext) test perplexity for all provided quants.
|
|
|
| Quant | Calibration Perplexity | wikitext-103-v1 Test Perplexity |
|-----------------------------------------------------------------------------------------|------------------------|---------------------------------|
| [2.4bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/2.4bpw-h6)    | 8.4726                 | 7.1337                          |
| ~~2.65bpw-h6~~*                                                                           | ~~8.1743~~             | ~~8.7475~~                      |
| [2.65bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/2.65bpw-h6)  | 8.0901                 | 6.5724                          |
| [3.0bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/3.0bpw-h6)    | 7.9927                 | 6.4607                          |
| [4.0bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/4.0bpw-h6)    | 7.6440                 | 5.8014                          |
| [4.65bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/4.65bpw-h6)  | 7.5872                 | 5.7112                          |
| [5.0bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/5.0bpw-h6)    | 7.5745                 |                                 |
| [6.0bpw-h6](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/6.0bpw-h6)    | 7.5616                 |                                 |
| [8.0bpw-h8](https://huggingface.co/MatrixC7/karakuri-lm-70b-v0.1-exl2/tree/8.0bpw-h8)    | 7.5604                 |                                 |
|
|
|
*: The first 2.65bpw-h6 quant has been deprecated due to its unexpectedly high wikitext-103-v1 test perplexity.