
Chinese-Alpaca-2-7B-64K

This repository contains the GGUF-v3 version (llama.cpp compatible) of Chinese-Alpaca-2-7B-64K, which is tuned from Chinese-Alpaca-2-7B using the YaRN method to extend the context window to 64K tokens.

Performance

Metric: PPL, lower is better

| Quant | Original            | imatrix (`-im`)      |
|-------|---------------------|----------------------|
| Q2_K  | 9.8201 +/- 0.13298  | 10.3057 +/- 0.14197  |
| Q3_K  | 8.4435 +/- 0.11467  | 8.3556 +/- 0.11316   |
| Q4_0  | 8.3573 +/- 0.11496  | -                    |
| Q4_K  | 8.0558 +/- 0.10948  | 8.0557 +/- 0.10964   |
| Q5_0  | 8.0220 +/- 0.10954  | -                    |
| Q5_K  | 7.9388 +/- 0.10802  | 7.9440 +/- 0.10815   |
| Q6_K  | 7.9267 +/- 0.10792  | 7.9126 +/- 0.10775   |
| Q8_0  | 7.9117 +/- 0.10773  | -                    |
| F16   | 7.9124 +/- 0.10780  | -                    |

Models in the -im column were quantized with an importance matrix, which generally (though not always) yields better performance.
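For reference, perplexity (PPL) is the exponential of the average negative log-likelihood per token, which is why lower values are better. A minimal sketch of the definition (illustrative only, not the llama.cpp implementation):

```python
import math

def perplexity(token_probs):
    """PPL = exp of the mean negative log-likelihood over the tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that assigns probability 0.5 to every token has a perplexity of 2:
# it is, on average, as uncertain as a fair coin flip per token.
print(perplexity([0.5, 0.5, 0.5, 0.5]))
```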

Others

For the full model in Hugging Face format, please see: https://huggingface.co/hfl/chinese-alpaca-2-7b-64k

Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.
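A typical llama.cpp invocation for one of these GGUF files might look like the following. The model file name, prompt, and flags are assumptions for illustration; check the options of your llama.cpp build, and note that the 64K context requires a build with YaRN RoPE-scaling support (recent builds read the YaRN parameters from the GGUF metadata).

```shell
# Illustrative only: file name and flags are assumptions, adjust to your setup.
# -m selects the quantized model file, -c sets the context length in tokens,
# -p supplies the prompt ("Please introduce China's Four Great Inventions.").
./main -m chinese-alpaca-2-7b-64k.Q4_K.gguf \
       -c 65536 \
       -p "请介绍一下中国的四大发明。"
```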

Model size: 6.93B params
Architecture: llama