
Chinese-Alpaca-2-7B-64K

This repository contains the GGUF-v3 version (llama.cpp compatible) of Chinese-Alpaca-2-7B-64K, which is tuned from Chinese-Alpaca-2-7B using the YaRN method to extend the context window to 64K tokens.

Performance

Metric: PPL, lower is better

| Quant | Original            | imatrix (`-im`)      |
|-------|---------------------|----------------------|
| Q2_K  | 9.8201 +/- 0.13298  | 10.3057 +/- 0.14197  |
| Q3_K  | 8.4435 +/- 0.11467  | 8.3556 +/- 0.11316   |
| Q4_0  | 8.3573 +/- 0.11496  | -                    |
| Q4_K  | 8.0558 +/- 0.10948  | 8.0557 +/- 0.10964   |
| Q5_0  | 8.0220 +/- 0.10954  | -                    |
| Q5_K  | 7.9388 +/- 0.10802  | 7.9440 +/- 0.10815   |
| Q6_K  | 7.9267 +/- 0.10792  | 7.9126 +/- 0.10775   |
| Q8_0  | 7.9117 +/- 0.10773  | -                    |
| F16   | 7.9124 +/- 0.10780  | -                    |

Models in the -im column were quantized with an importance matrix, which generally (though not always) yields better performance.
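For reference, perplexity (PPL) is the exponential of the average negative log-likelihood per token, which is why lower values are better. A minimal sketch of the definition (illustrative only, not the llama.cpp implementation):

```python
import math

def perplexity(token_probs):
    """PPL = exp of the mean negative log-likelihood over the tokens."""
    nll = [-math.log(p) for p in token_probs]
    return math.exp(sum(nll) / len(nll))

# A model that assigns probability 0.5 to every token has a perplexity of 2:
# it is, on average, as uncertain as a fair coin flip per token.
print(perplexity([0.5, 0.5, 0.5, 0.5]))
```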

Others

For the full model in Hugging Face format, please see: https://huggingface.co/hfl/chinese-alpaca-2-7b-64k

Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.
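A typical llama.cpp invocation for one of these GGUF files might look like the following. The model file name, prompt, and flags are assumptions for illustration; check the options of your llama.cpp build, and note that the 64K context requires a build with YaRN RoPE-scaling support (recent builds read the YaRN parameters from the GGUF metadata).

```shell
# Illustrative only: file name and flags are assumptions, adjust to your setup.
# -m selects the quantized model file, -c sets the context length in tokens,
# -p supplies the prompt ("Please introduce China's Four Great Inventions.").
./main -m chinese-alpaca-2-7b-64k.Q4_K.gguf \
       -c 65536 \
       -p "请介绍一下中国的四大发明。"
```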

Model size: 6.93B params
Architecture: llama