Chinese-Alpaca-2-13B-16K-GGUF

This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-Alpaca-2-13B-16K.
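
As a quick sanity check on a downloaded file, the GGUF header can be read directly. The sketch below assumes the little-endian GGUF layout (a 4-byte `GGUF` magic, a `uint32` version, then two `uint64` counts). The function name `read_gguf_header` and the synthetic header values are hypothetical and used only for illustration; they are not taken from this repository.

```python
import io
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"
# magic (4 bytes), uint32 version, uint64 tensor count, uint64 metadata KV count
HEADER_FORMAT = "<4sIQQ"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Read the fixed-size GGUF header from a binary stream (little-endian assumed)."""
    raw = stream.read(HEADER_SIZE)
    if len(raw) < HEADER_SIZE:
        raise ValueError("stream is too short to contain a GGUF header")
    magic, version, tensor_count, kv_count = struct.unpack(HEADER_FORMAT, raw)
    if magic != GGUF_MAGIC:
        raise ValueError(f"not a GGUF file (magic was {magic!r})")
    return GGUFHeader(version, tensor_count, kv_count)


# Synthetic header with made-up counts, standing in for a real model file.
fake_file = io.BytesIO(struct.pack(HEADER_FORMAT, GGUF_MAGIC, 3, 2, 5))
header = read_gguf_header(fake_file)
print(f"GGUF version: {header.version}")
```

For a real file, the same function could be called on a stream opened with `open(path, "rb")`, where `path` points to one of the downloaded `.gguf` files; a GGUF-v3 file should report version 3.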

Performance

Metric: perplexity (PPL); lower is better.

| Quant | original | imatrix (-im) |
| --- | --- | --- |
| Q2_K | 12.7790 +/- 0.17943 | 13.8057 +/- 0.19614 |
| Q3_K | 10.0834 +/- 0.14063 | 9.6355 +/- 0.13483 |
| Q4_0 | 9.7072 +/- 0.13563 | - |
| Q4_K | 9.2864 +/- 0.13001 | 9.2097 +/- 0.12874 |
| Q5_0 | 9.2062 +/- 0.12846 | - |
| Q5_K | 9.0912 +/- 0.12705 | 9.0701 +/- 0.12668 |
| Q6_K | 9.0799 +/- 0.12681 | 9.0558 +/- 0.12653 |
| Q8_0 | 9.0200 +/- 0.12616 | - |
| F16 | 9.0142 +/- 0.12603 | - |
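
For readers unfamiliar with the metric, perplexity is the exponential of the mean negative log-likelihood per token. The sketch below illustrates that standard definition only; the evaluation corpus and settings behind the numbers above are not stated here, and the log-probabilities in the example are made up.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Return exp(mean negative log-likelihood) for natural-log token probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(mean_nll)


# Hypothetical per-token probabilities, for illustration only.
example_logprobs = [math.log(0.25), math.log(0.5), math.log(0.125)]
print(perplexity(example_logprobs))
```

A model that assigns higher probability to the reference tokens gets a lower perplexity, which is why lower values in the table indicate less quality loss from quantization.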

The models with the -im suffix are generated with an importance matrix, which generally improves performance, though not always: in the results above, Q2_K has a higher PPL with the importance matrix than without it.
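
To make the comparison concrete, the following sketch computes the PPL change for each quantization type that has both variants, using the mean values reported above. The variable names are illustrative, and the comparison ignores the reported +/- uncertainties, so small differences should be read with caution.

```python
# Mean PPL values copied from the results above: (original, imatrix).
ppl = {
    "Q2_K": (12.7790, 13.8057),
    "Q3_K": (10.0834, 9.6355),
    "Q4_K": (9.2864, 9.2097),
    "Q5_K": (9.0912, 9.0701),
    "Q6_K": (9.0799, 9.0558),
}

# Negative delta means the imatrix variant has lower (better) PPL.
deltas = {quant: imatrix - original for quant, (original, imatrix) in ppl.items()}

improved = [quant for quant, delta in deltas.items() if delta < 0]
worse = [quant for quant, delta in deltas.items() if delta > 0]

for quant, delta in deltas.items():
    original = ppl[quant][0]
    print(f"{quant}: {delta:+.4f} PPL ({100 * delta / original:+.2f}%)")

print("improved with imatrix:", improved)
print("worse with imatrix:", worse)
```

On these numbers, the importance matrix helps for Q3_K through Q6_K and hurts only for Q2_K.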

Others

For the Hugging Face version, please see: https://huggingface.co/hfl/chinese-alpaca-2-13b-16k

Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.

GGUF model size: 13.3B params
Architecture: llama
