# Chinese-LLaMA-2-7B-GGUF
This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-LLaMA-2-7B.
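As a minimal usage sketch (assuming the `llama-cpp-python` package is installed and that a quantized file such as `ggml-model-q4_k.gguf` has been downloaded from this repository; the exact filename may differ):

```python
from llama_cpp import Llama

# Load a quantized GGUF file from this repo (the filename here is an
# assumption; substitute the file you actually downloaded).
llm = Llama(
    model_path="./ggml-model-q4_k.gguf",
    n_ctx=4096,    # context window
    n_threads=8,   # adjust to your CPU
)

# Chinese-LLaMA-2 is a base model, so plain text completion
# (rather than a chat template) is the natural usage.
output = llm("请介绍一下北京的名胜古迹：", max_tokens=128)
print(output["choices"][0]["text"])
```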
## Performance
Metric: PPL, lower is better
| Quant | original | imatrix (`-im`) |
|---|---|---|
| Q2_K | 15.1160 +/- 0.30469 | 12.7682 +/- 0.26022 |
| Q3_K | 9.9588 +/- 0.20549 | 9.8508 +/- 0.20484 |
| Q4_0 | 9.8085 +/- 0.20350 | - |
| Q4_K | 9.5802 +/- 0.20015 | 9.6327 +/- 0.20219 |
| Q5_0 | 9.4783 +/- 0.19622 | - |
| Q5_K | 9.5132 +/- 0.19989 | 9.4447 +/- 0.19772 |
| Q6_K | 9.4640 +/- 0.19909 | 9.4507 +/- 0.19849 |
| Q8_0 | 9.4659 +/- 0.19927 | - |
| F16 | 9.4627 +/- 0.19921 | - |
Models with the `-im` suffix are quantized with an importance matrix, which generally (though not always) yields better performance.
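For reference, PPL (perplexity) is the exponentiated average negative log-likelihood of the evaluation text under the model (the standard definition; llama.cpp's perplexity tool reports values of this form computed over fixed-length chunks):

$$\mathrm{PPL} = \exp\left(-\frac{1}{N}\sum_{i=1}^{N}\log p\left(x_i \mid x_{<i}\right)\right)$$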
## Others
For the Hugging Face version, please see: https://huggingface.co/hfl/chinese-llama-2-7b
Please refer to https://github.com/ymcui/Chinese-LLaMA-Alpaca-2/ for more details.