Chinese-Mixtral-Instruct-GGUF

Chinese Mixtral GitHub repository: https://github.com/ymcui/Chinese-Mixtral

This repository contains the GGUF-v3 models (llama.cpp compatible) for Chinese-Mixtral-Instruct (chat/instruction model).

Note: When using the instruction/chat model, you MUST follow the official prompt template. Example: chat.sh

Performance

Metric: perplexity (PPL); lower is better.

| Quant | Size | ↓ PPL |
| --- | --- | --- |
| IQ1_S | 9.8 GB | 9.5782 +/- 0.08909 |
| IQ1_M | 10.8 GB | 7.4666 +/- 0.06741 |
| IQ2_XXS | 12.3 GB | 6.3923 +/- 0.05674 |
| IQ2_XS | 13.7 GB | 6.0606 +/- 0.05834 |
| IQ2_S | 14.1 GB | 4.7617 +/- 0.04177 |
| IQ2_M | 15.5 GB | 4.5911 +/- 0.04054 |
| Q2_K | 17.3 GB | 4.8592 +/- 0.04303 |
| IQ3_XXS | 18.3 GB | 4.3557 +/- 0.03846 |
| IQ3_XS | 19.3 GB | 4.3328 +/- 0.03779 |
| IQ3_S | 20.4 GB | 4.3138 +/- 0.03785 |
| IQ3_M | 21.4 GB | 4.3024 +/- 0.03775 |
| Q3_K | 22.5 GB | 4.4334 +/- 0.03937 |
| IQ4_XS | 25.1 GB | 4.2324 +/- 0.03757 |
| Q4_0 | 26.4 GB | 4.2688 +/- 0.03787 |
| IQ4_NL | 26.5 GB | 4.2384 +/- 0.03763 |
| Q4_K | 28.4 GB | 4.2433 +/- 0.03768 |
| Q5_0 | 32.2 GB | 4.2142 +/- 0.03733 |
| Q5_K | 33.2 GB | 4.2177 +/- 0.03743 |
| Q6_K | 38.4 GB | 4.2184 +/- 0.03754 |
| Q8_0 | 49.6 GB | 4.2053 +/- 0.03732 |
| F16 | 93.5 GB | x |
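
The size/PPL trade-off above can be explored programmatically. The following is a minimal sketch, not part of the official tooling: the helper name `best_quant` and the idea of a "size budget" are assumptions made for illustration. The budget is compared against file size only; actual memory use at inference time is higher (context cache, runtime overhead), so treat the result as a starting point. F16 is omitted because no PPL is reported for it.

```python
# (quant name, file size in GB, perplexity) copied from the table above.
QUANTS = [
    ("IQ1_S", 9.8, 9.5782), ("IQ1_M", 10.8, 7.4666),
    ("IQ2_XXS", 12.3, 6.3923), ("IQ2_XS", 13.7, 6.0606),
    ("IQ2_S", 14.1, 4.7617), ("IQ2_M", 15.5, 4.5911),
    ("Q2_K", 17.3, 4.8592), ("IQ3_XXS", 18.3, 4.3557),
    ("IQ3_XS", 19.3, 4.3328), ("IQ3_S", 20.4, 4.3138),
    ("IQ3_M", 21.4, 4.3024), ("Q3_K", 22.5, 4.4334),
    ("IQ4_XS", 25.1, 4.2324), ("Q4_0", 26.4, 4.2688),
    ("IQ4_NL", 26.5, 4.2384), ("Q4_K", 28.4, 4.2433),
    ("Q5_0", 32.2, 4.2142), ("Q5_K", 33.2, 4.2177),
    ("Q6_K", 38.4, 4.2184), ("Q8_0", 49.6, 4.2053),
]


def best_quant(max_size_gb):
    """Return the lowest-PPL (name, size_gb, ppl) entry whose file size
    fits within max_size_gb, or None if nothing fits.

    Hypothetical helper for illustration only.
    """
    candidates = [q for q in QUANTS if q[1] <= max_size_gb]
    if not candidates:
        return None
    return min(candidates, key=lambda q: q[2])


if __name__ == "__main__":
    for budget in (16.0, 24.0, 32.0):
        print(budget, best_quant(budget))
```

Note that a larger file is not always better: Q2_K (17.3 GB) has a higher PPL than the smaller IQ2_M (15.5 GB), which is why the sketch minimizes PPL rather than simply picking the largest file that fits.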

Due to file size limits, the F16 model is split into multiple parts. Use the cat command to concatenate all parts into a single file. The parts must be concatenated in order.
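
Where `cat` is not available (for example on Windows), the same in-order concatenation can be sketched in Python. This is an illustrative sketch only: the function names, the glob pattern, and the file names in the usage comment are hypothetical, so check the actual part names in the repository's file list. Lexicographic sorting matches the intended order only if the part suffixes sort correctly as strings (for example `-aa`, `-ab` or zero-padded numbers).

```python
import shutil
from pathlib import Path


def find_parts(directory, pattern):
    """Return part files matching a glob pattern, sorted by name.

    The pattern is supplied by the caller; the real part names are
    not stated here and must be taken from the repository.
    """
    return sorted(Path(directory).glob(pattern))


def concat_parts(part_paths, out_path):
    """Write the parts to out_path, byte for byte, in the order given."""
    with open(out_path, "wb") as out:
        for part in part_paths:
            with open(part, "rb") as src:
                # Stream in chunks so large parts never sit fully in memory.
                shutil.copyfileobj(src, out)


# Example usage (hypothetical file names):
#   parts = find_parts(".", "model-f16.gguf.part-*")
#   concat_parts(parts, "model-f16.gguf")
```

Before loading the result, it is worth confirming that the output file size equals the sum of the part sizes.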

Others

For the Hugging Face version, see: https://huggingface.co/hfl/chinese-mixtral-instruct

Please refer to https://github.com/ymcui/Chinese-Mixtral/ for more details.

Citation

Please consider citing our paper if you use the resources in this repository. Paper link: https://arxiv.org/abs/2403.01851

@article{chinese-mixtral,
      title={Rethinking LLM Language Adaptation: A Case Study on Chinese Mixtral}, 
      author={Cui, Yiming and Yao, Xin},
      journal={arXiv preprint arXiv:2403.01851},
      url={https://arxiv.org/abs/2403.01851},
      year={2024}
}
