Chinese-Alpaca-Plus-13B-GPTQ

This repository contains a 4-bit GPTQ-quantised version of Chinese-Alpaca-Plus 13B from Yiming Cui's Chinese-LLaMA-Alpaca project.

It was produced by quantising the merged model to 4 bits with GPTQ-for-LLaMa.

Model Details

Model Description

Developed by: ymcui (Yiming Cui)
Shared by: Known Rabbit
Language(s) (NLP): Chinese, English
License: Apache 2.0
Finetuned from model: LLaMA

Original GitHub project: ymcui/Chinese-LLaMA-Alpaca (Chinese LLaMA & Alpaca large language models, with local CPU/GPU deployment)

To promote open research on large language models in the Chinese NLP community, this project open-sourced the Chinese LLaMA model and the instruction-tuned Chinese Alpaca model. Building on the original LLaMA, these models extend the vocabulary with Chinese tokens and continue pre-training on Chinese data, which further improves basic semantic understanding of Chinese. The Chinese Alpaca model is additionally fine-tuned on Chinese instruction data, which significantly improves its ability to understand and follow instructions. For details, see the technical report (Cui, Yang, and Yao, 2023).

Model Sources

Repository: https://github.com/ymcui/Chinese-LLaMA-Alpaca
Paper: Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca (arXiv:2304.08177)

Uses

Direct Use

How to easily download and use this model in text-generation-webui

Open the text-generation-webui UI as normal.
Click the Model tab.
Under Download custom model or LoRA, enter rabitt/Chinese-Alpaca-Plus-13B-GPTQ.
Click Download.
Wait until it says it's finished downloading.
Click the Refresh icon next to Model in the top left.
In the Model drop-down: choose the model you just downloaded, Chinese-Alpaca-Plus-13B-GPTQ.
If you see an error like Error no file named pytorch_model.bin ... in the bottom right, ignore it - it's temporary.
Fill out the GPTQ parameters on the right: Bits = 4, Groupsize = 128, model_type = Llama
Click Save settings for this model in the top right.
Click Reload the Model in the top right.
Once it says it's loaded, click the Text Generation tab and enter a prompt!
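
As an alternative to the Download button, the same repository can be fetched from a script. The following is a minimal sketch, assuming the `huggingface_hub` package is installed; the `models` folder and the underscore-joined folder name are illustrative assumptions, not something this model card prescribes.

```python
from pathlib import Path

# Repository name taken from the download step above.
REPO_ID = "rabitt/Chinese-Alpaca-Plus-13B-GPTQ"


def default_target(models_dir: str = "models") -> Path:
    """Return a hypothetical local folder for the model inside `models_dir`."""
    # Assumption: the slash is replaced so the repo maps to one folder name.
    return Path(models_dir) / REPO_ID.replace("/", "_")


def download_model(models_dir: str = "models") -> Path:
    """Download every file of the repository into the target folder."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download

    target = default_target(models_dir)
    snapshot_download(repo_id=REPO_ID, local_dir=str(target))
    return target


# Usage (requires network access and several GB of disk space):
#     download_model("models")
```

After the download finishes, the GPTQ parameters still need to be set in the UI as described in the steps above.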

Training Details

Training Procedure

Download the models from the following links:
- Original LLaMA: https://github.com/facebookresearch/llama/pull/73
- Chinese-LLaMA-Plus-13B
  - ziqingyang/chinese-llama-plus-lora-13b (Hugging Face)
  - chinese_llama_plus_lora_13b.zip (Baidu Netdisk)
- Chinese-Alpaca-Plus-13B
  - ziqingyang/chinese-alpaca-plus-lora-13b (Hugging Face)
  - chinese_alpaca_plus_lora_13b.zip (Baidu Netdisk)

Convert LLaMA to HuggingFace (HF) format with convert_llama_weights_to_hf.py

# Fetch the conversion script from the transformers repository
wget https://github.com/huggingface/transformers/raw/main/src/transformers/models/llama/convert_llama_weights_to_hf.py
# Convert the original 13B checkpoint in ./llama to HF format
PROTOCOL_BUFFERS_PYTHON_IMPLEMENTATION=python \
python convert_llama_weights_to_hf.py \
    --input_dir ./llama \
    --model_size 13B \
    --output_dir ./llama-13b-hf

Merge the Chinese-LLaMA-Plus-13B and Chinese-Alpaca-Plus-13B LoRA weights into the converted LLaMA model with merge_llama_with_chinese_lora.py

# Fetch the merge script from the Chinese-LLaMA-Alpaca repository
wget https://github.com/ymcui/Chinese-LLaMA-Alpaca/raw/main/scripts/merge_llama_with_chinese_lora.py
# Apply both LoRA checkpoints to the converted base model and save in HF format
python merge_llama_with_chinese_lora.py \
    --base_model ./llama-13b-hf \
    --lora_model ./Chinese-LLaMA-Plus-LoRA-13B,./Chinese-Alpaca-Plus-LoRA-13B \
    --output_type huggingface \
    --output_dir ./Chinese-Alpaca-Plus-13B
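
To illustrate what merging a LoRA checkpoint into a base model means, here is a toy sketch of the standard LoRA update, W' = W + (alpha / r) · B · A, in plain Python. It is not taken from merge_llama_with_chinese_lora.py. The function names and matrix values are made up for illustration, and details the real script has to handle (such as the extended Chinese vocabulary) are left out.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, alpha, rank):
    """Return weight + (alpha / rank) * (lora_b @ lora_a)."""
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)  # shape (out, in), same as weight
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Toy 2x2 base weight with a rank-1 adapter (all values are made up).
base = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (rank, in)  = (1, 2)
lora_b = [[0.5], [-0.5]]   # shape (out, rank) = (2, 1)

merged = merge_lora(base, lora_a, lora_b, alpha=2, rank=1)
print(merged)
```

Once the update is folded into the weights, the merged model can be saved and used without the adapter files, which is what allows it to be quantised as a single model in the next step.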

Quantise the model with GPTQ-for-LLaMa

mkdir -p Chinese-Alpaca-Plus-13B-GPTQ
git clone https://github.com/qwopqwop200/GPTQ-for-LLaMa.git
cd GPTQ-for-LLaMa
# Optional: select the GPU to use
# export CUDA_VISIBLE_DEVICES=0
# Quantise to 4 bits (group size 128) using c4 as the calibration dataset
python llama.py ../Chinese-Alpaca-Plus-13B c4 --wbits 4 --true-sequential --act-order --groupsize 128 --save_safetensors ../Chinese-Alpaca-Plus-13B-GPTQ/Chinese-Alpaca-Plus-13B-GPTQ-4bit-128g.safetensors
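
To give a feel for what `--wbits 4` and `--groupsize 128` control, the sketch below quantises a list of weights in groups, each with its own scale and offset. It uses plain round-to-nearest quantisation, which is much simpler than the GPTQ algorithm and is not how GPTQ-for-LLaMa chooses its codes. The storage estimate assumes one 16-bit scale and one 4-bit zero point per group and ignores any other metadata. All function names are hypothetical.

```python
def quantize_group(values, bits=4):
    """Asymmetric round-to-nearest quantisation of one group of weights."""
    levels = 2 ** bits - 1                 # 15, so 4-bit codes run from 0 to 15
    lo, hi = min(values), max(values)
    scale = (hi - lo) / levels or 1.0      # avoid a zero scale for constant groups
    codes = [round((v - lo) / scale) for v in values]
    return codes, scale, lo


def dequantize_group(codes, scale, lo):
    """Map integer codes back to approximate floating-point weights."""
    return [lo + c * scale for c in codes]


def quantize(weights, bits=4, groupsize=128):
    """Quantise a flat list of weights in consecutive groups of `groupsize`."""
    return [quantize_group(weights[i:i + groupsize], bits)
            for i in range(0, len(weights), groupsize)]


def bits_per_weight(bits=4, groupsize=128, scale_bits=16, zero_bits=4):
    """Average storage per weight, assuming one scale and one zero point per group."""
    return bits + (scale_bits + zero_bits) / groupsize


weights = [i / 255 for i in range(256)]    # toy data, not real model weights
groups = quantize(weights)
restored = [w for g in groups for w in dequantize_group(*g)]
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(len(groups), max_error, bits_per_weight())
```

A smaller group size gives each scale fewer weights to cover, which tends to reduce the rounding error at the cost of more per-group overhead.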

Citation

BibTeX:

@article{chinese-llama-alpaca,
      title={Efficient and Effective Text Encoding for Chinese LLaMA and Alpaca}, 
      author={Cui, Yiming and Yang, Ziqing and Yao, Xin},
      journal={arXiv preprint arXiv:2304.08177},
      url={https://arxiv.org/abs/2304.08177},
      year={2023}
}