	---
	license: openrail
	language:
	- en
	tags:
	- text-generation-inference
	pipeline_tag: text-generation
	library_name: transformers
	---


	## Original model card

	Buy me a coffee if you like this project ;)
	<a href="https://www.buymeacoffee.com/s3nh"><img src="https://www.buymeacoffee.com/assets/img/guidelines/download-assets-sm-1.svg" alt=""></a>

	#### Description

	GGML-format model files for [AlpachinoNLP/Baichuan-13B-Instruction](https://huggingface.co/AlpachinoNLP/Baichuan-13B-Instruction/).


	### Inference


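	```python
	from ctransformers import AutoModelForCausalLM

	# Placeholders: set these to your local model directory and GGML file name.
	output_dir = "<path-to-model-directory>"
	ggml_file = "<ggml-file-name>"

	# model_file must be passed by keyword; gpu_layers offloads that many layers to the GPU.
	llm = AutoModelForCausalLM.from_pretrained(
	    output_dir, model_file=ggml_file, model_type="llama", gpu_layers=32
	)

	manual_input: str = "Tell me about your last dream, please."

	# Calling the model returns the generated text as a string.
	print(llm(manual_input, max_new_tokens=256, temperature=0.9, top_p=0.7))
	```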



	# Original model card




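	## Usage

	The following example runs a conversation with Baichuan-13B-Chat. The expected answer, translated from Chinese, is: "Qogir. The world's second-highest peak, Qogir, which Western climbers call K2, stands 8,611 meters high on the China-Pakistan border in the Karakoram range."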
	```python
	import torch
	from transformers import AutoModelForCausalLM, AutoTokenizer
	from transformers.generation.utils import GenerationConfig
	tokenizer = AutoTokenizer.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", use_fast=False, trust_remote_code=True)
	model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", device_map="auto", torch_dtype=torch.float16, trust_remote_code=True)
	model.generation_config = GenerationConfig.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction")
	messages = []
	# Prompt: "Which is the second-highest mountain in the world?"
	messages.append({"role": "Human", "content": "世界上第二高的山峰是哪座"})
	response = model.chat(tokenizer, messages)
	print(response)
	```

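	## Quantized deployment

	Baichuan-13B supports int8 and int4 quantization; only two lines of the inference code need to change. Note that if you quantize to save GPU memory, you should first load the full-precision model onto the CPU and only then quantize it. Avoid passing `device_map='auto'` to `from_pretrained`, or any other argument that would load the full-precision model directly onto the GPU.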

	To use int8 quantization:
	```python
	import torch
	from transformers import AutoModelForCausalLM
	model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
	model = model.quantize(8).cuda()
	```

	Similarly, to use int4 quantization:
	```python
	import torch
	from transformers import AutoModelForCausalLM
	model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
	model = model.quantize(4).cuda()
	```
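
To make the memory argument above concrete, here is a rough estimate of how much space the weights alone occupy at each precision. It uses the total parameter count listed in this card (13,264,901,120). It is a minimal sketch that assumes only the weights matter: activations, the KV cache, and any scale factors a quantizer stores are ignored, so real usage will be higher.

```python
# Rough weight-memory estimate for Baichuan-13B at different precisions.
# Assumption: only the weights are counted; activations, the KV cache, and
# any extra scale factors stored by a quantizer are ignored.

TOTAL_PARAMS = 13_264_901_120  # total parameter count from this model card


def weight_memory_gib(num_params: int, bits_per_param: int) -> float:
    """Return the size of the weights in GiB for the given bit width."""
    total_bytes = num_params * bits_per_param / 8
    return total_bytes / 2**30


if __name__ == "__main__":
    for label, bits in [("float16", 16), ("int8", 8), ("int4", 4)]:
        print(f"{label:>7}: {weight_memory_gib(TOTAL_PARAMS, bits):.1f} GiB")
```

Under these assumptions the weights take roughly 24.7 GiB in float16, 12.4 GiB in int8, and 6.2 GiB in int4. This is why the full-precision model should be loaded on the CPU before quantizing when GPU memory is tight.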

	## Model details


	### Model architecture


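	The model is based on Baichuan-13B. For better inference performance, Baichuan-13B uses ALiBi linear biases, which require less computation than Rotary Embedding and noticeably improve inference performance. Compared with the standard LLaMA-13B, the measured average inference speed (tokens/s) when generating 2,000 tokens is 31.6% higher: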

	| Model        | tokens/s |
	| ------------ | -------- |
	| LLaMA-13B    | 19.4     |
	| Baichuan-13B | 25.4     |
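
For readers unfamiliar with ALiBi, the sketch below shows the idea: instead of rotating queries and keys, each attention head adds a fixed penalty to its attention scores that grows linearly with the distance between query and key. The slopes follow the geometric scheme described in the ALiBi paper. Whether Baichuan-13B's own implementation matches this sketch exactly is an assumption, and the function names here are illustrative only.

```python
import math


def alibi_slopes(num_heads: int) -> list[float]:
    """Per-head slopes following the geometric scheme from the ALiBi paper."""

    def power_of_two_slopes(n: int) -> list[float]:
        # For n = 8 this gives 1/2, 1/4, ..., 1/256.
        start = 2.0 ** (-8.0 / n)
        return [start ** (i + 1) for i in range(n)]

    if (num_heads & (num_heads - 1)) == 0:  # head count is a power of two
        return power_of_two_slopes(num_heads)
    # Otherwise: use the slopes for the closest smaller power of two, then
    # fill the remaining heads with every other slope of the next power of two.
    closest = 2 ** math.floor(math.log2(num_heads))
    extra = power_of_two_slopes(2 * closest)[0::2][: num_heads - closest]
    return power_of_two_slopes(closest) + extra


def alibi_bias(slope: float, seq_len: int) -> list[list[float]]:
    """Bias added to one head's attention scores: slope * (key_pos - query_pos).

    Future positions (key_pos > query_pos) are set to -inf, as in causal attention.
    """
    return [
        [slope * (k - q) if k <= q else float("-inf") for k in range(seq_len)]
        for q in range(seq_len)
    ]


if __name__ == "__main__":
    slopes = alibi_slopes(40)  # Baichuan-13B has 40 heads per this card
    print(len(slopes), slopes[0])
    print(alibi_bias(0.5, 3))
```

Because the bias depends only on positions and a per-head constant, it can be computed once and added to the scores, with no per-token rotation of queries and keys.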

	The detailed parameters are listed in the table below:

	| Model        | Hidden size | Layers | Heads | Vocab size | Total parameters | Training data (tokens) | Position embedding                        | Max length |
	| ------------ | ----------- | ------ | ----- | ---------- | ---------------- | ---------------------- | ----------------------------------------- | ---------- |
	| Baichuan-7B  | 4,096       | 32     | 32    | 64,000     | 7,000,559,616    | 1.2 trillion           | [RoPE](https://arxiv.org/abs/2104.09864)  | 4,096      |
	| Baichuan-13B | 5,120       | 40     | 40    | 64,000     | 13,264,901,120   | 1.4 trillion           | [ALiBi](https://arxiv.org/abs/2108.12409) | 4,096      |
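
The totals in the table can be reproduced with simple arithmetic. The sketch below assumes a bias-free, LLaMA-style decoder with separate input-embedding and output-head matrices, a three-matrix gated MLP, and two normalization weight vectors per layer plus a final one. The MLP intermediate sizes (11,008 and 13,696) are not listed in this card; they are assumptions here, chosen to match the published model configurations.

```python
def decoder_param_count(hidden: int, layers: int, vocab: int, intermediate: int) -> int:
    """Count the weights of a bias-free, LLaMA-style decoder with untied embeddings."""
    embeddings = 2 * vocab * hidden   # input embedding + output head
    attention = 4 * hidden * hidden   # Q, K, V and output projections
    mlp = 3 * hidden * intermediate   # gate, up and down projections
    norms = 2 * hidden                # two norm weight vectors per layer
    final_norm = hidden               # norm applied after the last layer
    return embeddings + layers * (attention + mlp + norms) + final_norm


# The intermediate sizes below are assumptions, not values from this card.
BAICHUAN_7B = decoder_param_count(hidden=4096, layers=32, vocab=64_000, intermediate=11_008)
BAICHUAN_13B = decoder_param_count(hidden=5120, layers=40, vocab=64_000, intermediate=13_696)

print(BAICHUAN_7B)   # 7000559616
print(BAICHUAN_13B)  # 13264901120
```

Both results agree with the "Total parameters" column, which suggests the assumed layout is a reasonable description of where the parameters sit. Note that position embeddings contribute nothing, since neither RoPE nor ALiBi has learned weights.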

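	## Training details

	The dataset consists of three main parts:

	* 13k high-quality samples filtered from the [sharegpt_zh](https://huggingface.co/datasets/QingyiSi/Alpaca-CoT/tree/main/ShareGPT) dataset.
	* [lima](https://huggingface.co/datasets/GAIR/lima)
	* 2.3k high-quality Chinese samples selected by task type, with about 100 samples per task type.

	Hardware: 8×A40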

	## Evaluation results

	### [CMMLU](https://github.com/haonan-li/CMMLU)

	| Model 5-shot             | STEM  | Humanities | Social Sciences | Others | China Specific | Average |
	| ------------------------ | :---: | :--------: | :-------------: | :----: | :------------: | :-----: |
	| Baichuan-7B              | 34.4  | 47.5       | 47.6            | 46.6   | 44.3           | 44.0    |
	| Vicuna-13B               | 31.8  | 36.2       | 37.6            | 39.5   | 34.3           | 36.3    |
	| Chinese-Alpaca-Plus-13B  | 29.8  | 33.4       | 33.2            | 37.9   | 32.1           | 33.4    |
	| Chinese-LLaMA-Plus-13B   | 28.1  | 33.1       | 35.4            | 35.1   | 33.5           | 33.0    |
	| Ziya-LLaMA-13B-Pretrain  | 29.0  | 30.7       | 33.8            | 34.4   | 31.9           | 32.1    |
	| LLaMA-13B                | 29.2  | 30.8       | 31.6            | 33.0   | 30.5           | 31.2    |
	| moss-moon-003-base (16B) | 27.2  | 30.4       | 28.8            | 32.6   | 28.7           | 29.6    |
	| Baichuan-13B-Base        | 41.7  | 61.1       | 59.8            | 59.0   | 56.4           | 55.3    |
	| Baichuan-13B-Chat        | 42.8  | 62.6       | 59.7            | 59.0   | 56.1           | 55.8    |
	| Baichuan-13B-Instruction | 44.50 | 61.16      | 59.07           | 58.34  | 55.55          | 55.61   |

	| Model zero-shot | STEM | Humanities | Social Sciences | Others | China Specific | Average |
	| ------------------------------------------------------------ | :-------: | :--------: | :-------------: | :-------: | :------------: | :-------: |
	| [ChatGLM2-6B](https://huggingface.co/THUDM/chatglm2-6b) | 41.28 | 52.85 | 53.37 | 52.24 | 50.58 | 49.95 |
	| [Baichuan-7B](https://github.com/baichuan-inc/baichuan-7B) | 32.79 | 44.43 | 46.78 | 44.79 | 43.11 | 42.33 |
	| [ChatGLM-6B](https://github.com/THUDM/GLM-130B) | 32.22 | 42.91 | 44.81 | 42.60 | 41.93 | 40.79 |
	| [BatGPT-15B](https://arxiv.org/abs/2307.00360) | 33.72 | 36.53 | 38.07 | 46.94 | 38.32 | 38.51 |
	| [Chinese-LLaMA-13B](https://github.com/ymcui/Chinese-LLaMA-Alpaca) | 26.76 | 26.57 | 27.42 | 28.33 | 26.73 | 27.34 |
	| [MOSS-SFT-16B](https://github.com/OpenLMLab/MOSS) | 25.68 | 26.35 | 27.21 | 27.92 | 26.70 | 26.88 |
	| [Chinese-GLM-10B](https://github.com/THUDM/GLM) | 25.57 | 25.01 | 26.33 | 25.94 | 25.81 | 25.80 |
	| [Baichuan-13B](https://github.com/baichuan-inc/Baichuan-13B) | 42.04 | 60.49 | 59.55 | 56.60 | 55.72 | 54.63 |
	| [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B) | 37.32 | 56.24 | 54.79 | 54.07 | 52.23 | 50.48 |
	| Baichuan-13B-Instruction | 42.56 | 62.09 | 60.41 | 58.97 | 56.95 | 55.88 |

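	> Note: CMMLU is a comprehensive Chinese evaluation benchmark designed specifically to assess the knowledge and reasoning abilities of language models in a Chinese context. We evaluate the models directly with its official [evaluation script](https://github.com/haonan-li/CMMLU). In the zero-shot table, the score for [Baichuan-13B-Chat](https://github.com/baichuan-inc/Baichuan-13B) comes from our own run of the official CMMLU evaluation script; the other models' scores come from the official [CMMLU](https://github.com/haonan-li/CMMLU/tree/master) results. In the 5-shot table, the other models' scores come from the official [Baichuan-13B](https://github.com/baichuan-inc/Baichuan-13B) results.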