Redflashing committed e114a8f (parent: e36ebf2): Update README.md

README.md (CHANGED):
---
language:
- en
pipeline_tag: text-generation
inference: false
---

# Baichuan-13B-Instruction

![](./alpachino.png)

<!-- Provide a quick summary of what the model is/does. -->

## Introduction

Baichuan-13B-Instruction is the instruction-tuned version of the Baichuan-13B series; the pretrained model is available as [Baichuan-13B-Base](https://huggingface.co/baichuan-inc/Baichuan-13B-Base).

## Usage

The following is an example conversation with the model. The correct output is "乔戈里峰。世界第二高峰———乔戈里峰西方登山者称其为k2峰,海拔高度是8611米,位于喀喇昆仑山脉的中巴边境上" (roughly: "K2. The world's second-highest peak, called K2 by Western climbers, is 8,611 m high and sits on the China-Pakistan border in the Karakoram range.").

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
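from transformers import GenerationConfig

# The diff elides the rest of this snippet; the lines below are a plausible
# reconstruction following the usual Baichuan-13B chat recipe (ending in the
# print(response) visible in the diff context). The prompt string and the
# generation-config setup are assumptions, not part of the original.
tokenizer = AutoTokenizer.from_pretrained(
    "AlpachinoNLP/Baichuan-13B-Instruction", use_fast=False, trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "AlpachinoNLP/Baichuan-13B-Instruction",
    torch_dtype=torch.float16,
    trust_remote_code=True,
).cuda()
model.generation_config = GenerationConfig.from_pretrained(
    "AlpachinoNLP/Baichuan-13B-Instruction"
)
messages = [{"role": "user", "content": "世界上第二高的山峰是哪座?"}]  # "Which is the world's second-highest mountain?"
response = model.chat(tokenizer, messages)  # chat() is provided by the model's remote code
print(response)
```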

Baichuan-13B supports int8 and int4 quantization, which takes only a two-line change to the inference code. Note that if you quantize in order to save GPU memory, you should load the original-precision model onto the CPU before quantizing; avoid passing `device_map='auto'` (or anything else that loads the full-precision model directly onto the GPU) to `from_pretrained`.

To use int8 quantization:

```python
model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
model = model.quantize(8).cuda()
```

Similarly, to use int4 quantization:

```python
model = AutoModelForCausalLM.from_pretrained("AlpachinoNLP/Baichuan-13B-Instruction", torch_dtype=torch.float16, trust_remote_code=True)
model = model.quantize(4).cuda()
```
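
The quantized model is then used exactly like the full-precision one. As a minimal sketch, assuming the tokenizer setup and the `chat` helper from the conversation example above:

```python
messages = [{"role": "user", "content": "世界上第二高的山峰是哪座?"}]  # assumed prompt
response = model.chat(tokenizer, messages)  # chat() comes with the model's remote code
print(response)
```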

| Baichuan-13B | 25.4 |

The detailed model parameters are listed in the table below:

| Model | Hidden size | Layers | Heads | Vocab size | Total params | Training data (tokens) | Position embedding | Max length |
| ------------ | ---------- | ---- | ---- | -------- | -------------- | ------------------ | ----------------------------------------- | -------- |
| Baichuan-7B | 4,096 | 32 | 32 | 64,000 | 7,000,559,616 | 1.2 trillion | [RoPE](https://arxiv.org/abs/2104.09864) | 4,096 |
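
As a sanity check, the parameter total in the table can be reproduced from the other columns, assuming the LLaMA-style decoder layout that Baichuan-7B follows (SwiGLU MLP, RMSNorm, untied LM head, no biases); the MLP intermediate size of 11,008 is an assumption carried over from LLaMA-7B:

```python
# Recompute Baichuan-7B's total parameter count from the table's dimensions.
hidden, layers, vocab = 4096, 32, 64000
ffn = 11008  # assumed SwiGLU intermediate size (LLaMA-7B convention)

embed = vocab * hidden      # token embedding matrix
attn = 4 * hidden * hidden  # Q, K, V, O projections, per layer
mlp = 3 * hidden * ffn      # gate, up, down projections, per layer
norms = 2 * hidden          # two RMSNorm weight vectors, per layer
total = embed + layers * (attn + mlp + norms) + hidden + vocab * hidden  # + final norm + LM head
print(f"{total:,}")         # 7,000,559,616, matching the table exactly
```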

## [CMMLU](https://github.com/haonan-li/CMMLU)

| Model 5-shot | STEM | Humanities | Social Sciences | Others | China Specific | Average |
| ---------------------------------------------------------- | :-------: | :--------: | :-------------: | :------: | :------------: | :------: |
| Baichuan-7B | 34.4 | 47.5 | 47.6 | 46.6 | 44.3 | 44.0 |
| Vicuna-13B | 31.8 | 36.2 | 37.6 | 39.5 | 34.3 | 36.3 |
| Chinese-Alpaca-Plus-13B | 29.8 | 33.4 | 33.2 | 37.9 | 32.1 | 33.4 |
| Chinese-LLaMA-Plus-13B | 28.1 | 33.1 | 35.4 | 35.1 | 33.5 | 33.0 |
| Ziya-LLaMA-13B-Pretrain | 29.0 | 30.7 | 33.8 | 34.4 | 31.9 | 32.1 |
| LLaMA-13B | 29.2 | 30.8 | 31.6 | 33.0 | 30.5 | 31.2 |
| moss-moon-003-base (16B) | 27.2 | 30.4 | 28.8 | 32.6 | 28.7 | 29.6 |
| Baichuan-13B-Base | 41.7 | 61.1 | 59.8 | 59.0 | 56.4 | 55.3 |
| Baichuan-13B-Chat | 42.8 | **62.6** | **59.7** | **59.0** | **56.1** | **55.8** |
| **Baichuan-13B-Instruction** | **44.50** | 61.16 | 59.07 | 58.34 | 55.55 | 55.61 |

| Model zero-shot | STEM | Humanities | Social Sciences | Others | China Specific | Average |
| ------------------------------------------------------------ | :-------: | :--------: | :-------------: | :-------: | :------------: | :-------: |