Update README.md

#16
by Qianguo - opened
Files changed (1)
  1. README.md +65 -3
README.md CHANGED
@@ -20,6 +20,7 @@ language:
20
 
21
  姜子牙通用大模型V1是基于LLaMa的130亿参数的大规模预训练模型,具备翻译,编程,文本分类,信息抽取,摘要,文案生成,常识问答和数学计算等能力。目前姜子牙通用大模型已完成大规模预训练、多任务有监督微调和人类反馈学习三阶段的训练过程。
22
 
 
23
  The Ziya-LLaMA-13B-v1 is a large-scale pre-trained model based on LLaMA with 13 billion parameters. It has the ability to perform tasks such as translation, programming, text classification, information extraction, summarization, copywriting, common sense Q&A, and mathematical calculation. The Ziya-LLaMA-13B-v1 has undergone three stages of training: large-scale continual pre-training (PT), multi-task supervised fine-tuning (SFT), and human feedback learning (RM, PPO).
24
 
25
  ## 模型分类 Model Taxonomy
@@ -81,6 +82,20 @@ We implemented the HFT training process on an internally developed framework, wh
81
 
82
  ## 使用 Usage
83
 
84
  ```python3
85
  from transformers import AutoTokenizer
86
  from transformers import LlamaForCausalLM
@@ -88,10 +103,11 @@ import torch
88
 
89
 
90
  device = torch.device("cuda")
 
91
 
92
  query="帮我写一份去西安的旅游计划"
93
- model = LlamaForCausalLM.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1', torch_dtype=torch.float16, device_map="auto")
94
- tokenizer = AutoTokenizer.from_pretrained('IDEA-CCNL/Ziya-LLaMA-13B-v1', use_fast=False)
95
  inputs = '<human>:' + query.strip() + '\n<bot>:'
96
 
97
  input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
@@ -108,7 +124,53 @@ generate_ids = model.generate(
108
  output = tokenizer.batch_decode(generate_ids)[0]
109
  print(output)
110
 
 
 
 
 
 
 
 
 
111
 
112
  ```
113
 
114
  ## 引用 Citation
@@ -137,4 +199,4 @@ You can also cite our [website](https://github.com/IDEA-CCNL/Fengshenbang-LM/):
137
  year={2021},
138
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
139
  }
140
- ```
 
20
 
21
  姜子牙通用大模型V1是基于LLaMa的130亿参数的大规模预训练模型,具备翻译,编程,文本分类,信息抽取,摘要,文案生成,常识问答和数学计算等能力。目前姜子牙通用大模型已完成大规模预训练、多任务有监督微调和人类反馈学习三阶段的训练过程。
22
 
23
+
24
  The Ziya-LLaMA-13B-v1 is a large-scale pre-trained model based on LLaMA with 13 billion parameters. It has the ability to perform tasks such as translation, programming, text classification, information extraction, summarization, copywriting, common sense Q&A, and mathematical calculation. The Ziya-LLaMA-13B-v1 has undergone three stages of training: large-scale continual pre-training (PT), multi-task supervised fine-tuning (SFT), and human feedback learning (RM, PPO).
25
 
26
  ## 模型分类 Model Taxonomy
 
82
 
83
  ## 使用 Usage
84
 
85
+ 由于LLaMA权重的许可限制,该模型不能用于商业用途,请严格遵守LLaMA的使用政策。考虑到LLaMA权重的许可证限制,我们无法直接发布完整的模型权重。因此,我们使用了[FastChat开源工具](https://github.com/lm-sys/FastChat/blob/main/fastchat/model/apply_delta.py)作为基础,并对其进行了进一步的优化。我们计算并发布了Ziya-LLaMA-13B-v1权重与原始LLaMA权重之间的差值。用户可以按照以下步骤操作以获得Ziya-LLaMA-13B-v1完整权重,具体步骤如下:
86
+
87
+ Step 1:获取[LLaMA](https://docs.google.com/forms/d/e/1FAIpQLSfqNECQnMkycAp2jP4Z9TFX0cGR4uf7b_fBxjY_OjhJILlKGA/viewform)权重并转成Hugging Face Transformers模型格式,可参考转换[脚本](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py)(若已经有huggingface权重则跳过)
88
+ ```
89
+ python src/transformers/models/llama/convert_llama_weights_to_hf.py \
90
+ --input_dir /path/to/downloaded/llama/weights --model_size 13B --output_dir /output/path
91
+ ```
92
+
93
+ Step 2:下载Ziya-LLaMA-13B-v1的delta权重以及step 1中转换好的原始LLaMA权重,使用如下脚本转换:https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/utils/apply_delta.py
94
+ ```
95
+ python3 -m apply_delta --base ~/model_weights/llama-13b --target ~/model_weights/Ziya-LLaMA-13B --delta ~/model_weights/Ziya-LLaMA-13B-v1
96
+ ```
97
+
98
+ Step 3: 加载step 2得到的模型推理
99
  ```python3
100
  from transformers import AutoTokenizer
101
  from transformers import LlamaForCausalLM
 
103
 
104
 
105
  device = torch.device("cuda")
106
+ ckpt = '基于delta参数合并后的完整模型权重'
107
 
108
  query="帮我写一份去西安的旅游计划"
109
+ model = LlamaForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16, device_map="auto")
110
+ tokenizer = AutoTokenizer.from_pretrained(ckpt, use_fast=False)
111
  inputs = '<human>:' + query.strip() + '\n<bot>:'
112
 
113
  input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
 
124
  output = tokenizer.batch_decode(generate_ids)[0]
125
  print(output)
126
 
127
+ ```
128
+ NOTE: The LLaMA license prohibits commercial use of this model; please strictly follow LLaMA's usage policy. The same license prevents us from releasing the complete model weights directly. Instead, we adapted and further optimized [the open-source FastChat tool](https://github.com/lm-sys/FastChat/blob/main/fastchat/model/apply_delta.py) to compute the difference (delta) between the Ziya-LLaMA-13B-v1 weights and the original LLaMA weights, and we release that delta. To obtain the complete Ziya-LLaMA-13B-v1 weights, follow these steps:
129
+
130
+ Step 1: Obtain the [LLaMA](https://huggingface.co/docs/transformers/main/en/model_doc/llama#overview) weights and convert them to the Hugging Face Transformers format with the conversion [script](https://github.com/huggingface/transformers/blob/main/src/transformers/models/llama/convert_llama_weights_to_hf.py) (skip this step if you already have the weights in Hugging Face format).
131
+ ```
132
+ python src/transformers/models/llama/convert_llama_weights_to_hf.py \
133
+ --input_dir /path/to/downloaded/llama/weights --model_size 13B --output_dir /output/path
134
+ ```
135
 
136
+ Step 2: Download the Ziya-LLaMA-13B-v1 delta weights from Hugging Face, then merge them with the original LLaMA weights converted in Step 1, using the following script: https://github.com/IDEA-CCNL/Fengshenbang-LM/blob/main/fengshen/utils/apply_delta.py
137
+ ```
138
+ python3 -m apply_delta --base ~/model_weights/llama-13b --target ~/model_weights/Ziya-LLaMA-13B --delta ~/model_weights/Ziya-LLaMA-13B-v1
139
+ ```
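To illustrate what the merge in Step 2 does conceptually, here is a minimal sketch. It assumes the delta is applied as a per-parameter addition (target = base + delta), which is how FastChat's `apply_delta.py` works. The `apply_delta` function and the list-based `StateDict` type below are hypothetical stand-ins, not the Fengshenbang-LM script itself, which operates on real checkpoints and may handle details this sketch ignores.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
# Real checkpoints hold torch tensors; plain lists keep this sketch dependency-free.
StateDict = Dict[str, List[float]]


def apply_delta(base: StateDict, delta: StateDict) -> StateDict:
    """Return target weights where target[name] = base[name] + delta[name]."""
    target: StateDict = {}
    for name, delta_values in delta.items():
        if name not in base:
            raise KeyError(f"parameter {name!r} is missing from the base weights")
        base_values = base[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"size mismatch for parameter {name!r}")
        # Element-wise addition recovers the fine-tuned value of each weight.
        target[name] = [b + d for b, d in zip(base_values, delta_values)]
    return target


# Toy example with made-up numbers (not real model weights).
base = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
delta = {"layer.weight": [0.25, -1.0], "layer.bias": [0.0]}
merged = apply_delta(base, delta)
print(merged)
```

Because only the delta is published, neither file is usable alone: the merged weights exist only after the user supplies their own licensed copy of the base model.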
140
+ Step 3: Load the model obtained in Step 2 for inference.
141
+ ```python3
142
+ from transformers import AutoTokenizer
143
+ from transformers import LlamaForCausalLM
144
+ import torch
145
+
146
+
147
+ device = torch.device("cuda")
148
+ ckpt = '/path/to/merged/Ziya-LLaMA-13B'  # complete weights produced by merging the delta in Step 2
149
+
150
+ query="帮我写一份去西安的旅游计划"
151
+ model = LlamaForCausalLM.from_pretrained(ckpt, torch_dtype=torch.float16, device_map="auto")
152
+ tokenizer = AutoTokenizer.from_pretrained(ckpt, use_fast=False)
153
+ inputs = '<human>:' + query.strip() + '\n<bot>:'
154
+
155
+ input_ids = tokenizer(inputs, return_tensors="pt").input_ids.to(device)
156
+ generate_ids = model.generate(
157
+     input_ids,
158
+     max_new_tokens=1024,
159
+     do_sample=True,
160
+     top_p=0.85,
161
+     temperature=1.0,
162
+     repetition_penalty=1.0,
163
+     eos_token_id=2,
164
+     bos_token_id=1,
165
+     pad_token_id=0)
166
+ output = tokenizer.batch_decode(generate_ids)[0]
167
+ print(output)
168
+
169
+ ```
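The prompt format used above (`<human>:` + query + newline + `<bot>:`) can be wrapped in small helpers. This is a minimal sketch: `build_prompt` and `extract_reply` are hypothetical names, and `extract_reply` assumes the decoded output still contains the prompt and may end with the `</s>` end-of-sequence marker, since `batch_decode` is called without skipping special tokens.

```python
HUMAN_TAG = "<human>:"
BOT_TAG = "<bot>:"
EOS_TEXT = "</s>"  # assumed textual form of the end-of-sequence token


def build_prompt(query: str) -> str:
    """Format a single-turn query the same way as the inference example."""
    return HUMAN_TAG + query.strip() + "\n" + BOT_TAG


def extract_reply(decoded: str) -> str:
    """Return the text after the last '<bot>:' marker, without the EOS marker."""
    reply = decoded.rsplit(BOT_TAG, 1)[-1]
    return reply.replace(EOS_TEXT, "").strip()


prompt = build_prompt("  帮我写一份去西安的旅游计划 ")
print(repr(prompt))
```

With these helpers, `inputs = build_prompt(query)` replaces the string concatenation, and `extract_reply(output)` isolates the model's answer from the decoded text.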
170
+
171
+ ## 软件依赖 Dependencies
172
+ ```
173
+ pip install torch==1.12.1 tokenizers==0.13.3 git+https://github.com/huggingface/transformers
174
  ```
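To confirm that an environment matches the pinned versions above, a check along these lines may help. It is a sketch only: `installed_versions` and `check_pins` are hypothetical helpers, and the comparison ignores local build suffixes such as `+cu113`, which is an assumption about how the installed `torch` version may be reported.

```python
from importlib import metadata
from typing import Dict, Iterable, List, Optional

# Versions pinned in the pip command above (transformers is installed from git, so it is not pinned).
PINNED = {"torch": "1.12.1", "tokenizers": "0.13.3"}


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Look up the installed version of each package, or None if it is absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def check_pins(pinned: Dict[str, str], installed: Dict[str, Optional[str]]) -> List[str]:
    """Return one message per package that is missing or has a different version."""
    problems: List[str] = []
    for name, expected in pinned.items():
        found = installed.get(name)
        if found is None:
            problems.append(f"{name}: not installed (expected {expected})")
        elif found.split("+", 1)[0] != expected:  # drop local suffix like "+cu113"
            problems.append(f"{name}: found {found} (expected {expected})")
    return problems


for message in check_pins(PINNED, installed_versions(PINNED)):
    print(message)
```

An empty result means both pinned packages are present at the expected versions.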
175
 
176
  ## 引用 Citation
 
199
  year={2021},
200
  howpublished={\url{https://github.com/IDEA-CCNL/Fengshenbang-LM}},
201
  }
202
+ ```