ClueAI
/

ChatYuan-7B

ClueAI committed on Jun 2, 2023

Commit

fd2e4d2

1 Parent(s): 1625f3e

Update README.md

Files changed (1)

README.md +43 -9

@@ -1,20 +1,54 @@
 ---
 inference:
   parameters:
-    max_length: 100
     temperature: 0.7
     top_p: 1
 widget:
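-- text: 用户：Write me an English marketing plan for the iPhone\n小元：
-- text: 用户：How long can I stay behind on my credit card bill before they give up trying to collect?\n小元：
-- text: 用户：Help me write a cover letter in English; I am looking for a job as a deep learning engineer\n小元：
-- text: 用户：Help me compute the sum of two numbers, 54+109\n小元：
-- text: 用户：Simulate a conversation between Xiao Li and Xiao Wang about the potential and problems of artificial general intelligence; start with an opening statement, then have both sides discuss\n小元：
-- text: 用户：Generate 5 sentences similar to the following one, "What should I do if my Linux cloud server is infected with a mining virus?"\n小元：
-- text: 用户：Hello\n小元：I am the ChatYuan model developed by Yuanyu Intelligence (元语智能), and I am glad to help you.\n用户：Please introduce yourself.\n小元：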
 language:
 - en
 - zh
 ---
-ChatYuan-7B is a functional conversational large language model that supports both Chinese and English.

 ---
 inference:
   parameters:
+    max_length: 250
     temperature: 0.7
     top_p: 1
 widget:
 language:
 - en
 - zh
 ---
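+ChatYuan-7B is a functional conversational large language model that supports both Chinese and English. It is obtained by continuing to train LLaMA-7B in three stages.
+The three stages are:
+1. Continued pretraining on a general Chinese corpus, covering 50 billion Chinese tokens
+2. Task-oriented instruction fine-tuning on hundreds of task sets
+3. Instruction fine-tuning on a human feedback dataset
+## For more details, see [GitHub](https://github.com/clue-ai/ChatYuan-7B)
+## Usage
+To comply with the LLaMA model license, we release the ChatYuan-7B weights as delta weights. Adding our delta weights to the original LLaMA weights yields the ChatYuan-7B weights.
+1. Convert the original [LLaMA-7B](https://github.com/facebookresearch/llama) weights to the Hugging Face format (LLaMA-7B-HF); see the [guide](https://huggingface.co/docs/transformers/main/model_doc/llama)
+2. Merge the LLaMA-7B HF model with the ChatYuan-7B model
+### Merge script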
+```shell
+python3 apply_delta.py --base ~/model_weights/LLaMA-7B-HF --delta ~/model_weights/ChatYuan-7B --target ~/model_weights/ChatYuan-7B-merge
+```
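The merge step adds the delta weights to the base weights. The contents of `apply_delta.py` are not shown here, so the following is only a minimal sketch of that idea, using plain Python lists in place of tensors. The function name `apply_delta`, the `Weights` alias, and the parameter names in the example are assumptions made for illustration, and the sketch assumes both checkpoints expose the same parameter names with matching shapes.

```python
from typing import Dict, List

# Hypothetical stand-in for a checkpoint: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def apply_delta(base: Weights, delta: Weights) -> Weights:
    """Return a new mapping holding base + delta for every parameter."""
    if base.keys() != delta.keys():
        # Names present in only one of the two checkpoints.
        mismatched = sorted(set(base) ^ set(delta))
        raise KeyError(f"parameter names differ: {mismatched}")
    merged: Weights = {}
    for name, base_values in base.items():
        delta_values = delta[name]
        if len(base_values) != len(delta_values):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise addition recovers the merged weight.
        merged[name] = [b + d for b, d in zip(base_values, delta_values)]
    return merged


# Toy example with made-up parameter names and values.
base = {"layer.weight": [0.5, 1.0], "layer.bias": [0.0]}
delta = {"layer.weight": [0.25, -0.5], "layer.bias": [0.125]}
print(apply_delta(base, delta))
```

A real merge would iterate over the tensors of two model state dicts in the same way and then save the result to the target directory.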
+## Loading the model
+```python
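+import os
+import torch
+from transformers import AutoTokenizer, LlamaForCausalLM
+# from_pretrained does not expand "~", so resolve the home directory first.
+ckpt = os.path.expanduser("~/model_weights/ChatYuan-7B-merge")
+device = torch.device("cuda")
+# Move the model to the GPU so it sits on the same device as the inputs.
+model = LlamaForCausalLM.from_pretrained(ckpt).to(device)
+tokenizer = AutoTokenizer.from_pretrained(ckpt)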
+```
+## Inference
+```python
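+# "用户" marks the user turn and "小元" the model turn; put the user's message after "用户: ".
+prompt = "用户:  \n小元: "
+input_ids = tokenizer(prompt, return_tensors="pt").input_ids.to(device)
+# Sample up to 1024 new tokens at temperature 0.7.
+generate_ids = model.generate(input_ids, max_new_tokens=1024, do_sample=True, temperature=0.7)
+output = tokenizer.batch_decode(generate_ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
+# The decoded output starts with the prompt, so slice it off to keep only the reply.
+response = output[len(prompt):]
+print(response)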
+```
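The widget examples and the inference snippet suggest a turn-based prompt in which each user turn starts with `用户` and each model turn with `小元`. A hedged sketch of helpers for building such a prompt from a conversation history, and for cutting the prompt off the decoded output, could look like this. The helper names `build_prompt` and `extract_response`, the ASCII-colon prefixes, and the newline separator between turns are assumptions inferred from the examples, not a documented format.

```python
from typing import List, Tuple

# Turn prefixes as used in the inference snippet (assumed to be the expected format).
USER_PREFIX = "用户: "
BOT_PREFIX = "小元: "


def build_prompt(history: List[Tuple[str, str]], query: str) -> str:
    """Join earlier (user, model) turns and the new query into one prompt string."""
    lines = []
    for user_text, bot_text in history:
        lines.append(f"{USER_PREFIX}{user_text}")
        lines.append(f"{BOT_PREFIX}{bot_text}")
    lines.append(f"{USER_PREFIX}{query}")
    # End with an open model turn so generation continues from there.
    lines.append(BOT_PREFIX)
    return "\n".join(lines)


def extract_response(output: str, prompt: str) -> str:
    """Drop the echoed prompt from the decoded output, if it is present."""
    if output.startswith(prompt):
        return output[len(prompt):].strip()
    # Fall back to the full text when decoding did not reproduce the prompt exactly.
    return output.strip()


prompt = build_prompt([("你好", "很高兴为你服务。")], "请介绍一下你自己吧？")
print(prompt)
```

The resulting `prompt` can be passed to the tokenizer in place of the fixed string, and `extract_response` can replace the plain slice when the decoded text might not begin with the exact prompt.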