---
license: bigscience-bloom-rail-1.0
language:
- zh
---

# Demo

🚀 Click the link to try it out 🔗 **[http://101.68.79.42:7861/](http://101.68.79.42:7861/)**

## Introduction

1. ✅ Supervised fine-tuning (SFT) of the `bloom-7b` model. This is the V2 release, which performs noticeably better than V1.
2. 🚀 The full training and inference code is available at [https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/chinese_bloom](https://github.com/yuanzhoulvpi2017/zero_nlp/tree/main/chinese_bloom).

## How to use

```python
from typing import Optional

from transformers import AutoModelForCausalLM, AutoTokenizer

checkpoint = "yuanzhoulvpi/chinese_bloom_7b_chat_v2"
# Alternatives: "bigscience/bloomz-3b", "bigscience/bloom-7b1", "output_dir/checkpoint-8260"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForCausalLM.from_pretrained(checkpoint).half().cuda()

PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}


def generate_input(instruction: Optional[str] = None, input_str: Optional[str] = None) -> str:
    """Build an Alpaca-style prompt, with or without an extra input field."""
    if input_str is None:
        return PROMPT_DICT["prompt_no_input"].format_map({"instruction": instruction})
    return PROMPT_DICT["prompt_input"].format_map({"instruction": instruction, "input": input_str})


for i in range(5):
    print("*" * 80)
    # Move the input ids onto the same device as the model.
    inputs = tokenizer.encode(generate_input(instruction="你是谁"), return_tensors="pt").cuda()
    # Deterministic beam-search decoding; sampling-only arguments such as
    # top_k and temperature have no effect when do_sample=False, so they
    # are omitted here.
    outputs = model.generate(
        inputs,
        num_beams=3,
        max_new_tokens=512,
        do_sample=False,
        repetition_penalty=1.2,
    )
    print(tokenizer.decode(outputs[0]))
```
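
The same prompt helper also covers the instruction-plus-input case. A minimal self-contained sketch (it duplicates the templates above so it runs without downloading the model; the translation instruction is only an illustrative example):

```python
from typing import Optional

# Same Alpaca-style templates as in the usage block above.
PROMPT_DICT = {
    "prompt_input": (
        "Below is an instruction that describes a task, paired with an input that provides further context. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Input:\n{input}\n\n### Response:"
    ),
    "prompt_no_input": (
        "Below is an instruction that describes a task. "
        "Write a response that appropriately completes the request.\n\n"
        "### Instruction:\n{instruction}\n\n### Response:"
    ),
}


def generate_input(instruction: Optional[str] = None, input_str: Optional[str] = None) -> str:
    # Choose the template based on whether an extra input field is given.
    if input_str is None:
        return PROMPT_DICT["prompt_no_input"].format_map({"instruction": instruction})
    return PROMPT_DICT["prompt_input"].format_map({"instruction": instruction, "input": input_str})


# Passing input_str selects the "prompt_input" template, so the prompt
# gains an "### Input:" section between the instruction and the response.
prompt = generate_input(instruction="把下面的句子翻译成英文", input_str="你好")
print(prompt)
```

The generated string can be fed to `tokenizer.encode` exactly as in the loop above.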