Commit 5ff2bb8 by chujiezheng (1 parent: 406b9af)

Update README.md

  1. README.md +34 -1
README.md CHANGED
@@ -4,4 +4,37 @@ language:
  pipeline_tag: conversational
  tags:
  - pytorch
- ---
+ ---
+
+ [blenderbot-1B-distill](https://huggingface.co/facebook/blenderbot-1B-distill) fine-tuned on the [ESConv dataset](https://github.com/thu-coai/Emotional-Support-Conversation) and **[AugESC dataset](https://github.com/thu-coai/AugESC)**. Usage example:
+
+ ```python
+ import torch
+ from transformers import AutoTokenizer
+ from transformers.models.blenderbot import BlenderbotTokenizer, BlenderbotForConditionalGeneration
+
+ def _norm(x):
+     return ' '.join(x.strip().split())
+
+ tokenizer = BlenderbotTokenizer.from_pretrained('thu-coai/blenderbot-400M-esconv')
+ model = BlenderbotForConditionalGeneration.from_pretrained('thu-coai/blenderbot-400M-esconv')
+ model.eval()
+
+ utterances = [
+     "I am having a lot of anxiety about quitting my current job. It is too stressful but pays well",
+     "What makes your job stressful for you?",
+     "I have to deal with many people in hard financial situations and it is upsetting",
+     "Do you help your clients to make it to a better financial situation?",
+     "I do, but often they are not going to get back to what they want. Many people are going to lose their home when safeguards are lifted",
+ ]
+ input_sequence = ' '.join([' ' + e for e in utterances]) + tokenizer.eos_token # add space prefix and separate utterances with two spaces
+ input_ids = tokenizer.convert_tokens_to_ids(tokenizer.tokenize(input_sequence))[-128:]
+ input_ids = torch.LongTensor([input_ids])
+
+ model_output = model.generate(input_ids, num_beams=1, do_sample=True, top_p=0.9, num_return_sequences=5, return_dict=False)
+ generation = tokenizer.batch_decode(model_output, skip_special_tokens=True)
+ generation = [_norm(e) for e in generation]
+ print(generation)
+
+ utterances.append(generation[0]) # for future loop
+ ```
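
The input formatting in the usage example can be looked at on its own, without loading the model. The following is a minimal sketch, not part of the commit: the helper names `build_input_sequence` and `keep_last_tokens` are hypothetical, and the default `eos_token="</s>"` is an assumption standing in for `tokenizer.eos_token`.

```python
from typing import List


def build_input_sequence(utterances: List[str], eos_token: str = "</s>") -> str:
    """Mirror the README's formatting: prefix every utterance with a space,
    join with one more space (so turns end up two spaces apart), append EOS.

    The default eos_token is an assumed placeholder; pass tokenizer.eos_token
    when a real tokenizer is available.
    """
    return " ".join(" " + u for u in utterances) + eos_token


def keep_last_tokens(token_ids: List[int], max_len: int = 128) -> List[int]:
    """Left-truncate so only the most recent max_len token ids remain."""
    return token_ids[-max_len:]


history = ["Hello there", "Hi, how are you?"]
print(repr(build_input_sequence(history)))
# prints ' Hello there  Hi, how are you?</s>'
```

Truncating from the left, as the `[-128:]` slice in the example does, drops the oldest turns first and keeps the trailing EOS token, so the most recent context survives as the dialogue grows.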