shrishail committed on
Commit
02bc3ad
•
1 Parent(s): 32e4fb7

Update README.md

Files changed (1)
  1. README.md +38 -1
README.md CHANGED
@@ -1,3 +1,40 @@
1
  ---
2
- license: cc
3
  ---
1
  ---
2
+ language: "en"
3
+ tags:
4
+ - paraphrase-generation
5
+ - text-generation
6
+
7
  ---
8
+ # Simple model for Paraphrase Generation
9
+
10
+ ## Model description
11
+
12
+ A T5-based model for generating English paraphrases. It was trained on the labeled [MSRP](https://www.microsoft.com/en-us/download/details.aspx?id=52398) and [Google PAWS](https://github.com/google-research-datasets/paws) datasets.
13
+
14
+ ## How to use
15
+
16
+ ```python
17
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
18
+
19
+ tokenizer = AutoTokenizer.from_pretrained("Vamsi/T5_Paraphrase_Paws")
20
+ model = AutoModelForSeq2SeqLM.from_pretrained("Vamsi/T5_Paraphrase_Paws").to("cuda")  # model and inputs must be on the same device
21
+
22
+ sentence = "This is something which i cannot understand at all"
23
+ text = "paraphrase: " + sentence + " </s>"
24
+ encoding = tokenizer(text, padding="max_length", return_tensors="pt")  # replaces the deprecated pad_to_max_length=True
25
+ input_ids, attention_masks = encoding["input_ids"].to("cuda"), encoding["attention_mask"].to("cuda")
26
+ outputs = model.generate(
27
+     input_ids=input_ids, attention_mask=attention_masks,
28
+     max_length=256,
29
+     do_sample=True,
30
+     top_k=120,
31
+     top_p=0.95,
32
+     early_stopping=True,
33
+     num_return_sequences=5
34
+ )
35
+ for output in outputs:
36
+     line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
37
+     print(line)
38
+
39
+ ```
40
+
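
Note on the sampling arguments: `top_k=120` and `top_p=0.95` in the snippet above limit which tokens can be drawn at each generation step. The sketch below illustrates the general idea in plain Python. It is a simplified approximation and not the `transformers` implementation; the function names `top_k_top_p_filter` and `sample_token` are hypothetical, and the logits in the usage note are made-up values.

```python
import math
import random


def top_k_top_p_filter(logits, top_k=120, top_p=0.95):
    """Return {token_id: probability} after top-k, then nucleus (top-p) filtering.

    Illustrative only: assumes `logits` is a plain list of floats for one step.
    """
    # Softmax over the raw scores (subtract the max for numerical stability).
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Top-k: keep only the k most probable token ids, then renormalise.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass_k = sum(probs[i] for i in ranked)

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass_k
        if cumulative >= top_p:
            break

    # Renormalise so the surviving probabilities sum to 1.
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def sample_token(logits, top_k=120, top_p=0.95, rng=random):
    """Draw one token id from the filtered distribution."""
    filtered = top_k_top_p_filter(logits, top_k, top_p)
    ids = list(filtered)
    weights = [filtered[i] for i in ids]
    return rng.choices(ids, weights=weights, k=1)[0]
```

For example, `sample_token([2.0, 1.0, 0.0, -1.0], top_k=3, top_p=0.9)` can only return token id 0 or 1, because those two tokens already cover at least 90% of the probability mass left after the top-k cut. Lower values of either argument make the paraphrases more conservative; higher values make them more varied.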