Commit 479faae by razent (1 parent: bca7ddf)

Create README.md

  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
# SciFive Pubmed+PMC Base

## Introduction
Paper: [SciFive: a text-to-text transformer model for biomedical literature](https://arxiv.org/abs/2106.03598)

Authors: _Long N. Phan, James T. Anibal, Hieu Tran, Shaurya Chanana, Erol Bahadroglu, Alec Peltekian, Grégoire Altan-Bonnet_

## How to use
For more details, do check out [our GitHub repo](https://github.com/justinphan3110/SciFive).
```python
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("razent/SciFive-base-Pubmed_PMC")
model = AutoModelForSeq2SeqLM.from_pretrained("razent/SciFive-base-Pubmed_PMC")

# Use the GPU when one is available; the model and its inputs must be on the same device.
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

sentence = "Identification of APC2 , a homologue of the adenomatous polyposis coli tumour suppressor ."
# Prepend the task prefix to the input sentence.
text = "ncbi_ner: " + sentence + " </s>"

# `padding="max_length"` replaces the deprecated `pad_to_max_length=True`.
encoding = tokenizer(text, padding="max_length", return_tensors="pt")
input_ids = encoding["input_ids"].to(device)
attention_masks = encoding["attention_mask"].to(device)

outputs = model.generate(
    input_ids=input_ids, attention_mask=attention_masks,
    max_length=256,
    early_stopping=True
)

# Decode each generated sequence back into text.
for output in outputs:
    line = tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
    print(line)
```
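
The snippet above builds its input by hand: a task prefix (`ncbi_ner`), a colon, the sentence, and a trailing ` </s>`. When formatting many sentences, it may help to wrap that pattern in a small helper. The following is only a sketch: `format_input` and `format_batch` are hypothetical names and not part of the SciFive repository, and the only prefix taken from this README is `ncbi_ner`.

```python
def format_input(sentence: str, task_prefix: str = "ncbi_ner") -> str:
    """Build a prompt of the form "<prefix>: <sentence> </s>" (hypothetical helper)."""
    return f"{task_prefix}: {sentence.strip()} </s>"


def format_batch(sentences, task_prefix="ncbi_ner"):
    """Format several sentences at once, skipping blank ones (hypothetical helper)."""
    return [format_input(s, task_prefix) for s in sentences if s.strip()]


prompts = format_batch([
    "Identification of APC2 , a homologue of the adenomatous polyposis coli tumour suppressor .",
    "   ",  # blank entries are dropped
])
print(prompts[0])
```

The resulting list of strings can then be passed to the tokenizer in a single call, for example `tokenizer(prompts, padding=True, return_tensors="pt")`, so that several sentences are generated in one batch.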