arazd committed
Commit 9e897f3
1 Parent(s): 8c95b58

Update README.md

  1. README.md +18 -0
README.md CHANGED
@@ -1,3 +1,21 @@
  ---
  license: apache-2.0
  ---
+ This is the finetuned model presented in "MIReAD: a simple method for learning high-quality representations from
+ scientific documents" (ACL 2023).
+
+ We trained MIReAD on more than 500,000 PubMed and arXiv abstracts spanning over 2,000 journal classes. MIReAD was initialized with SciBERT weights and finetuned to predict a paper's journal class from its title and abstract. MIReAD uses SciBERT's tokenizer.
+ We show that MIReAD produces representations that can be used for similar-paper retrieval, topic categorization, and literature search.
+
+ Overall, with MIReAD you can:
+ * extract a semantically meaningful representation from a paper's abstract
+ * predict a paper's journal class from its abstract
+
+ To load the MIReAD model:
+ ```python
+ from transformers import BertForSequenceClassification, AutoTokenizer
+
+ mpath = 'arazd/miread'  # model ID on the Hugging Face Hub
+ model_hub = BertForSequenceClassification.from_pretrained(mpath)  # finetuned journal classifier
+ tokenizer = AutoTokenizer.from_pretrained(mpath)  # SciBERT's tokenizer
+ ```
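
The two uses listed in the README (extracting a representation and predicting a journal class) might look like the sketch below. It is not part of this commit. It assumes that the input is the title and abstract joined by the tokenizer's `[SEP]` token, that the final-layer `[CLS]` vector serves as the document representation, and that cosine similarity is a reasonable metric for similar-paper retrieval; the README as committed here states none of these. The helper names `build_input_text`, `embed_and_classify`, `cosine_similarity`, and `rank_similar` are hypothetical.

```python
import math


def build_input_text(title, abstract, sep_token="[SEP]"):
    """Join title and abstract with the separator token (assumed input format)."""
    return f"{title}{sep_token}{abstract}"


def embed_and_classify(texts, mpath="arazd/miread", max_length=512):
    """Return ([CLS] embeddings, predicted class indices) for a list of strings."""
    # Imported lazily so the pure-Python helpers below work without PyTorch installed.
    import torch
    from transformers import AutoTokenizer, BertForSequenceClassification

    tokenizer = AutoTokenizer.from_pretrained(mpath)
    model = BertForSequenceClassification.from_pretrained(mpath)
    model.eval()

    inputs = tokenizer(
        texts,
        padding=True,
        truncation=True,
        max_length=max_length,
        return_tensors="pt",
    )
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)

    # Assumption: the final-layer vector at position 0 ([CLS]) is the document embedding.
    embeddings = out.hidden_states[-1][:, 0, :]
    # Index of the highest-scoring journal class for each input.
    predicted = out.logits.argmax(dim=-1)
    return embeddings.tolist(), predicted.tolist()


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank_similar(query_vec, corpus_vecs):
    """Return corpus indices ordered from most to least similar to the query."""
    scores = [cosine_similarity(query_vec, v) for v in corpus_vecs]
    return sorted(range(len(corpus_vecs)), key=lambda i: scores[i], reverse=True)
```

Calling `embed_and_classify([build_input_text(title, abstract)])` downloads the weights from the Hub on first use; the returned embeddings can then be passed to `rank_similar` to order a set of papers by similarity to a query paper.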