fbaigt committed 62bad82 (parent: 4eec3a1): Add a model card

Files changed (1): README.md (+30 -0)

README.md ADDED:
---
language:
- en
datasets:
- pubmed
- chemical patent
- recipe
---

## Proc-RoBERTa
Proc-RoBERTa is a pre-trained language model for procedural text. It was built by fine-tuning RoBERTa-base on a 1.05B-token procedural corpus of PubMed articles, chemical patents, and cooking recipes. More details can be found in the following [paper](https://arxiv.org/abs/2109.04711):

```bibtex
@article{Bai2021PretrainOA,
  title={Pre-train or Annotate? Domain Adaptation with a Constrained Budget},
  author={Fan Bai and Alan Ritter and Wei Xu},
  journal={ArXiv},
  year={2021},
  volume={abs/2109.04711}
}
```
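
Since the model was pre-trained with RoBERTa's masked-language-modeling objective, one quick way to probe it is to fill in masked tokens in procedural text. The sketch below assumes the hub checkpoint includes the masked-LM head used during pre-training (typical for continued-pretraining checkpoints); the example sentence is purely illustrative.

```python
from transformers import pipeline

# Hypothetical sanity check: assumes the checkpoint ships with the
# masked-LM head from pre-training; the procedural sentence is illustrative.
fill_mask = pipeline("fill-mask", model="fbaigt/proc_roberta")
for pred in fill_mask("Centrifuge the sample at 5,000 rpm for ten <mask>."):
    print(pred["token_str"], round(pred["score"], 3))
```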

## Usage
```python
from transformers import AutoTokenizer, AutoModelForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("fbaigt/proc_roberta")
model = AutoModelForTokenClassification.from_pretrained("fbaigt/proc_roberta")
```
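
As a quick smoke test after loading, one can run a forward pass as sketched below. Note that the token-classification head attached here is freshly initialized (the hub checkpoint is a pre-trained LM), so its per-token predictions only become meaningful after fine-tuning on labeled data; the input sentence is illustrative.

```python
import torch

# Illustrative forward pass. The classification head loaded above is newly
# initialized, so these logits are untrained until the model is fine-tuned.
inputs = tokenizer("Stir the mixture gently for five minutes.", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits  # shape: (batch, seq_len, num_labels)
print(logits.shape)
```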

More usage details can be found [here](https://github.com/bflashcp3f/ProcBERT).