---
language:
- en
tags:
- Machine Learning
- Research Papers
- Scientific Language Model
license: apache-2.0
---

## MLRoBERTa (RoBERTa pretrained on ML Papers)

## How to use:
```python
from transformers import AutoTokenizer, AutoModel

tok = AutoTokenizer.from_pretrained('shrutisingh/MLRoBERTa')
model = AutoModel.from_pretrained('shrutisingh/MLRoBERTa')
```

## Pretraining Details:
This is a RoBERTa model pretrained on scientific documents. The dataset comprises paper titles and abstracts from NeurIPS (1987-2019), CVPR (2013-2020), ICLR (2016-2020), and the ACL Anthology (up to 2019), along with ICLR paper reviews.

## Citation:
```
@inproceedings{singh2021compare,
  title={COMPARE: a taxonomy and dataset of comparison discussions in peer reviews},
  author={Singh, Shruti and Singh, Mayank and Goyal, Pawan},
  booktitle={2021 ACM/IEEE Joint Conference on Digital Libraries (JCDL)},
  pages={238--241},
  year={2021},
  organization={IEEE}
}
```
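## Example: Sentence Embeddings

`AutoModel` returns per-token hidden states, so a common way to get one vector per input is mean pooling over non-padding tokens. This is a minimal sketch (not part of the official model card); the dummy tensors below stand in for `model(**tok(...)).last_hidden_state` and the tokenizer's `attention_mask`:

```python
import torch

def mean_pool(last_hidden_state, attention_mask):
    # Average token embeddings, ignoring padding positions.
    mask = attention_mask.unsqueeze(-1).float()
    return (last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1e-9)

# Dummy stand-ins for real model outputs:
# shape (batch, seq_len, hidden_size); real MLRoBERTa hidden_size differs.
hidden = torch.ones(1, 4, 8)
mask = torch.tensor([[1, 1, 1, 0]])  # last token is padding
emb = mean_pool(hidden, mask)        # shape (1, 8)
```

With the real model, pass `tok(text, return_tensors='pt')` through `model` and feed `outputs.last_hidden_state` together with the encoding's `attention_mask` into `mean_pool`.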