metadata
license: mpl-2.0
CosmicRoBERTa
This model is a version of RoBERTa further pre-trained on a domain-specific corpus that includes abstracts from the NTRS library and from SCOPUS, among other sources.
The further pre-training corpus includes publication abstracts, books, and Wikipedia pages related to space systems. The corpus size is 14.3 GB. SpaceRoBERTa was further pre-trained on this domain-specific corpus, starting from RoBERTa-Base. In our paper, it is then fine-tuned for a Concept Recognition task.
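Because the model is a further pre-trained RoBERTa, it can presumably be queried as a masked language model. The following is a minimal sketch, not an official usage example: it assumes the Hugging Face `transformers` library is installed and that the checkpoint is published under the Hub identifier `icelab/cosmicroberta`, which this card does not state. The helper names `build_masked_prompt` and `top_predictions` are hypothetical.

```python
MASK_TOKEN = "<mask>"  # mask token used by RoBERTa tokenizers
MODEL_ID = "icelab/cosmicroberta"  # ASSUMPTION: Hub identifier, not given in this card


def build_masked_prompt(text: str, target: str, mask_token: str = MASK_TOKEN) -> str:
    """Replace the first occurrence of `target` in `text` with the mask token."""
    if target not in text:
        raise ValueError(f"{target!r} does not occur in the input text")
    return text.replace(target, mask_token, 1)


def top_predictions(prompt: str, model_id: str = MODEL_ID, k: int = 5):
    """Return the k most likely fillers for the single mask in `prompt`.

    Downloads the checkpoint on first use, so it needs network access.
    """
    # Imported lazily so the prompt helper works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=model_id)
    results = fill_mask(prompt, top_k=k)
    return [(r["token_str"].strip(), float(r["score"])) for r in results]
```

For example, `top_predictions(build_masked_prompt("The satellite entered orbit.", "orbit"))` would return a list of `(token, score)` pairs. Swapping `MODEL_ID` for `roberta-base` gives a baseline to compare the domain-specific predictions against.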
BibTeX entry and citation info
@ARTICLE{9548078,
author={Berquand, Audrey and Darm, Paul and Riccardi, Annalisa},
journal={IEEE Access},
title={SpaceTransformers: Language Modeling for Space Systems},
year={2021},
volume={9},
number={},
pages={133111-133122},
doi={10.1109/ACCESS.2021.3115659}
}