---
license: mpl-2.0
---

### CosmicRoBERTa

This model is a further pre-trained version of RoBERTa on a domain-specific corpus, including abstracts from the NTRS library and from SCOPUS, among other sources. The further pre-training corpus consists of publication abstracts, books, and Wikipedia pages related to space systems, with a total size of 14.3 GB. CosmicRoBERTa was further pre-trained on this domain-specific corpus from [RoBERTa-Base](https://huggingface.co/roberta-base). In our paper, it is then fine-tuned for a Concept Recognition task.

### BibTeX entry and citation info

```
@ARTICLE{9548078,
  author={Berquand, Audrey and Darm, Paul and Riccardi, Annalisa},
  journal={IEEE Access},
  title={SpaceTransformers: Language Modeling for Space Systems},
  year={2021},
  volume={9},
  number={},
  pages={133111-133122},
  doi={10.1109/ACCESS.2021.3115659}
}
```
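
### Usage example

As a minimal sketch, the snippet below shows how a further pre-trained RoBERTa checkpoint such as this one could be loaded for masked-token prediction with the 🤗 Transformers library. The repository id `icelab/cosmicroberta` is an assumption and should be replaced with the actual Hub identifier of this model.

```
from transformers import pipeline

# NOTE: "icelab/cosmicroberta" is an assumed Hub repository id;
# replace it with the identifier under which this checkpoint is published.
fill_mask = pipeline("fill-mask", model="icelab/cosmicroberta")

# RoBERTa-based models use "<mask>" as the mask token.
predictions = fill_mask("The spacecraft attitude is controlled by <mask> wheels.")

for p in predictions:
    print(p["token_str"], p["score"])
```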