---
license: mit
widget:
- text: "The closest planet to earth is <mask>."
- text: "Electrical power is stored on a spacecraft with <mask>."
---
### CosmicRoBERTa
CosmicRoBERTa is a RoBERTa model further pre-trained on a domain-specific space science corpus, which includes abstracts from the NTRS library, abstracts from SCOPUS, ECSS requirements, and other sources from this domain.
The pre-training corpus totals around 75 million words.
On a subset (60% of the total data set) of the CR task presented in our paper [SpaceTransformers: Language Modeling for Space Systems](https://ieeexplore.ieee.org/document/9548078), the model reaches a slightly higher weighted score than SpaceRoBERTa and a higher one than the RoBERTa baseline:
| | RoBERTa | CosmicRoBERTa | SpaceRoBERTa |
|-----------------------------------------------|----------------|---------------------|---------------------|
| Parameter | 0.475 | 0.515 | 0.485 |
| GN&C | 0.488 | 0.609 | 0.602 |
| System engineering | 0.523 | 0.559 | 0.555 |
| Propulsion | 0.403 | 0.521 | 0.465 |
| Project Scope | 0.493 | 0.541 | 0.497 |
| OBDH | 0.717 | 0.789 | 0.794 |
| Thermal | 0.432 | 0.509 | 0.491 |
| Quality control | 0.686 | 0.704 | 0.678 |
| Telecom. | 0.360 | 0.614 | 0.557 |
| Measurement | 0.833 | 0.849 | 0.858 |
| Structure & Mechanism | 0.489 | 0.581 | 0.566 |
| Space Environment | 0.543 | 0.681 | 0.605 |
| Cleanliness | 0.616 | 0.621 | 0.651 |
| Project Organisation / Documentation | 0.355 | 0.427 | 0.429 |
| Power | 0.638 | 0.735 | 0.661 |
| Safety / Risk (Control) | 0.647 | 0.727 | 0.676 |
| Materials / EEEs | 0.585 | 0.642 | 0.639 |
| Nonconformity | 0.365 | 0.333 | 0.419 |
| Weighted | 0.584 | 0.652 (+7%) | 0.633 (+5%) |
| Valid. Loss | 0.605 | 0.505 | 0.542 |
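
The widget prompts in the metadata above can also be run locally. The following is a minimal sketch, assuming the model is published on the Hugging Face Hub under the id `icelab/cosmicroberta` (this README does not state the repository id, so adjust it to the actual location) and that the `transformers` library is installed. The helper names `top_tokens` and `run_demo` are illustrative only.

```python
from typing import Dict, List

# Assumption: Hub repository id; it is not stated in this README.
MODEL_ID = "icelab/cosmicroberta"

# The two example prompts from the widget metadata. RoBERTa uses "<mask>".
PROMPTS = [
    "The closest planet to earth is <mask>.",
    "Electrical power is stored on a spacecraft with <mask>.",
]


def top_tokens(predictions: List[Dict], k: int = 3) -> List[str]:
    """Return the k highest-scoring token strings from fill-mask output."""
    ranked = sorted(predictions, key=lambda p: p["score"], reverse=True)
    # RoBERTa tokens often carry a leading space, so strip it for display.
    return [p["token_str"].strip() for p in ranked[:k]]


def run_demo() -> None:
    """Download the model and print the top predictions for each prompt."""
    # Imported here so the helper above works without transformers installed.
    from transformers import pipeline

    fill_mask = pipeline("fill-mask", model=MODEL_ID)
    for prompt in PROMPTS:
        print(prompt, "->", top_tokens(fill_mask(prompt)))
```

Calling `run_demo()` downloads the model weights on first use and prints the top three candidate tokens for each prompt.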
### BibTeX entry and citation info
```
@ARTICLE{9548078,
author={Berquand, Audrey and Darm, Paul and Riccardi, Annalisa},
journal={IEEE Access},
title={SpaceTransformers: Language Modeling for Space Systems},
year={2021},
volume={9},
number={},
pages={133111-133122},
doi={10.1109/ACCESS.2021.3115659}
}
```