---
license: mit
widget:
- text: "The closest planet to Earth is <mask>."
- text: "Electrical power is stored on a spacecraft with <mask>."
---

### CosmicRoBERTa

CosmicRoBERTa is a version of RoBERTa that has been further pre-trained on a domain-specific corpus for space science. The corpus includes abstracts from the NTRS library, abstracts from SCOPUS, ECSS requirements, and other sources from this domain.
In total, the pre-training corpus contains around 75 million words.
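
As a pre-trained masked language model, CosmicRoBERTa can be queried directly for masked-token predictions. The following is a minimal sketch using the `fill-mask` pipeline from the `transformers` library, with prompts adapted from the widget examples of this model card. The repository id `icelab/cosmicroberta` and the helper name `predict_masked` are assumptions made for illustration; replace the id with that of the repository you are actually using.

```python
# Prompts adapted from the widget examples of this model card.
PROMPTS = [
    "The closest planet to Earth is <mask>.",
    "Electrical power is stored on a spacecraft with <mask>.",
]

# Assumption: the Hub repository id of this model; adjust if it differs.
MODEL_ID = "icelab/cosmicroberta"


def predict_masked(prompts=PROMPTS, model_id=MODEL_ID, top_k=3):
    """Return the top_k (token, score) pairs predicted for each prompt."""
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the model weights from the Hub on first use.
    fill_mask = pipeline("fill-mask", model=model_id)

    results = {}
    for prompt in prompts:
        # Each prediction is a dict with "token_str" and "score" keys.
        predictions = fill_mask(prompt, top_k=top_k)
        results[prompt] = [
            (p["token_str"].strip(), p["score"]) for p in predictions
        ]
    return results
```

Calling `predict_masked()` returns a dictionary that maps each prompt to its three highest-scoring completions. RoBERTa-style tokenizers use `<mask>` as the mask token, so each prompt contains it exactly once.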

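On a subset (60% of the total data set) of the CR task presented in our paper [SpaceTransformers: Language Modeling for Space Systems](https://ieeexplore.ieee.org/document/9548078), the model achieves a slightly better weighted score than both RoBERTa and SpaceRoBERTa. Per-class scores are listed below; for the validation loss, lower is better.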

| Class                                | RoBERTa | CosmicRoBERTa | SpaceRoBERTa |
|--------------------------------------|---------|---------------|--------------|
| Parameter                            | 0.475   | 0.515         | 0.485        |
| GN&C                                 | 0.488   | 0.609         | 0.602        |
| System engineering                   | 0.523   | 0.559         | 0.555        |
| Propulsion                           | 0.403   | 0.521         | 0.465        |
| Project Scope                        | 0.493   | 0.541         | 0.497        |
| OBDH                                 | 0.717   | 0.789         | 0.794        |
| Thermal                              | 0.432   | 0.509         | 0.491        |
| Quality control                      | 0.686   | 0.704         | 0.678        |
| Telecom.                             | 0.360   | 0.614         | 0.557        |
| Measurement                          | 0.833   | 0.849         | 0.858        |
| Structure & Mechanism                | 0.489   | 0.581         | 0.566        |
| Space Environment                    | 0.543   | 0.681         | 0.605        |
| Cleanliness                          | 0.616   | 0.621         | 0.651        |
| Project Organisation / Documentation | 0.355   | 0.427         | 0.429        |
| Power                                | 0.638   | 0.735         | 0.661        |
| Safety / Risk (Control)              | 0.647   | 0.727         | 0.676        |
| Materials / EEEs                     | 0.585   | 0.642         | 0.639        |
| Nonconformity                        | 0.365   | 0.333         | 0.419        |
| Weighted                             | 0.584   | 0.652 (+7%)   | 0.633 (+5%)  |
| Valid. Loss                          | 0.605   | 0.505         | 0.542        |
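
To make the comparison easier to check, the short sketch below recomputes a few summary figures from the table values. It is only an illustration: the variable names are hypothetical, and reading the "+7%" and "+5%" annotations as absolute differences in percentage points is an assumption that happens to match the rounded numbers.

```python
# Per-class scores copied from the table: (RoBERTa, CosmicRoBERTa, SpaceRoBERTa).
SCORES = {
    "Parameter": (0.475, 0.515, 0.485),
    "GN&C": (0.488, 0.609, 0.602),
    "System engineering": (0.523, 0.559, 0.555),
    "Propulsion": (0.403, 0.521, 0.465),
    "Project Scope": (0.493, 0.541, 0.497),
    "OBDH": (0.717, 0.789, 0.794),
    "Thermal": (0.432, 0.509, 0.491),
    "Quality control": (0.686, 0.704, 0.678),
    "Telecom.": (0.360, 0.614, 0.557),
    "Measurement": (0.833, 0.849, 0.858),
    "Structure & Mechanism": (0.489, 0.581, 0.566),
    "Space Environment": (0.543, 0.681, 0.605),
    "Cleanliness": (0.616, 0.621, 0.651),
    "Project Organisation / Documentation": (0.355, 0.427, 0.429),
    "Power": (0.638, 0.735, 0.661),
    "Safety / Risk (Control)": (0.647, 0.727, 0.676),
    "Materials / EEEs": (0.585, 0.642, 0.639),
    "Nonconformity": (0.365, 0.333, 0.419),
}

# Weighted scores from the "weighted" row of the table.
WEIGHTED = {"RoBERTa": 0.584, "CosmicRoBERTa": 0.652, "SpaceRoBERTa": 0.633}

# Classes where CosmicRoBERTa scores strictly higher than both other models.
cosmic_best = [
    name
    for name, (base, cosmic, space) in SCORES.items()
    if cosmic > base and cosmic > space
]

# Classes where CosmicRoBERTa scores below the RoBERTa baseline.
below_baseline = [
    name for name, (base, cosmic, _) in SCORES.items() if cosmic < base
]

# Gain over RoBERTa on the weighted score, in rounded percentage points.
gain_points = {
    model: round((score - WEIGHTED["RoBERTa"]) * 100)
    for model, score in WEIGHTED.items()
    if model != "RoBERTa"
}

print(f"CosmicRoBERTa is best on {len(cosmic_best)} of {len(SCORES)} classes")
print(f"Below the RoBERTa baseline on: {below_baseline}")
print(f"Weighted gain over RoBERTa (points): {gain_points}")
```

On these numbers, CosmicRoBERTa has the highest score on 13 of the 18 classes and falls below the RoBERTa baseline only on Nonconformity.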


### BibTeX entry and citation info

```
@ARTICLE{9548078,
  author={Berquand, Audrey and Darm, Paul and Riccardi, Annalisa},
  journal={IEEE Access},
  title={SpaceTransformers: Language Modeling for Space Systems},
  year={2021},
  volume={9},
  number={},
  pages={133111-133122},
  doi={10.1109/ACCESS.2021.3115659}
}
```