## BabyBERTa

### Overview

BabyBERTa is a lightweight version of RoBERTa trained on 5M words of American-English child-directed input.

It is intended for language acquisition research and can be trained on a single desktop with a single GPU; no high-performance computing infrastructure is needed.
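
As a quick illustration of how the model might be queried, here is a minimal masked-language-modeling sketch using the `transformers` library. The hub identifier `phueb/BabyBERTa-1` and the `add_prefix_space=True` tokenizer setting are assumptions; check the repository linked below for the exact model name and recommended settings.

```python
# Minimal usage sketch. The hub identifier "phueb/BabyBERTa-1" and
# add_prefix_space=True are assumptions; see the BabyBERTa repository.
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizerFast

tokenizer = RobertaTokenizerFast.from_pretrained("phueb/BabyBERTa-1", add_prefix_space=True)
model = RobertaForMaskedLM.from_pretrained("phueb/BabyBERTa-1")
model.eval()

# BabyBERTa is not case-sensitive, so the query is lower-cased here.
inputs = tokenizer("the dog chased the <mask> .", return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits

# Locate the masked position and print the model's top 5 guesses for it.
mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
top_ids = logits[0, mask_pos].topk(5).indices[0].tolist()
print([tokenizer.decode([i]).strip() for i in top_ids])
```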

### Performance
The provided model is the best-performing of 10 models evaluated on the [Zorro](https://github.com/phueb/Zorro) test suite. It was trained for 400K steps and achieves an overall accuracy of 80.3, comparable to RoBERTa-base, which achieves 82.6 on the latest version of Zorro (as of October 2021).

The latter value is slightly higher than the one reported in the paper (Huebner et al., 2020) because the authors previously lower-cased all words in Zorro before evaluation. Lower-casing proper nouns is detrimental to RoBERTa-base, which has likely been exposed to title-cased proper nouns during pre-training. Because BabyBERTa is not case-sensitive, its performance is not affected by this change.
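
For intuition, Zorro consists of minimal pairs, and a model is credited for a pair when it scores the grammatical sentence above its ungrammatical counterpart. The sketch below illustrates one common way to score such pairs with a masked language model, pseudo-log-likelihood scoring; it is an illustration only, not the official Zorro evaluation code, and it reuses the assumed `phueb/BabyBERTa-1` identifier from above.

```python
# Illustrative minimal-pair scoring, NOT the official Zorro evaluation code.
# Scores a sentence by masking one token at a time and summing the
# log-probability the model assigns to each original token.
import torch
from transformers import RobertaForMaskedLM, RobertaTokenizerFast

MODEL = "phueb/BabyBERTa-1"  # assumed hub identifier; see the repository
tokenizer = RobertaTokenizerFast.from_pretrained(MODEL, add_prefix_space=True)
model = RobertaForMaskedLM.from_pretrained(MODEL)
model.eval()

def pseudo_log_likelihood(sentence: str) -> float:
    ids = tokenizer(sentence, return_tensors="pt")["input_ids"][0]
    total = 0.0
    for pos in range(1, len(ids) - 1):  # skip <s> and </s>
        masked = ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(masked.unsqueeze(0)).logits[0, pos]
        total += torch.log_softmax(logits, dim=-1)[ids[pos]].item()
    return total

# The model "passes" the pair when the grammatical sentence scores higher.
good = pseudo_log_likelihood("the dogs near the house are barking .")
bad = pseudo_log_likelihood("the dogs near the house is barking .")
print("correct" if good > bad else "incorrect")
```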

### Additional Information

This model was trained by [Philip Huebner](https://philhuebner.com), currently at the [UIUC Language and Learning Lab](http://www.learninglanguagelab.org).

More information can be found in the [BabyBERTa repository](https://github.com/phueb/BabyBERTa).