This project pretrains a [`roberta-base`](https://huggingface.co/roberta-base) model on the *Alemannic* (`als`) subset of the [OSCAR](https://oscar-corpus.com/) corpus, using JAX/Flax.

We will be using the masked language modeling (MLM) loss for pretraining.
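
Below is a minimal sketch of the data and model setup, assuming the Hugging Face `datasets` and `transformers` libraries. The OSCAR config name `unshuffled_deduplicated_als` and the reuse of the pretrained `roberta-base` tokenizer are illustrative choices, not necessarily what this repo uses; in practice one would typically train a byte-level BPE tokenizer on the `als` data itself.

```python
import jax.numpy as jnp
from datasets import load_dataset
from transformers import (
    DataCollatorForLanguageModeling,
    FlaxRobertaForMaskedLM,
    RobertaConfig,
    RobertaTokenizerFast,
)

# Load the Alemannic (`als`) subset of OSCAR (config name is an assumption).
raw_dataset = load_dataset("oscar", "unshuffled_deduplicated_als", split="train")

# For illustration, reuse the pretrained `roberta-base` tokenizer.
tokenizer = RobertaTokenizerFast.from_pretrained("roberta-base")

# Instantiate a randomly initialized roberta-base-sized model in Flax.
config = RobertaConfig.from_pretrained("roberta-base")
model = FlaxRobertaForMaskedLM(config, seed=0, dtype=jnp.float32)

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=512)

tokenized = raw_dataset.map(tokenize, batched=True, remove_columns=["text"])

# MLM objective: the collator randomly masks a fraction of input tokens
# (15% here), and the model is trained to reconstruct the originals.
data_collator = DataCollatorForLanguageModeling(
    tokenizer=tokenizer, mlm_probability=0.15, return_tensors="np"
)
```

The training loop itself (optimizer, `pmap`-ed train step, checkpointing) would follow the standard Flax language-modeling example scripts.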