Aleksandar
RoBERTa model called SRoBERTa, trained on the WOL dataset (OSCAR + Leipzig + srWac) for the Serbian language. Attention heads distilled: 6, batch size: 64, group size: 64, epochs: 2, test split: 0.05 (~1 mil groups).
4e75a39
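
The group size and test split named in the commit message could be illustrated roughly as follows. This is only a sketch under assumptions, not the author's training script: it assumes "group size 64" means concatenated token ids are cut into fixed-length groups of 64 tokens, and that "test split 0.05" means 5% of those groups are held out. The helper names `group_tokens` and `split_groups` and the dummy token ids are hypothetical.

```python
import random

GROUP_SIZE = 64    # "group size 64" from the commit message (assumed: tokens per group)
TEST_SPLIT = 0.05  # "test split 0.05" from the commit message


def group_tokens(token_ids, group_size=GROUP_SIZE):
    """Cut a flat list of token ids into fixed-length groups, dropping the remainder."""
    usable = len(token_ids) - len(token_ids) % group_size
    return [token_ids[i:i + group_size] for i in range(0, usable, group_size)]


def split_groups(groups, test_fraction=TEST_SPLIT, seed=0):
    """Shuffle the groups and hold out a fraction of them as a test set."""
    shuffled = list(groups)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Dummy token ids standing in for a tokenized corpus (hypothetical data).
token_ids = list(range(64 * 100 + 10))
groups = group_tokens(token_ids)
train_groups, test_groups = split_groups(groups)
print(len(groups), len(train_groups), len(test_groups))
```

With real data, the token ids would come from the model's tokenizer applied to the corpus; the grouping and hold-out logic would stay the same under the assumptions above.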