---
license: apache-2.0
datasets:
- tay-yozhik/NaturalText
language:
- ru
---

# NaturalRoBERTa

This is a pre-trained [RoBERTa](https://arxiv.org/abs/1907.11692)-type model. NaturalRoBERTa is built on a dataset collected from open sources: three news sub-corpora of [Taiga](https://github.com/TatianaShavrina/taiga_site) (Lenta.ru, Interfax, N+1) and [Russian Wikipedia texts](https://ru.wikipedia.org/).

# Evaluation

This model was evaluated on the [RussianSuperGLUE](https://russiansuperglue.com/) benchmark:

| Task    | Result        | Metrics                          |
|---------|---------------|----------------------------------|
| LiDiRus | 0.0           | Matthews Correlation Coefficient |
| RCB     | 0.217 / 0.484 | F1 / Accuracy                    |
| PARus   | 0.498         | Accuracy                         |
| TERRa   | 0.487         | Accuracy                         |
| RUSSE   | 0.587         | Accuracy                         |
| RWSD    | 0.669         | Accuracy                         |
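
# How to use

A minimal usage sketch with the Hugging Face `transformers` library is shown below. The repository id `tay-yozhik/NaturalRoBERTa` is an assumption based on the dataset namespace; substitute the actual model id if it differs.

```python
# Minimal sketch: load the checkpoint as a standard RoBERTa masked language model.
# The repo id "tay-yozhik/NaturalRoBERTa" is an assumption and may need adjusting.
from transformers import pipeline

fill_mask = pipeline(
    "fill-mask",
    model="tay-yozhik/NaturalRoBERTa",  # hypothetical repo id
)

# Predict the masked token in a Russian sentence.
for prediction in fill_mask("Столица России — <mask>."):
    print(prediction["token_str"], prediction["score"])
```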