This is the cointegrated/rubert-tiny2 model fine-tuned for classification of emotions in Russian sentences. The task is multilabel classification, because one sentence can contain multiple emotions.
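Because the task is multilabel, each label's score is read independently (a sigmoid per label) rather than through a softmax over all labels. The following is a minimal sketch of that decoding step, not the author's code. The 0.5 threshold, the helper names `decode_multilabel` and `predict_emotions`, and the use of `model.config.id2label` for label names are assumptions made for illustration; the model id is the one shown on this page.

```python
import math


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, labels, threshold=0.5):
    """Turn per-label logits into {label: probability} for labels at or above threshold.

    Each label is scored on its own, so zero, one, or several labels may be returned.
    The threshold value is an assumption, not something stated in the model card.
    """
    probs = {label: sigmoid(z) for label, z in zip(labels, logits)}
    return {label: p for label, p in probs.items() if p >= threshold}


def predict_emotions(text,
                     model_name="cointegrated/rubert-tiny2-cedr-emotion-detection",
                     threshold=0.5):
    """Hypothetical helper: run the model on one sentence and decode its logits.

    Requires the `torch` and `transformers` packages and downloads the model weights.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    model.eval()
    with torch.no_grad():
        inputs = tokenizer(text, return_tensors="pt", truncation=True)
        logits = model(**inputs).logits[0].tolist()
    # Label names are read from the model config instead of being hard-coded.
    labels = [model.config.id2label[i] for i in range(len(logits))]
    return decode_multilabel(logits, labels, threshold)
```

Calling `predict_emotions("Какой чудесный день!")` would return a dictionary of the labels whose probability reaches the threshold; a lower threshold trades precision for recall.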

The model was fine-tuned on the CEDR dataset described in the paper "Data-Driven Model for Emotion Detection in Russian Texts" by Sboev et al.

The model has been trained with the Adam optimizer for 40 epochs with a learning rate of 1e-5 and a batch size of 64 in this notebook.
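The hyperparameters above can be collected into a small configuration object and plugged into a plain PyTorch loop. This is a sketch under stated assumptions, not a copy of the notebook: the binary cross-entropy loss (`BCEWithLogitsLoss`), the `TrainConfig` and `train` names, and the expected batch layout (`input_ids`, `attention_mask`, multi-hot `labels`, already padded to equal length) are all hypothetical choices that fit a multilabel task.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters quoted in the model card."""
    epochs: int = 40
    learning_rate: float = 1e-5
    batch_size: int = 64

    def steps_per_epoch(self, n_examples: int) -> int:
        # The last, possibly smaller, batch still counts as one optimizer step.
        return math.ceil(n_examples / self.batch_size)

    def total_steps(self, n_examples: int) -> int:
        return self.epochs * self.steps_per_epoch(n_examples)


def train(model, dataset, cfg: TrainConfig = TrainConfig()):
    """Hypothetical training loop; requires `torch`.

    `dataset` is assumed to yield dicts of equally sized tensors with keys
    `input_ids`, `attention_mask`, and `labels` (one 0/1 entry per emotion).
    """
    import torch
    from torch.utils.data import DataLoader

    loader = DataLoader(dataset, batch_size=cfg.batch_size, shuffle=True)
    optimizer = torch.optim.Adam(model.parameters(), lr=cfg.learning_rate)
    # Assumed loss: independent sigmoid + cross-entropy per label.
    loss_fn = torch.nn.BCEWithLogitsLoss()

    model.train()
    for _ in range(cfg.epochs):
        for batch in loader:
            logits = model(input_ids=batch["input_ids"],
                           attention_mask=batch["attention_mask"]).logits
            loss = loss_fn(logits, batch["labels"].float())
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
    return model
```

With these settings, a training set of `n` sentences gives `TrainConfig().total_steps(n)` optimizer updates in total.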

The quality of the predicted probabilities on the test dataset is the following:

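| label    | no emotion | joy    | sadness | surprise | fear   | anger  | mean   | mean (emotions) |
|----------|------------|--------|---------|----------|--------|--------|--------|-----------------|
| AUC      | 0.9286     | 0.9512 | 0.9564  | 0.8908   | 0.8955 | 0.7511 | 0.8956 | 0.8890          |
| F1 micro | 0.8624     | 0.9389 | 0.9362  | 0.9469   | 0.9575 | 0.9261 | 0.9280 | 0.9411          |
| F1 macro | 0.8562     | 0.8962 | 0.9017  | 0.8366   | 0.8359 | 0.6820 | 0.8348 | 0.8305          |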
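The two summary columns are consistent with plain unweighted averages of the per-label scores: "mean" over all six labels, and "mean (emotions)" over the five labels other than "no emotion". The short check below recomputes them from the reported per-label numbers; the helper name `summarize` is only illustrative.

```python
LABELS = ["no emotion", "joy", "sadness", "surprise", "fear", "anger"]

# Per-label scores copied from the results above, in LABELS order.
SCORES = {
    "AUC":      [0.9286, 0.9512, 0.9564, 0.8908, 0.8955, 0.7511],
    "F1 micro": [0.8624, 0.9389, 0.9362, 0.9469, 0.9575, 0.9261],
    "F1 macro": [0.8562, 0.8962, 0.9017, 0.8366, 0.8359, 0.6820],
}


def summarize(per_label):
    """Return (mean over all labels, mean over emotion labels), rounded to 4 places."""
    mean_all = sum(per_label) / len(per_label)
    emotions = per_label[1:]  # skip the first entry, "no emotion"
    mean_emotions = sum(emotions) / len(emotions)
    return round(mean_all, 4), round(mean_emotions, 4)


for metric, values in SCORES.items():
    mean_all, mean_emotions = summarize(values)
    print(f"{metric}: mean={mean_all:.4f}, mean (emotions)={mean_emotions:.4f}")
```

The recomputed values match the reported summary columns, which also makes it easy to see that "anger" is the label pulling both averages down.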

Dataset used to train cointegrated/rubert-tiny2-cedr-emotion-detection: CEDR.