This is the cointegrated/rubert-tiny2 model fine-tuned for classification of emotions in Russian sentences. The task is multilabel classification, because one sentence can contain multiple emotions.
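Because the task is multilabel, each emotion gets its own independent probability (a sigmoid per label) rather than a single softmax over all labels. The following is a minimal sketch of how raw model outputs could be decoded under that assumption. The `LABELS` order is assumed from the order in which the labels appear in the results reported below, and the `decode_multilabel` helper and the 0.5 threshold are hypothetical illustrations, not part of the model.

```python
import math

# Assumption: label order follows the order used in the reported results.
LABELS = ["no emotion", "joy", "sadness", "surprise", "fear", "anger"]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_multilabel(logits, labels=LABELS, threshold=0.5):
    """Turn one sentence's logits into a list of (label, probability) pairs.

    Each label is scored independently, so zero, one, or several labels
    may pass the (hypothetical) threshold for the same sentence.
    """
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    probs = [sigmoid(z) for z in logits]
    return [(label, p) for label, p in zip(labels, probs) if p >= threshold]


# Made-up logits for illustration only; real values would come from the model.
example_logits = [-3.0, 2.0, -1.0, 1.5, -4.0, -2.0]
detected = decode_multilabel(example_logits)
print([label for label, _ in detected])
```

With these made-up logits, two labels pass the threshold at once, which is the situation a single-label classifier could not represent.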

The model was trained on the CEDR dataset described in the paper "Data-Driven Model for Emotion Detection in Russian Texts" by Sboev et al.

The model was trained with the Adam optimizer for 40 epochs, with a learning rate of 1e-5 and a batch size of 64, in this notebook.

The quality of the predicted probabilities on the test dataset is the following:

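| Metric   | no emotion | joy    | sadness | surprise | fear   | anger  | mean   | mean (emotions) |
|----------|------------|--------|---------|----------|--------|--------|--------|-----------------|
| AUC      | 0.9286     | 0.9512 | 0.9564  | 0.8908   | 0.8955 | 0.7511 | 0.8956 | 0.8890          |
| F1 micro | 0.8624     | 0.9389 | 0.9362  | 0.9469   | 0.9575 | 0.9261 | 0.9280 | 0.9411          |
| F1 macro | 0.8562     | 0.8962 | 0.9017  | 0.8366   | 0.8359 | 0.6820 | 0.8348 | 0.8305          |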
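
The card does not state how the per-label F1 scores were computed. One common reading, assumed here, is that each label is treated as its own binary problem (present or absent), with micro and macro averaging taken over those two classes. Under that assumption micro F1 equals plain accuracy, while macro F1 weights the rare "present" class equally, which would explain why macro F1 falls well below micro F1 for infrequent emotions. The sketch below illustrates that reading on made-up data; the function names are hypothetical.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score treating `cls` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def binary_f1_micro_macro(y_true, y_pred):
    """Micro and macro F1 over the two classes (0 = absent, 1 = present)."""
    tp = fp = fn = 0
    for cls in (0, 1):
        for t, p in zip(y_true, y_pred):
            if t == cls and p == cls:
                tp += 1
            elif t != cls and p == cls:
                fp += 1
            elif t == cls and p != cls:
                fn += 1
    denom = 2 * tp + fp + fn
    micro = 2 * tp / denom if denom else 0.0
    macro = (f1_for_class(y_true, y_pred, 0) + f1_for_class(y_true, y_pred, 1)) / 2
    return micro, macro


# Made-up labels for one rare emotion: present in 2 of 6 sentences, one missed.
y_true = [1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 0, 0]
micro, macro = binary_f1_micro_macro(y_true, y_pred)
print(round(micro, 4), round(macro, 4))
```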