---
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
- emotion-detection
datasets:
- seara/ru_go_emotions
---

This is a [RuBERT](https://huggingface.co/DeepPavlov/rubert-base-cased) model fine-tuned for __emotion classification__ of short __Russian__ texts.
The task is __multi-label classification__ with the following labels:

```yaml
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
```

Mapping from English labels to their Russian equivalents:

```yaml
admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
```
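The mapping above can be applied to the pipeline output so that predictions are shown with Russian label names. The snippet below is a minimal sketch: `RU_LABELS` and `to_russian` are illustrative names, not part of the model or of `transformers`, and the sample prediction is copied from the Usage section.

```python
# English -> Russian label names, copied from the table above.
RU_LABELS = {
    "admiration": "восхищение",
    "amusement": "веселье",
    "anger": "злость",
    "annoyance": "раздражение",
    "approval": "одобрение",
    "caring": "забота",
    "confusion": "непонимание",
    "curiosity": "любопытство",
    "desire": "желание",
    "disappointment": "разочарование",
    "disapproval": "неодобрение",
    "disgust": "отвращение",
    "embarrassment": "смущение",
    "excitement": "возбуждение",
    "fear": "страх",
    "gratitude": "признательность",
    "grief": "горе",
    "joy": "радость",
    "love": "любовь",
    "nervousness": "нервозность",
    "optimism": "оптимизм",
    "pride": "гордость",
    "realization": "осознание",
    "relief": "облегчение",
    "remorse": "раскаяние",
    "sadness": "грусть",
    "surprise": "удивление",
    "neutral": "нейтральность",
}


def to_russian(predictions):
    """Return a copy of the predictions with labels replaced by Russian names.

    Labels missing from RU_LABELS are left unchanged.
    """
    return [{**p, "label": RU_LABELS.get(p["label"], p["label"])} for p in predictions]


# Sample prediction in the format shown in the Usage section.
sample = [{"label": "love", "score": 0.5456761717796326}]
print(to_russian(sample))
# [{'label': 'любовь', 'score': 0.5456761717796326}]
```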

## Usage

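```python
from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub
model = pipeline(model="seara/rubert-base-cased-ru-go-emotions")

# Input text means "Hi, I like you!"; by default only the top label is returned
model("Привет, ты мне нравишься!")
# [{'label': 'love', 'score': 0.5456761717796326}]
```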
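Because the task is multi-label, more than one emotion can apply to a text, while the call above returns only the single highest-scoring label. The sketch below shows one way to keep every label above a threshold. `select_labels`, the `0.3` threshold, and the scores in `example_scores` are illustrative assumptions, not values produced by or recommended for this model.

```python
from typing import List


def select_labels(scores: List[dict], threshold: float = 0.5) -> List[str]:
    """Return the labels whose score is at least `threshold`, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


# Made-up scores in the list-of-dicts shape used by the pipeline output.
example_scores = [
    {"label": "admiration", "score": 0.31},
    {"label": "love", "score": 0.55},
    {"label": "neutral", "score": 0.04},
]

print(select_labels(example_scores, threshold=0.3))
# ['love', 'admiration']
```

With the real model, per-label scores can be requested by passing `top_k=None` to the pipeline call, for example `model("Привет, ты мне нравишься!", top_k=None)`. The exact nesting of the returned list may differ between `transformers` versions, so it is worth checking the output shape before passing it to a helper like this.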

## Dataset

This model was trained on a Russian translation of the GoEmotions dataset called [ru_go_emotions](https://huggingface.co/datasets/seara/ru_go_emotions).

An overview of the training data can be found on the [Hugging Face dataset card](https://huggingface.co/datasets/seara/ru_go_emotions) and in the
[GitHub repository](https://github.com/searayeah/ru-goemotions).

## Training

Training was done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with the following parameters:

```yaml
tokenizer.max_length: null
batch_size: 32
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 5
```
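To illustrate how these hyperparameters fit into a multi-label training step, here is a minimal PyTorch sketch. It is not the code from the linked project: the linear layer is a stand-in for the classification head, the random features stand in for encoder outputs, and the use of `BCEWithLogitsLoss` is an assumption (a common choice for multi-label targets) that the parameter list above does not confirm.

```python
import torch
from torch import nn

NUM_LABELS = 28    # label ids 0..27 from the list above
HIDDEN_SIZE = 768  # hidden size of BERT-base models such as RuBERT
BATCH_SIZE = 32    # batch_size from the parameters above

torch.manual_seed(0)

# Stand-in for the classification head on top of the encoder (assumption).
classifier = nn.Linear(HIDDEN_SIZE, NUM_LABELS)

# optimizer: adam, lr: 0.00001, weight_decay: 0
optimizer = torch.optim.Adam(classifier.parameters(), lr=1e-5, weight_decay=0)

# Assumed loss: an independent sigmoid + binary cross-entropy per label.
loss_fn = nn.BCEWithLogitsLoss()

# Random features in place of real encoder outputs.
features = torch.randn(BATCH_SIZE, HIDDEN_SIZE)

# Multi-hot targets: each row may have several labels set to 1.
targets = torch.zeros(BATCH_SIZE, NUM_LABELS)
targets[:, 17] = 1.0  # every example is tagged "joy"
targets[0, 18] = 1.0  # the first example is also tagged "love"

# One optimization step.
logits = classifier(features)
loss = loss_fn(logits, targets)
optimizer.zero_grad()
loss.backward()
optimizer.step()

print(f"loss on this batch: {loss.item():.4f}")
```

In an actual run this step would be repeated over all batches for `num_epochs: 5`, with the features coming from the RuBERT encoder instead of random tensors.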

## Eval results (on test split)

|              |precision|recall|f1-score|auc-roc|support|
|--------------|---------|------|--------|-------|-------|
|admiration    |0.66     |0.66  |0.66    |0.93   |504    |
|amusement     |0.79     |0.81  |0.8     |0.97   |264    |
|anger         |0.53     |0.3   |0.39    |0.91   |198    |
|annoyance     |0.0      |0.0   |0.0     |0.82   |320    |
|approval      |0.62     |0.25  |0.36    |0.82   |351    |
|caring        |0.69     |0.13  |0.22    |0.86   |135    |
|confusion     |0.56     |0.18  |0.28    |0.92   |153    |
|curiosity     |0.52     |0.4   |0.45    |0.95   |284    |
|desire        |0.67     |0.24  |0.35    |0.89   |83     |
|disappointment|0.88     |0.05  |0.09    |0.82   |151    |
|disapproval   |0.56     |0.17  |0.26    |0.88   |267    |
|disgust       |0.83     |0.2   |0.33    |0.92   |123    |
|embarrassment |0.0      |0.0   |0.0     |0.88   |37     |
|excitement    |0.78     |0.14  |0.23    |0.9    |103    |
|fear          |0.83     |0.37  |0.51    |0.92   |78     |
|gratitude     |0.94     |0.9   |0.92    |0.99   |352    |
|grief         |0.0      |0.0   |0.0     |0.72   |6      |
|joy           |0.7      |0.4   |0.51    |0.94   |161    |
|love          |0.77     |0.81  |0.79    |0.97   |238    |
|nervousness   |0.0      |0.0   |0.0     |0.85   |23     |
|optimism      |0.66     |0.52  |0.58    |0.92   |186    |
|pride         |0.0      |0.0   |0.0     |0.76   |16     |
|realization   |0.0      |0.0   |0.0     |0.74   |145    |
|relief        |0.0      |0.0   |0.0     |0.72   |11     |
|remorse       |0.58     |0.68  |0.63    |0.99   |56     |
|sadness       |0.58     |0.44  |0.5     |0.92   |156    |
|surprise      |0.62     |0.45  |0.52    |0.91   |141    |
|neutral       |0.72     |0.47  |0.57    |0.84   |1787   |
|micro avg     |0.7      |0.42  |0.53    |0.94   |6329   |
|macro avg     |0.52     |0.31  |0.36    |0.88   |6329   |
|weighted avg  |0.63     |0.42  |0.49    |0.88   |6329   |
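The `macro avg` and `weighted avg` rows can be reproduced from the per-label rows: the macro average is the plain mean over the 28 labels, and the weighted average weights each label by its support. The sketch below checks this for the f1-score column using the values from the table; the variable names are illustrative.

```python
# (f1-score, support) for each label, copied from the table above.
rows = {
    "admiration": (0.66, 504), "amusement": (0.80, 264), "anger": (0.39, 198),
    "annoyance": (0.0, 320), "approval": (0.36, 351), "caring": (0.22, 135),
    "confusion": (0.28, 153), "curiosity": (0.45, 284), "desire": (0.35, 83),
    "disappointment": (0.09, 151), "disapproval": (0.26, 267), "disgust": (0.33, 123),
    "embarrassment": (0.0, 37), "excitement": (0.23, 103), "fear": (0.51, 78),
    "gratitude": (0.92, 352), "grief": (0.0, 6), "joy": (0.51, 161),
    "love": (0.79, 238), "nervousness": (0.0, 23), "optimism": (0.58, 186),
    "pride": (0.0, 16), "realization": (0.0, 145), "relief": (0.0, 11),
    "remorse": (0.63, 56), "sadness": (0.50, 156), "surprise": (0.52, 141),
    "neutral": (0.57, 1787),
}

total_support = sum(support for _, support in rows.values())

# Macro average: every label counts equally, regardless of how often it occurs.
macro_f1 = sum(f1 for f1, _ in rows.values()) / len(rows)

# Weighted average: each label's f1 is weighted by its share of the support.
weighted_f1 = sum(f1 * support for f1, support in rows.values()) / total_support

print(total_support)          # 6329
print(round(macro_f1, 2))     # 0.36
print(round(weighted_f1, 2))  # 0.49
```

The gap between the two averages reflects the label imbalance: rare labels such as `grief`, `pride`, and `relief` have an f1-score of 0 and pull the macro average down, while the weighted average is dominated by frequent labels such as `neutral`.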