---
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- tiny
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
datasets:
- seara/ru_go_emotions
---

This is a [RuBERT-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for __emotion classification__ of short __Russian__ texts.
The task is __multi-label classification__ with the following labels:

```yaml
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
```

English labels with their Russian translations:

```yaml
admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
```
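
If the Russian label names are needed in application output, the mapping above can be applied to the pipeline result. This is a minimal sketch, assuming predictions arrive in the `[{'label': ..., 'score': ...}]` shape shown in the Usage section; `RU_LABELS` and `to_russian` are illustrative names and are not part of the model or of `transformers`.

```python
# English label -> Russian label, copied from the mapping above.
RU_LABELS = {
    "admiration": "восхищение", "amusement": "веселье", "anger": "злость",
    "annoyance": "раздражение", "approval": "одобрение", "caring": "забота",
    "confusion": "непонимание", "curiosity": "любопытство", "desire": "желание",
    "disappointment": "разочарование", "disapproval": "неодобрение",
    "disgust": "отвращение", "embarrassment": "смущение",
    "excitement": "возбуждение", "fear": "страх", "gratitude": "признательность",
    "grief": "горе", "joy": "радость", "love": "любовь",
    "nervousness": "нервозность", "optimism": "оптимизм", "pride": "гордость",
    "realization": "осознание", "relief": "облегчение", "remorse": "раскаяние",
    "sadness": "грусть", "surprise": "удивление", "neutral": "нейтральность",
}


def to_russian(predictions):
    """Return a copy of the predictions with labels translated to Russian.

    Labels missing from RU_LABELS are left unchanged.
    """
    return [{**p, "label": RU_LABELS.get(p["label"], p["label"])} for p in predictions]


# Prediction taken from the Usage example in this card.
print(to_russian([{"label": "love", "score": 0.5955629944801331}]))
```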

## Usage

```python
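from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
model = pipeline(model="seara/rubert-tiny2-ru-go-emotions")

# "Привет, ты мне нравишься!" means "Hi, I like you!"
model("Привет, ты мне нравишься!")
# [{'label': 'love', 'score': 0.5955629944801331}]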
```
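
Because the task is multi-label, a text can carry several emotions at once, while the call above returns only the top-scoring one. The sketch below shows one way to keep every label whose score passes a threshold. It assumes the per-label scores were obtained with `top_k=None`, a `transformers` text-classification pipeline argument that returns scores for all labels. The helper `select_labels`, the 0.5 threshold, and every score except the `love` value are illustrative assumptions, not values produced by or tuned for this model.

```python
def select_labels(scores, threshold=0.5):
    """Keep labels scoring at or above the threshold, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    return sorted(kept, key=lambda s: s["score"], reverse=True)


# With the pipeline from the example above, the scores would come from:
#   scores = model("Привет, ты мне нравишься!", top_k=None)
# (the exact nesting of the result may differ between transformers versions).
# Hypothetical scores for illustration; only the 'love' value is from this card.
scores = [
    {"label": "love", "score": 0.5955629944801331},
    {"label": "admiration", "score": 0.52},
    {"label": "neutral", "score": 0.10},
]

print([s["label"] for s in select_labels(scores)])
# ['love', 'admiration']
```

The threshold is a design choice: lowering it trades precision for recall, which may matter for the labels with low recall in the evaluation table.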

## Dataset

This model was trained on [ru_go_emotions](https://huggingface.co/datasets/seara/ru_go_emotions), a Russian translation of the GoEmotions dataset.

An overview of the training data can be found on the [Hugging Face dataset card](https://huggingface.co/datasets/seara/ru_go_emotions) and in the
[GitHub repository](https://github.com/searayeah/ru-goemotions).

## Training

Training was done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with the following parameters:

```yaml
tokenizer.max_length: null
batch_size: 64
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 31
```

## Eval results (on test split)

|              |precision|recall|f1-score|auc-roc|support|
|--------------|---------|------|--------|-------|-------|
|admiration    |0.68     |0.61  |0.64    |0.92   |504    |
|amusement     |0.8      |0.84  |0.82    |0.96   |264    |
|anger         |0.55     |0.33  |0.42    |0.9    |198    |
|annoyance     |0.56     |0.03  |0.06    |0.81   |320    |
|approval      |0.6      |0.18  |0.28    |0.78   |351    |
|caring        |0.5      |0.04  |0.07    |0.84   |135    |
|confusion     |0.77     |0.07  |0.12    |0.9    |153    |
|curiosity     |0.51     |0.34  |0.41    |0.92   |284    |
|desire        |0.71     |0.18  |0.29    |0.88   |83     |
|disappointment|0.0      |0.0   |0.0     |0.76   |151    |
|disapproval   |0.48     |0.1   |0.17    |0.85   |267    |
|disgust       |0.94     |0.12  |0.22    |0.9    |123    |
|embarrassment |0.0      |0.0   |0.0     |0.84   |37     |
|excitement    |0.81     |0.2   |0.33    |0.88   |103    |
|fear          |0.73     |0.42  |0.54    |0.92   |78     |
|gratitude     |0.95     |0.89  |0.92    |0.99   |352    |
|grief         |0.0      |0.0   |0.0     |0.76   |6      |
|joy           |0.66     |0.52  |0.58    |0.93   |161    |
|love          |0.8      |0.79  |0.79    |0.97   |238    |
|nervousness   |0.0      |0.0   |0.0     |0.81   |23     |
|optimism      |0.67     |0.41  |0.51    |0.89   |186    |
|pride         |0.0      |0.0   |0.0     |0.89   |16     |
|realization   |0.0      |0.0   |0.0     |0.7    |145    |
|relief        |0.0      |0.0   |0.0     |0.84   |11     |
|remorse       |0.59     |0.71  |0.65    |0.99   |56     |
|sadness       |0.77     |0.37  |0.5     |0.89   |156    |
|surprise      |0.59     |0.35  |0.44    |0.88   |141    |
|neutral       |0.64     |0.58  |0.61    |0.81   |1787   |
|micro avg     |0.68     |0.43  |0.53    |0.93   |6329   |
|macro avg     |0.51     |0.29  |0.33    |0.87   |6329   |
|weighted avg  |0.62     |0.43  |0.48    |0.86   |6329   |
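
The f1-score column is the harmonic mean of precision and recall, and the macro average is conventionally the unweighted mean over the 28 labels, so labels with an F1 of zero pull it well below the micro average. The sketch below reproduces that arithmetic from the rounded values in the table; because the inputs are rounded, some rows may differ from the table in the last digit. `f1` and `per_label_f1` are illustrative names.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported (rounded) in the table above.
print(round(f1(0.68, 0.61), 2))  # admiration -> 0.64
print(round(f1(0.68, 0.43), 2))  # micro avg  -> 0.53
print(f1(0.0, 0.0))              # e.g. grief -> 0.0

# Per-label f1-scores from the table, in label order (admiration ... neutral).
per_label_f1 = [
    0.64, 0.82, 0.42, 0.06, 0.28, 0.07, 0.12, 0.41, 0.29, 0.0,
    0.17, 0.22, 0.0, 0.33, 0.54, 0.92, 0.0, 0.58, 0.79, 0.0,
    0.51, 0.0, 0.0, 0.0, 0.65, 0.5, 0.44, 0.61,
]
# Macro average: plain mean over labels, regardless of support.
print(round(sum(per_label_f1) / len(per_label_f1), 2))  # -> 0.33
```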