---

license: mit
datasets:
- SkyWater21/ru_go_emotions_ekman
- seara/ru_go_emotions
language:
- ru
---

[rubert-base-cased](https://huggingface.co/DeepPavlov/rubert-base-cased) fine-tuned for multi-label emotion classification of Russian text.

The model was trained on the [ru_go_emotions_ekman](https://huggingface.co/datasets/SkyWater21/ru_go_emotions_ekman) dataset, a Russian translation of the [GoEmotions](https://huggingface.co/datasets/go_emotions) dataset.
The comments were originally translated to Russian in [seara/ru_go_emotions](https://huggingface.co/datasets/seara/ru_go_emotions), using Google Translate for the machine translation.

The original 27 emotions from GoEmotions were mapped to the 6 basic emotions of Ekman's theory; `neutral` is kept as a separate label.

Labels predicted by classifier:
```yaml
0: anger
1: disgust
2: fear
3: joy
4: sadness
5: surprise
6: neutral
```
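
Because the task is multi-label, each label is scored independently, so a text can receive several labels or none. The sketch below shows one common way to turn the 7 output logits into label names: apply a sigmoid per label and keep everything above a cutoff. The 0.5 threshold, the helper names, and the example logits are assumptions for illustration; they are not taken from the training code.

```python
import math

# Label order copied from the list above.
ID2LABEL = {
    0: "anger",
    1: "disgust",
    2: "fear",
    3: "joy",
    4: "sadness",
    5: "surprise",
    6: "neutral",
}


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def decode_logits(logits, threshold=0.5):
    """Return every label whose sigmoid probability reaches the threshold.

    The threshold of 0.5 is an assumed default, not a value from the model card.
    """
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    return [ID2LABEL[i] for i, z in enumerate(logits) if sigmoid(z) >= threshold]


# Made-up logits for a single text, in label-id order.
example_logits = [-2.1, -3.0, -1.5, 1.2, -0.4, 0.3, -1.0]
print(decode_logits(example_logits))
```

Raising the threshold trades recall for precision; a per-label threshold tuned on a validation split is a frequent alternative.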

Mapping of the 27 GoEmotions emotions (plus `neutral`) to the 6 basic Ekman emotions:

|GoEmotions|Ekman|
|---|---|
| admiration | joy|
| amusement | joy|
| anger | anger|
| annoyance | anger|
| approval | joy|
| caring | joy|
| confusion | surprise|
| curiosity | surprise|
| desire | joy|
| disappointment | sadness|
| disapproval | anger|
| disgust | disgust|
| embarrassment | sadness|
| excitement | joy|
| fear | fear|
| gratitude | joy|
| grief | sadness|
| joy | joy|
| love | joy|
| nervousness | fear|
| optimism | joy|
| pride | joy|
| realization | surprise|
| relief | joy|
| remorse | sadness|
| sadness | sadness|
| surprise | surprise|
| neutral | neutral|
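
The table above can be written as a lookup, which is handy when relabelling GoEmotions annotations yourself. This is a minimal sketch: the dictionary contents follow the table, while the names `EKMAN_TO_GOEMOTIONS`, `GOEMOTIONS_TO_EKMAN`, and `to_ekman` are hypothetical and not part of the dataset or model code.

```python
# Ekman label -> GoEmotions labels, transcribed from the table above.
EKMAN_TO_GOEMOTIONS = {
    "anger": ["anger", "annoyance", "disapproval"],
    "disgust": ["disgust"],
    "fear": ["fear", "nervousness"],
    "joy": [
        "admiration", "amusement", "approval", "caring", "desire",
        "excitement", "gratitude", "joy", "love", "optimism",
        "pride", "relief",
    ],
    "sadness": ["disappointment", "embarrassment", "grief", "remorse", "sadness"],
    "surprise": ["confusion", "curiosity", "realization", "surprise"],
    "neutral": ["neutral"],
}

# Inverted view: GoEmotions label -> Ekman label.
GOEMOTIONS_TO_EKMAN = {
    fine: coarse
    for coarse, fine_labels in EKMAN_TO_GOEMOTIONS.items()
    for fine in fine_labels
}


def to_ekman(go_labels):
    """Map a list of GoEmotions labels to a de-duplicated list of Ekman labels.

    The result follows the key order of EKMAN_TO_GOEMOTIONS.
    """
    mapped = {GOEMOTIONS_TO_EKMAN[label] for label in go_labels}
    return [label for label in EKMAN_TO_GOEMOTIONS if label in mapped]


print(to_ekman(["admiration", "gratitude", "annoyance"]))
```

Several fine-grained labels can collapse into the same Ekman label, so the output may be shorter than the input.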

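The random seed was set to 42:
```python
import random

import numpy as np
import torch


def set_seed(seed=42):
    """Seed the Python, NumPy and PyTorch random number generators."""
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if torch.cuda.is_available():
        # Seed every visible GPU as well.
        torch.cuda.manual_seed_all(seed)
```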

Training parameters:
```yaml
max_length: null
batch_size: 32
shuffle: True
num_workers: 2
pin_memory: False
drop_last: False

optimizer: adam
lr: 0.00001
weight_decay: 0

problem_type: multi_label_classification

num_epochs: 4
```
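
In the Transformers library, `problem_type: multi_label_classification` makes the sequence classification head train with a binary cross-entropy loss over logits, so each of the 7 labels is treated as its own yes/no decision. The following plain-Python sketch computes that loss for a single example to show what is being optimized. It is an illustration under that assumption, not the training script, and the function name and inputs are made up.

```python
import math


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy over labels, computed from raw logits.

    Uses the stable form max(z, 0) - z*y + log(1 + exp(-|z|)), which equals
    -[y*log(sigmoid(z)) + (1 - y)*log(1 - sigmoid(z))] for each label.
    """
    if len(logits) != len(targets):
        raise ValueError("logits and targets must have the same length")
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


# Multi-hot target: the text is labelled with both joy (3) and surprise (5).
targets = [0, 0, 0, 1, 0, 1, 0]

uninformed = [0.0] * 7                                # every label at p = 0.5
confident = [-4.0, -4.0, -4.0, 4.0, -4.0, 4.0, -4.0]  # matches the target

print(bce_with_logits(uninformed, targets))
print(bce_with_logits(confident, targets))
```

With all logits at zero the loss equals ln 2 per label; logits that agree with the multi-hot target drive it towards zero.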


Evaluation results on the test split of [ru_go_emotions_ekman](https://huggingface.co/datasets/SkyWater21/ru_go_emotions_ekman/viewer/simplified_ekman/test):

|              |Precision|Recall|F1-Score|AUC-ROC|Support|
|--------------|---------|------|--------|-------|-------|
|anger         |     0.56|  0.44|    0.49|   0.86|   726|
|disgust       |     0.65|  0.24|    0.36|   0.92|   123|
|fear          |     0.64|  0.60|    0.62|   0.93|    98|
|joy           |     0.79|  0.80|    0.80|   0.91|  2104|
|sadness       |     0.68|  0.44|    0.53|   0.89|   379|
|surprise      |     0.60|  0.52|    0.56|   0.88|   677|
|neutral       |     0.65|  0.58|    0.61|   0.82|  1787|
|micro avg     |     0.69|  0.62|    0.65|   0.92|  5894|
|macro avg     |     0.65|  0.52|    0.57|   0.89|  5894|
|weighted avg  |     0.69|  0.62|    0.65|   0.87|  5894|
|samples avg   |     0.65|  0.64|    0.64|    nan|  5894|
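
The macro and weighted averages in the table can be checked from the per-class rows. The sketch below does this for the F1 column using the rounded values shown above, so it is only accurate to about two decimals; the names `REPORT`, `macro_average`, and `weighted_average` are hypothetical helpers. The micro and samples averages need the underlying prediction counts and cannot be recomputed this way.

```python
# (F1-score, support) per class, copied from the table above.
REPORT = {
    "anger": (0.49, 726),
    "disgust": (0.36, 123),
    "fear": (0.62, 98),
    "joy": (0.80, 2104),
    "sadness": (0.53, 379),
    "surprise": (0.56, 677),
    "neutral": (0.61, 1787),
}


def macro_average(report):
    """Unweighted mean of the per-class scores: every class counts equally."""
    return sum(score for score, _ in report.values()) / len(report)


def weighted_average(report):
    """Mean of the per-class scores weighted by class support."""
    total_support = sum(support for _, support in report.values())
    return sum(score * support for score, support in report.values()) / total_support


print(round(macro_average(REPORT), 2))
print(round(weighted_average(REPORT), 2))
```

The macro average is pulled down by the rare classes (`disgust`, `fear`), while the weighted average is dominated by `joy` and `neutral`, which together account for about two thirds of the support.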