seara committed on
Commit
818e3b4
1 Parent(s): a7504e1

Update README.md

Files changed (1)
  1. README.md +154 -3
README.md CHANGED
@@ -1,8 +1,159 @@
1
  ---
2
  license: mit
3
- datasets:
4
- - seara/ru_go_emotions
5
  language:
6
  - ru
7
  pipeline_tag: text-classification
8
- ---
1
  ---
2
  license: mit
3
  language:
4
  - ru
5
+ metrics:
6
+ - f1
7
+ - roc_auc
8
+ - precision
9
+ - recall
10
  pipeline_tag: text-classification
11
+ tags:
12
+ - sentiment-analysis
13
+ - multi-label-classification
14
+ - sentiment analysis
15
+ - rubert
16
+ - sentiment
17
+ - bert
18
+ - tiny
19
+ - russian
20
+ - multilabel
21
+ - classification
22
+ - emotion-classification
23
+ - emotion-recognition
24
+ - emotion
25
+ datasets:
26
+ - seara/ru_go_emotions
27
+ - go_emotions
28
+ ---
29
+
30
+ This is a [RuBERT-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for __emotion classification__ of short __Russian__ texts.
31
+ The task is __multi-label classification__ with the following labels:
32
+
33
+ ```yaml
34
+ 0: admiration
35
+ 1: amusement
36
+ 2: anger
37
+ 3: annoyance
38
+ 4: approval
39
+ 5: caring
40
+ 6: confusion
41
+ 7: curiosity
42
+ 8: desire
43
+ 9: disappointment
44
+ 10: disapproval
45
+ 11: disgust
46
+ 12: embarrassment
47
+ 13: excitement
48
+ 14: fear
49
+ 15: gratitude
50
+ 16: grief
51
+ 17: joy
52
+ 18: love
53
+ 19: nervousness
54
+ 20: optimism
55
+ 21: pride
56
+ 22: realization
57
+ 23: relief
58
+ 24: remorse
59
+ 25: sadness
60
+ 26: surprise
61
+ 27: neutral
62
+ ```
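Because the task is multi-label, a single text can carry several of these labels at once. The sketch below shows one common way to represent that as a multi-hot vector over the 28 indices listed above. The helper names `encode` and `decode` are hypothetical and are not taken from the model or the training project.

```python
# Label names in index order, copied from the list above.
LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]


def encode(names):
    """Turn a collection of label names into a 28-element multi-hot list."""
    active = set(names)
    unknown = active - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if label in active else 0 for label in LABELS]


def decode(vector):
    """Turn a multi-hot list back into label names, in index order."""
    return [LABELS[i] for i, flag in enumerate(vector) if flag]


vector = encode(["joy", "love"])
print(decode(vector))
```

Positions 17 and 18 of `vector` are set here, matching `joy` and `love` in the index list.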
63
+
64
+ English labels and their Russian translations:
65
+
66
+ ```yaml
67
+ admiration: восхищение
68
+ amusement: веселье
69
+ anger: злость
70
+ annoyance: раздражение
71
+ approval: одобрение
72
+ caring: забота
73
+ confusion: непонимание
74
+ curiosity: любопытство
75
+ desire: желание
76
+ disappointment: разочарование
77
+ disapproval: неодобрение
78
+ disgust: отвращение
79
+ embarrassment: смущение
80
+ excitement: возбуждение
81
+ fear: страх
82
+ gratitude: признательность
83
+ grief: горе
84
+ joy: радость
85
+ love: любовь
86
+ nervousness: нервозность
87
+ optimism: оптимизм
88
+ pride: гордость
89
+ realization: осознание
90
+ relief: облегчение
91
+ remorse: раскаяние
92
+ sadness: грусть
93
+ surprise: удивление
94
+ neutral: нейтральность
95
+ ```
96
+
97
+ ## Usage
98
+
99
+ ```python
100
+ from transformers import pipeline
101
+ model = pipeline(model="seara/rubert-tiny2-ru-go-emotions")
102
+ model("Привет, ты мне нравишься!")
103
+ # [{'label': 'love', 'score': 0.5955629944801331}]
104
+ ```
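The call above returns only the top-scoring label. For a multi-label model it can be useful to look at every label's score and keep those above a cutoff. The following is a minimal sketch of that idea: the helper names and the `0.5` threshold are assumptions for illustration, not values from this model card, and the scores you get depend on how the pipeline post-processes the model's outputs.

```python
def select_labels(scores, threshold=0.5):
    """Keep every label whose score reaches the threshold, highest first.

    `scores` is a list of {"label": str, "score": float} dicts, which is the
    shape the text-classification pipeline returns for one input.
    """
    kept = [s for s in scores if s["score"] >= threshold]
    return sorted(kept, key=lambda s: s["score"], reverse=True)


def predict_all(text, threshold=0.5):
    """Hypothetical wrapper: score all labels for one text, then threshold."""
    # Imported here so that select_labels can be used without transformers.
    from transformers import pipeline

    clf = pipeline(
        "text-classification",
        model="seara/rubert-tiny2-ru-go-emotions",
    )
    # top_k=None asks the pipeline for scores of all labels; passing a list
    # of texts yields one list of score dicts per text.
    all_scores = clf([text], top_k=None)[0]
    return select_labels(all_scores, threshold)
```

Calling `predict_all("Привет, ты мне нравишься!")` downloads the model on first use. Lowering `threshold` trades precision for recall, which may matter for the low-recall labels in the evaluation table.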
105
+
106
+ ## Dataset
107
+
108
+ This model was trained on a translated version of the GoEmotions dataset called [Ru-GoEmotions](https://huggingface.co/datasets/seara/ru_go_emotions).
109
+
110
+ An overview of the training data can be found on the dataset's Hugging Face card and in the [GitHub repository](https://github.com/searayeah/Ru-GoEmotions).
111
+
112
+ ## Training
113
+
114
+ Training was done in this [project](https://github.com/searayeah/vkr-bert) with these parameters:
115
+
116
+ ```yaml
117
+ tokenizer.max_length: null
118
+ batch_size: 64
119
+ optimizer: adam
120
+ lr: 0.00001
121
+ weight_decay: 0
122
+ num_epochs: 31
123
+ ```
124
+
125
+ ## Eval results (on test split)
126
+
127
+ | |precision|recall|f1-score|auc-roc|support|
128
+ |--------------|---------|------|--------|-------|-------|
129
+ |admiration |0.68 |0.61 |0.64 |0.92 |504 |
130
+ |amusement |0.8 |0.84 |0.82 |0.96 |264 |
131
+ |anger |0.55 |0.33 |0.42 |0.9 |198 |
132
+ |annoyance |0.56 |0.03 |0.06 |0.81 |320 |
133
+ |approval |0.6 |0.18 |0.28 |0.78 |351 |
134
+ |caring |0.5 |0.04 |0.07 |0.84 |135 |
135
+ |confusion |0.77 |0.07 |0.12 |0.9 |153 |
136
+ |curiosity |0.51 |0.34 |0.41 |0.92 |284 |
137
+ |desire |0.71 |0.18 |0.29 |0.88 |83 |
138
+ |disappointment|0.0 |0.0 |0.0 |0.76 |151 |
139
+ |disapproval |0.48 |0.1 |0.17 |0.85 |267 |
140
+ |disgust |0.94 |0.12 |0.22 |0.9 |123 |
141
+ |embarrassment |0.0 |0.0 |0.0 |0.84 |37 |
142
+ |excitement |0.81 |0.2 |0.33 |0.88 |103 |
143
+ |fear |0.73 |0.42 |0.54 |0.92 |78 |
144
+ |gratitude |0.95 |0.89 |0.92 |0.99 |352 |
145
+ |grief |0.0 |0.0 |0.0 |0.76 |6 |
146
+ |joy |0.66 |0.52 |0.58 |0.93 |161 |
147
+ |love |0.8 |0.79 |0.79 |0.97 |238 |
148
+ |nervousness |0.0 |0.0 |0.0 |0.81 |23 |
149
+ |optimism |0.67 |0.41 |0.51 |0.89 |186 |
150
+ |pride |0.0 |0.0 |0.0 |0.89 |16 |
151
+ |realization |0.0 |0.0 |0.0 |0.7 |145 |
152
+ |relief |0.0 |0.0 |0.0 |0.84 |11 |
153
+ |remorse |0.59 |0.71 |0.65 |0.99 |56 |
154
+ |sadness |0.77 |0.37 |0.5 |0.89 |156 |
155
+ |surprise |0.59 |0.35 |0.44 |0.88 |141 |
156
+ |neutral |0.64 |0.58 |0.61 |0.81 |1787 |
157
+ |micro avg |0.68 |0.43 |0.53 |0.93 |6329 |
158
+ |macro avg |0.51 |0.29 |0.33 |0.87 |6329 |
159
+ |weighted avg |0.62 |0.43 |0.48 |0.86 |6329 |
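As a reading aid for the table above, the f1-score column is the harmonic mean of precision and recall. The sketch below recomputes it from the rounded table values for a few rows. The function name `f1` is illustrative, and because the inputs are already rounded to two decimals, other rows may differ from the table in the last digit.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs copied from the table above.
rows = {
    "admiration": (0.68, 0.61),
    "gratitude": (0.95, 0.89),
    "grief": (0.0, 0.0),
    "micro avg": (0.68, 0.43),
}

for name, (p, r) in rows.items():
    print(f"{name}: {f1(p, r):.2f}")
# admiration: 0.64
# gratitude: 0.92
# grief: 0.00
# micro avg: 0.53
```

Rows with a recall of 0.0, such as `grief` or `pride`, mean the model produced no true positives for that label on the test split. Most of these labels also have small support.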