---
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
- emotion-detection
datasets:
- seara/ru_go_emotions
---
This is a [RuBERT](https://huggingface.co/DeepPavlov/rubert-base-cased) model fine-tuned for __emotion classification__ of short __Russian__ texts.
The task is __multi-label classification__ over the following labels:
```yaml
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
```
Mapping from English labels to their Russian equivalents:
```yaml
admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
```
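
The two listings above can be combined into a small lookup. The following is a minimal sketch, assuming predictions arrive as a 28-element multi-hot vector in the index order listed above; `LABELS` and `decode` are hypothetical helper names, not part of the model's API.

```python
# English and Russian label names, in the index order given above (0..27).
LABELS = [
    ("admiration", "восхищение"), ("amusement", "веселье"),
    ("anger", "злость"), ("annoyance", "раздражение"),
    ("approval", "одобрение"), ("caring", "забота"),
    ("confusion", "непонимание"), ("curiosity", "любопытство"),
    ("desire", "желание"), ("disappointment", "разочарование"),
    ("disapproval", "неодобрение"), ("disgust", "отвращение"),
    ("embarrassment", "смущение"), ("excitement", "возбуждение"),
    ("fear", "страх"), ("gratitude", "признательность"),
    ("grief", "горе"), ("joy", "радость"),
    ("love", "любовь"), ("nervousness", "нервозность"),
    ("optimism", "оптимизм"), ("pride", "гордость"),
    ("realization", "осознание"), ("relief", "облегчение"),
    ("remorse", "раскаяние"), ("sadness", "грусть"),
    ("surprise", "удивление"), ("neutral", "нейтральность"),
]


def decode(multi_hot, russian=False):
    """Return the names of the labels whose entry in `multi_hot` is non-zero."""
    if len(multi_hot) != len(LABELS):
        raise ValueError(f"expected {len(LABELS)} entries, got {len(multi_hot)}")
    column = 1 if russian else 0
    return [LABELS[i][column] for i, flag in enumerate(multi_hot) if flag]


# Example: a text tagged with both "gratitude" (15) and "love" (18).
vector = [0] * len(LABELS)
vector[15] = vector[18] = 1
print(decode(vector))                # ['gratitude', 'love']
print(decode(vector, russian=True))  # ['признательность', 'любовь']
```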
## Usage
```python
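from transformers import pipeline

# The task (text classification) is inferred from the model on the Hub.
model = pipeline(model="seara/rubert-base-cased-ru-go-emotions")

# Input text: "Hi, I like you!". By default only the top label is returned.
model("Привет, ты мне нравишься!")
# [{'label': 'love', 'score': 0.5456761717796326}]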
```
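
Because the task is multi-label, a text may carry several emotions at once, so it can be useful to inspect every label's score rather than only the top one. The following is a sketch under stated assumptions, not the author's method: `select_labels` and `predict_emotions` are hypothetical helpers, the 0.5 threshold is an arbitrary choice, and the scores in the example are made up for illustration. It relies on the `top_k=None` and `function_to_apply="sigmoid"` arguments of the `transformers` text-classification pipeline; the exact return shape may differ between library versions.

```python
def select_labels(scores, threshold=0.5):
    """Keep labels whose score is at least `threshold`, highest score first."""
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


def predict_emotions(model, text, threshold=0.5):
    """Score all labels for one text with a text-classification pipeline.

    `model` is expected to be created as in the snippet above:
        model = pipeline(model="seara/rubert-base-cased-ru-go-emotions")
    """
    # top_k=None asks for every label; sigmoid scores each label independently.
    scores = model(text, top_k=None, function_to_apply="sigmoid")
    # Some library versions wrap the result for a single text in an extra list.
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    return select_labels(scores, threshold)


# Made-up scores, only to show the thresholding step (not real model output).
example = [
    {"label": "love", "score": 0.81},
    {"label": "joy", "score": 0.55},
    {"label": "neutral", "score": 0.07},
]
print(select_labels(example))  # ['love', 'joy']
```

A single global threshold is the simplest choice; per-label thresholds tuned on a validation split are a common alternative when recall on rare labels matters.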
## Dataset
This model was trained on [ru_go_emotions](https://huggingface.co/datasets/seara/ru_go_emotions), a Russian translation of the GoEmotions dataset.
An overview of the training data can be found on the [Hugging Face dataset card](https://huggingface.co/datasets/seara/ru_go_emotions) and in the
[GitHub repository](https://github.com/searayeah/ru-goemotions).
## Training
Training was done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with the following parameters:
```yaml
tokenizer.max_length: null
batch_size: 32
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 5
```
## Eval results (on test split)
| |precision|recall|f1-score|auc-roc|support|
|--------------|---------|------|--------|-------|-------|
|admiration |0.66 |0.66 |0.66 |0.93 |504 |
|amusement |0.79 |0.81 |0.8 |0.97 |264 |
|anger |0.53 |0.3 |0.39 |0.91 |198 |
|annoyance |0.0 |0.0 |0.0 |0.82 |320 |
|approval |0.62 |0.25 |0.36 |0.82 |351 |
|caring |0.69 |0.13 |0.22 |0.86 |135 |
|confusion |0.56 |0.18 |0.28 |0.92 |153 |
|curiosity |0.52 |0.4 |0.45 |0.95 |284 |
|desire |0.67 |0.24 |0.35 |0.89 |83 |
|disappointment|0.88 |0.05 |0.09 |0.82 |151 |
|disapproval |0.56 |0.17 |0.26 |0.88 |267 |
|disgust |0.83 |0.2 |0.33 |0.92 |123 |
|embarrassment |0.0 |0.0 |0.0 |0.88 |37 |
|excitement |0.78 |0.14 |0.23 |0.9 |103 |
|fear |0.83 |0.37 |0.51 |0.92 |78 |
|gratitude |0.94 |0.9 |0.92 |0.99 |352 |
|grief |0.0 |0.0 |0.0 |0.72 |6 |
|joy |0.7 |0.4 |0.51 |0.94 |161 |
|love |0.77 |0.81 |0.79 |0.97 |238 |
|nervousness |0.0 |0.0 |0.0 |0.85 |23 |
|optimism |0.66 |0.52 |0.58 |0.92 |186 |
|pride |0.0 |0.0 |0.0 |0.76 |16 |
|realization |0.0 |0.0 |0.0 |0.74 |145 |
|relief |0.0 |0.0 |0.0 |0.72 |11 |
|remorse |0.58 |0.68 |0.63 |0.99 |56 |
|sadness |0.58 |0.44 |0.5 |0.92 |156 |
|surprise |0.62 |0.45 |0.52 |0.91 |141 |
|neutral |0.72 |0.47 |0.57 |0.84 |1787 |
|micro avg |0.7 |0.42 |0.53 |0.94 |6329 |
|macro avg |0.52 |0.31 |0.36 |0.88 |6329 |
|weighted avg  |0.63     |0.42  |0.49    |0.88   |6329   |