metadata
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
- emotion-detection
datasets:
- seara/ru_go_emotions
This is a RuBERT model fine-tuned for emotion classification of short Russian texts. The task is multi-label classification with the following labels:
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
Label to Russian label:
admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
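To show predictions in Russian, the two lists above can be combined into lookup tables. This is a minimal sketch: the names `LABELS`, `RU_LABELS`, `ID2LABEL`, and `to_russian` are illustrative and not part of the model's API.

```python
# Label names in index order, as listed above (0 = admiration, 27 = neutral).
LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]

# English label -> Russian label, copied from the mapping above.
RU_LABELS = {
    "admiration": "восхищение", "amusement": "веселье", "anger": "злость",
    "annoyance": "раздражение", "approval": "одобрение", "caring": "забота",
    "confusion": "непонимание", "curiosity": "любопытство",
    "desire": "желание", "disappointment": "разочарование",
    "disapproval": "неодобрение", "disgust": "отвращение",
    "embarrassment": "смущение", "excitement": "возбуждение",
    "fear": "страх", "gratitude": "признательность", "grief": "горе",
    "joy": "радость", "love": "любовь", "nervousness": "нервозность",
    "optimism": "оптимизм", "pride": "гордость", "realization": "осознание",
    "relief": "облегчение", "remorse": "раскаяние", "sadness": "грусть",
    "surprise": "удивление", "neutral": "нейтральность",
}

# Index -> English label, built from the ordered list.
ID2LABEL = dict(enumerate(LABELS))


def to_russian(label: str) -> str:
    """Return the Russian name for an English label (hypothetical helper)."""
    return RU_LABELS[label]
```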
Usage
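from transformers import pipeline

# Load the fine-tuned model as a text-classification pipeline
model = pipeline("text-classification", model="seara/rubert-base-cased-ru-go-emotions")
# Input means "Hi, I like you!"; by default only the top-scoring label is returned
model("Привет, ты мне нравишься!")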
# [{'label': 'love', 'score': 0.5456761717796326}]
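Because the task is multi-label, a text can carry several emotions at once, while the call above returns only the top label. One option is to request all scores from the pipeline (`top_k=None` in recent `transformers` versions) and keep the labels above a cutoff. The sketch below covers only the filtering step; the helper `labels_above` and the 0.5 threshold are assumptions for illustration, not values taken from the training project.

```python
from typing import List


def labels_above(scores, threshold: float = 0.5) -> List[str]:
    """Return labels whose score is at least `threshold`, highest score first.

    `scores` is a list of {"label": ..., "score": ...} dicts, the shape the
    text-classification pipeline produces for one input text.
    """
    # Some transformers versions wrap the result for one text in an extra list.
    if scores and isinstance(scores[0], list):
        scores = scores[0]
    kept = [s for s in scores if s["score"] >= threshold]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return [s["label"] for s in kept]


# Intended use (downloads the model, so it is left commented out):
# from transformers import pipeline
# model = pipeline("text-classification",
#                  model="seara/rubert-base-cased-ru-go-emotions", top_k=None)
# labels_above(model("Привет, ты мне нравишься!"), threshold=0.5)
```

Whether the scores are independent per label depends on the model configuration. If they turn out to be normalized across labels, passing `function_to_apply="sigmoid"` to the pipeline call is worth trying.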
Dataset
This model was trained on ru_go_emotions, a Russian translation of the GoEmotions dataset.
An overview of the training data is available on the dataset's Hugging Face card and in the GitHub repository.
Training
Training was done in this project with the following parameters:
tokenizer.max_length: null
batch_size: 32
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 5
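The parameters above can be mirrored in a small typed config object, which keeps them in one place for a training script. This is a sketch only: the class `TrainConfig` and its helper methods are hypothetical, and `total_steps` assumes one optimizer step per batch with the final partial batch kept, which the source does not state.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TrainConfig:
    """Values copied from the parameter list above."""
    max_length: Optional[int] = None  # tokenizer.max_length: null
    batch_size: int = 32
    optimizer: str = "adam"
    lr: float = 0.00001
    weight_decay: float = 0.0
    num_epochs: int = 5

    def optimizer_kwargs(self) -> dict:
        """Keyword arguments for an Adam-style optimizer constructor."""
        return {"lr": self.lr, "weight_decay": self.weight_decay}

    def total_steps(self, num_examples: int) -> int:
        """Optimizer steps over the full run (assumes one step per batch)."""
        return math.ceil(num_examples / self.batch_size) * self.num_epochs


cfg = TrainConfig()
```

Assuming a PyTorch training loop, the kwargs could be passed as `torch.optim.Adam(model.parameters(), **cfg.optimizer_kwargs())`.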
Eval results (on test split)
label | precision | recall | f1-score | auc-roc | support |
---|---|---|---|---|---|
admiration | 0.66 | 0.66 | 0.66 | 0.93 | 504 |
amusement | 0.79 | 0.81 | 0.8 | 0.97 | 264 |
anger | 0.53 | 0.3 | 0.39 | 0.91 | 198 |
annoyance | 0.0 | 0.0 | 0.0 | 0.82 | 320 |
approval | 0.62 | 0.25 | 0.36 | 0.82 | 351 |
caring | 0.69 | 0.13 | 0.22 | 0.86 | 135 |
confusion | 0.56 | 0.18 | 0.28 | 0.92 | 153 |
curiosity | 0.52 | 0.4 | 0.45 | 0.95 | 284 |
desire | 0.67 | 0.24 | 0.35 | 0.89 | 83 |
disappointment | 0.88 | 0.05 | 0.09 | 0.82 | 151 |
disapproval | 0.56 | 0.17 | 0.26 | 0.88 | 267 |
disgust | 0.83 | 0.2 | 0.33 | 0.92 | 123 |
embarrassment | 0.0 | 0.0 | 0.0 | 0.88 | 37 |
excitement | 0.78 | 0.14 | 0.23 | 0.9 | 103 |
fear | 0.83 | 0.37 | 0.51 | 0.92 | 78 |
gratitude | 0.94 | 0.9 | 0.92 | 0.99 | 352 |
grief | 0.0 | 0.0 | 0.0 | 0.72 | 6 |
joy | 0.7 | 0.4 | 0.51 | 0.94 | 161 |
love | 0.77 | 0.81 | 0.79 | 0.97 | 238 |
nervousness | 0.0 | 0.0 | 0.0 | 0.85 | 23 |
optimism | 0.66 | 0.52 | 0.58 | 0.92 | 186 |
pride | 0.0 | 0.0 | 0.0 | 0.76 | 16 |
realization | 0.0 | 0.0 | 0.0 | 0.74 | 145 |
relief | 0.0 | 0.0 | 0.0 | 0.72 | 11 |
remorse | 0.58 | 0.68 | 0.63 | 0.99 | 56 |
sadness | 0.58 | 0.44 | 0.5 | 0.92 | 156 |
surprise | 0.62 | 0.45 | 0.52 | 0.91 | 141 |
neutral | 0.72 | 0.47 | 0.57 | 0.84 | 1787 |
micro avg | 0.7 | 0.42 | 0.53 | 0.94 | 6329 |
macro avg | 0.52 | 0.31 | 0.36 | 0.88 | 6329 |
weighted avg | 0.63 | 0.42 | 0.49 | 0.88 | 6329 |
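The macro and weighted averages in the last rows follow from the per-label rows: the macro average is the plain mean over the 28 labels, and the weighted average uses each label's support as its weight. The sketch below recomputes both F1 averages from the rounded values in the table, so it only approximates the original computation; the names `ROWS`, `macro_avg`, and `weighted_avg` are illustrative.

```python
# (f1-score, support) per label, copied from the table above in row order.
ROWS = [
    (0.66, 504), (0.80, 264), (0.39, 198), (0.00, 320), (0.36, 351),
    (0.22, 135), (0.28, 153), (0.45, 284), (0.35, 83), (0.09, 151),
    (0.26, 267), (0.33, 123), (0.00, 37), (0.23, 103), (0.51, 78),
    (0.92, 352), (0.00, 6), (0.51, 161), (0.79, 238), (0.00, 23),
    (0.58, 186), (0.00, 16), (0.00, 145), (0.00, 11), (0.63, 56),
    (0.50, 156), (0.52, 141), (0.57, 1787),
]


def macro_avg(rows):
    """Unweighted mean of the per-label scores."""
    return sum(score for score, _ in rows) / len(rows)


def weighted_avg(rows):
    """Mean of the per-label scores, weighted by support."""
    total = sum(support for _, support in rows)
    return sum(score * support for score, support in rows) / total


print(round(macro_avg(ROWS), 2))     # 0.36, matches the macro avg row
print(round(weighted_avg(ROWS), 2))  # 0.49, matches the weighted avg row
```

The seven labels with an F1 of 0.0 count fully in the macro average, which is why it sits well below the micro average of 0.53.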