---
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- tiny
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
- emotion-detection
datasets:
- cedr
---

This is a [RuBERT-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for __emotion classification__ of short __Russian__ texts.
The task is __multi-label classification__ with the following labels:

```yaml
0: no_emotion
1: joy
2: sadness
3: surprise
4: fear
5: anger
```

Mapping from English labels to their Russian equivalents:

```yaml
no_emotion: нет эмоции
joy: радость
sadness: грусть
surprise: удивление
fear: страх
anger: злость
```
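
The two listings above can be combined into lookup tables for decoding a multi-hot prediction vector. This is a minimal sketch: the names `ID2LABEL`, `LABEL2RU`, and `decode` are hypothetical helpers introduced for illustration and are not part of the model or its repository.

```python
# Label tables copied from the two listings above.
ID2LABEL = {
    0: "no_emotion",
    1: "joy",
    2: "sadness",
    3: "surprise",
    4: "fear",
    5: "anger",
}
LABEL2RU = {
    "no_emotion": "нет эмоции",
    "joy": "радость",
    "sadness": "грусть",
    "surprise": "удивление",
    "fear": "страх",
    "anger": "злость",
}


def decode(multi_hot):
    """Turn a 0/1 vector with one entry per label into (English, Russian) pairs."""
    if len(multi_hot) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} entries, got {len(multi_hot)}")
    return [
        (ID2LABEL[i], LABEL2RU[ID2LABEL[i]])
        for i, flag in enumerate(multi_hot)
        if flag
    ]


# A text tagged with both joy (index 1) and surprise (index 3).
print(decode([0, 1, 0, 1, 0, 0]))
```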

## Usage

```python
from transformers import pipeline
model = pipeline(model="seara/rubert-tiny2-cedr-russian-emotion")
model("Привет, ты мне нравишься!")
# [{'label': 'joy', 'score': 0.9605025053024292}]
```
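
The call above returns only the highest-scoring label. Because the task is multi-label, it may be useful to look at every label's score and keep those above a cutoff. The sketch below assumes the standard `top_k=None` and `function_to_apply="sigmoid"` arguments of the `transformers` text-classification pipeline. The helper names `select_labels` and `classify_emotions` and the `0.5` threshold are assumptions made for illustration, not values taken from the training project.

```python
from typing import Dict, List


def select_labels(scores: List[Dict], threshold: float = 0.5) -> List[str]:
    """Keep every label whose score reaches the threshold (0.5 is an assumed default)."""
    return [item["label"] for item in scores if item["score"] >= threshold]


def classify_emotions(text: str, threshold: float = 0.5) -> List[str]:
    """Score all six labels for one text and return those above the threshold."""
    # Imported here so that select_labels can be used without transformers installed.
    from transformers import pipeline

    clf = pipeline(
        "text-classification",
        model="seara/rubert-tiny2-cedr-russian-emotion",
    )
    # top_k=None asks for every label; sigmoid scores each label independently.
    # Passing a list yields one list of {label, score} dicts per input text.
    all_scores = clf([text], top_k=None, function_to_apply="sigmoid")[0]
    return select_labels(all_scores, threshold)


if __name__ == "__main__":
    print(classify_emotions("Привет, ты мне нравишься!"))
```

Lowering the threshold trades precision for recall, so the cutoff is worth tuning on held-out data rather than fixing at `0.5`.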

## Dataset

This model was trained on the [CEDR dataset](https://huggingface.co/datasets/cedr).

An overview of the training data can be found in its [Hugging Face card](https://huggingface.co/datasets/cedr)
or in the source [article](https://www.sciencedirect.com/science/article/pii/S1877050921013247).

## Training

Training was done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with the following parameters:

```yaml
tokenizer.max_length: null
batch_size: 64
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 30
```

## Eval results (on test split)

|         |no_emotion|joy   |sadness|surprise|fear   |anger|micro avg|macro avg|weighted avg|
|---------|----------|------|-------|--------|-------|-----|---------|---------|------------|
|precision|0.82      |0.84  |0.84   |0.79    |0.78   |0.55 |0.81     |0.77     |0.80        |
|recall   |0.84      |0.83  |0.85   |0.66    |0.67   |0.33 |0.78     |0.70     |0.78        |
|f1-score |0.83      |0.83  |0.84   |0.72    |0.72   |0.41 |0.79     |0.73     |0.79        |
|auc-roc  |0.92      |0.96  |0.96   |0.91    |0.91   |0.77 |0.94     |0.91     |0.93        |
|support  |734       |353   |379    |170     |141    |125  |1902     |1902     |1902        |
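
The macro and weighted averages in the table can be approximately reproduced from the per-class columns. The sketch below recomputes them for precision and recall using the rounded values shown above, so small differences from the reported figures are possible. The function names `macro_avg` and `weighted_avg` are introduced here for illustration.

```python
# Per-class values copied from the table above, in the order:
# no_emotion, joy, sadness, surprise, fear, anger.
precision = [0.82, 0.84, 0.84, 0.79, 0.78, 0.55]
recall = [0.84, 0.83, 0.85, 0.66, 0.67, 0.33]
support = [734, 353, 379, 170, 141, 125]


def macro_avg(values):
    """Unweighted mean: every class counts equally, regardless of size."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean weighted by class support: larger classes count more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


print(f"total support:      {sum(support)}")
print(f"macro precision:    {macro_avg(precision):.2f}")
print(f"macro recall:       {macro_avg(recall):.2f}")
print(f"weighted precision: {weighted_avg(precision, support):.2f}")
print(f"weighted recall:    {weighted_avg(recall, support):.2f}")
```

The macro averages sit below the weighted ones because `anger`, the smallest class (support 125), has the lowest scores and counts as much as `no_emotion` (support 734) in the unweighted mean.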