cointegrated committed on
Commit
9f2ece2
1 Parent(s): d16047f

Create README.md

Files changed (1)
  1. README.md +29 -0
README.md ADDED
@@ -0,0 +1,29 @@
+ ---
+ language: ["ru"]
+ tags:
+ - russian
+ - classification
+ - sentiment
+ - multiclass
+ datasets:
+ - cedr
+ widget:
+ - text: "Бесишь меня, падла"
+ - text: "Как здорово, что все мы здесь сегодня собрались"
+ - text: "Как-то стрёмно, давай уйдём отсюда?"
+ - text: "Грусть-тоска меня съедает"
+ - text: "Данный фрагмент текста не содержит абсолютно никаких эмоций"
+ - text: "Надо же, неужели так тоже бывает!"
+
+ ---
+ This is the [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned to classify emotions in Russian sentences. The task is multilabel classification, because one sentence can express several emotions.
+
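The multilabel setup described above means each emotion gets its own independent probability, so a sentence can receive zero, one, or several labels. The snippet below is a minimal sketch of that decoding step, not the author's code: the label order is assumed from the columns of the AUC table, the `0.5` threshold is an arbitrary illustration, and the logits are made up rather than produced by the model.

```python
import math

# Assumed label order, taken from the AUC table columns; the real model config may differ.
LABELS = ["no emotion", "joy", "sadness", "surprise", "fear", "anger"]


def sigmoid(x: float) -> float:
    """Map a single logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def decode_multilabel(logits, threshold=0.5):
    """Return every label whose independent sigmoid probability reaches the threshold."""
    probs = [sigmoid(z) for z in logits]
    return [label for label, p in zip(LABELS, probs) if p >= threshold]


# Hypothetical logits for one sentence (invented for illustration, not model output).
example_logits = [-3.0, 2.0, -1.5, 0.7, -4.0, -2.2]
print(decode_multilabel(example_logits))  # ['joy', 'surprise']
```

A multiclass classifier would instead apply a softmax and return exactly one label, which is why the per-label sigmoid matters here.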
+ The model was fine-tuned on the [CEDR dataset](https://huggingface.co/datasets/cedr) described in the paper ["Data-Driven Model for Emotion Detection in Russian Texts"](https://doi.org/10.1016/j.procs.2021.06.075) by Sboev et al.
+
+ The model was trained with the Adam optimizer for 40 epochs with a learning rate of `1e-5` and a batch size of 64 [in this notebook](https://colab.research.google.com/drive/1AFW70EJaBn7KZKRClDIdDUpbD46cEsat?usp=sharing).
+
+ ROC AUC of the predicted probabilities on the test set, per label:
+
+ | label | no emotion | joy | sadness | surprise | fear | anger | mean |
+ |-------|------------|--------|---------|----------|--------|--------|--------|
+ | AUC | 0.9406 | 0.9518 | 0.9372 | 0.8634 | 0.9663 | 0.6761 | 0.8892 |
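For readers who want to see how a per-label ROC AUC and the mean column relate, here is a small self-contained sketch. It is an illustration under stated assumptions, not the evaluation code from the notebook: `roc_auc` is a hypothetical helper that uses the pairwise-ranking definition of AUC, and the toy labels and scores are invented. The per-label values are copied from the table, and their unweighted average matches the reported mean.

```python
def roc_auc(y_true, y_score):
    """ROC AUC as the share of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy binary labels and scores for a single emotion (invented for illustration).
y_true = [1, 0, 1, 0]
y_score = [0.9, 0.8, 0.4, 0.1]
toy_auc = roc_auc(y_true, y_score)  # 3 of 4 positive/negative pairs are ordered correctly
print(toy_auc)  # 0.75

# Per-label values copied from the table above.
table_auc = {
    "no emotion": 0.9406,
    "joy": 0.9518,
    "sadness": 0.9372,
    "surprise": 0.8634,
    "fear": 0.9663,
    "anger": 0.6761,
}
macro_mean = sum(table_auc.values()) / len(table_auc)
print(round(macro_mean, 4))  # 0.8892
```

In a multilabel setting the helper would be applied once per emotion column, treating that emotion as the positive class, before averaging.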