mikhmanoff committed on
Commit
f17d028
1 Parent(s): bb98109

Update README.md

Files changed (1)
  1. README.md +1 -42
README.md CHANGED
@@ -27,7 +27,6 @@ datasets:
  - KaggleRussianNews
  ---
 
- This is [RuBERT-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) model fine-tuned for __sentiment classification__ of short __Russian__ texts.
  The task is a __multi-class classification__ with the following labels:
 
  ```yaml
@@ -51,44 +50,4 @@ from transformers import pipeline
  model = pipeline(model="seara/rubert-tiny2-russian-sentiment")
  model("Привет, ты мне нравишься!")
  # [{'label': 'positive', 'score': 0.9398769736289978}]
- ```
-
- ## Dataset
-
- This model was trained on the union of the following datasets:
-
- - Kaggle Russian News Dataset
- - Linis Crowd 2015
- - Linis Crowd 2016
- - RuReviews
- - RuSentiment
-
- An overview of the training data can be found on [S. Smetanin Github repository](https://github.com/sismetanin/sentiment-analysis-in-russian).
-
- __Download links for all Russian sentiment datasets collected by Smetanin can be found in this [repository](https://github.com/searayeah/russian-sentiment-emotion-datasets).__
-
- ## Training
-
- Training were done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with this parameters:
-
- ```yaml
- tokenizer.max_length: 512
- batch_size: 64
- optimizer: adam
- lr: 0.00001
- weight_decay: 0
- epochs: 5
- ```
-
- Train/validation/test splits are 80%/10%/10%.
-
- ## Eval results (on test split)
-
-
- | |neutral|positive|negative|macro avg|weighted avg|
- |---------|-------|--------|--------|---------|------------|
- |precision|0.7 |0.84 |0.74 |0.76 |0.75 |
- |recall |0.74 |0.83 |0.69 |0.75 |0.75 |
- |f1-score |0.72 |0.83 |0.71 |0.75 |0.75 |
- |auc-roc |0.85 |0.95 |0.91 |0.9 |0.9 |
- |support |5196 |3831 |3599 |12626 |12626 |
 
+ ```
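
Review note: the eval table removed by this commit can be sanity-checked from its own per-class numbers. The sketch below is a minimal check, assuming the `macro avg` and `weighted avg` columns follow the usual convention (unweighted mean over classes, and mean weighted by class support); the README does not state how they were computed, so that convention is an assumption. The helper names `macro_avg` and `weighted_avg` are hypothetical, and the inputs are the rounded values copied from the removed table, so agreement is only expected to two decimals.

```python
# Per-class values copied from the removed "Eval results (on test split)" table.
support = {"neutral": 5196, "positive": 3831, "negative": 3599}
metrics = {
    "precision": {"neutral": 0.70, "positive": 0.84, "negative": 0.74},
    "recall": {"neutral": 0.74, "positive": 0.83, "negative": 0.69},
    "f1-score": {"neutral": 0.72, "positive": 0.83, "negative": 0.71},
    "auc-roc": {"neutral": 0.85, "positive": 0.95, "negative": 0.91},
}


def macro_avg(per_class):
    """Unweighted mean over classes (assumed meaning of 'macro avg')."""
    return sum(per_class.values()) / len(per_class)


def weighted_avg(per_class, class_support):
    """Mean weighted by class support (assumed meaning of 'weighted avg')."""
    total = sum(class_support.values())
    return sum(per_class[label] * class_support[label] for label in per_class) / total


total_support = sum(support.values())
print("total support:", total_support)
for name, per_class in metrics.items():
    print(
        name,
        "macro:", round(macro_avg(per_class), 2),
        "weighted:", round(weighted_avg(per_class, support), 2),
    )
```

Under these assumptions the recomputed averages agree with the removed table to two decimals, and the class supports sum to the reported total of 12626.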