---
license: mit
language:
- ru
metrics:
- f1
- roc_auc
- precision
- recall
pipeline_tag: text-classification
tags:
- sentiment-analysis
- multi-label-classification
- sentiment analysis
- rubert
- sentiment
- bert
- russian
- multilabel
- classification
- emotion-classification
- emotion-recognition
- emotion
- emotion-detection
datasets:
- seara/ru_go_emotions
---

This is a [RuBERT](https://huggingface.co/DeepPavlov/rubert-base-cased) model fine-tuned for __emotion classification__ of short __Russian__ texts.
The task is __multi-label classification__ with the following labels:

```yaml
0: admiration
1: amusement
2: anger
3: annoyance
4: approval
5: caring
6: confusion
7: curiosity
8: desire
9: disappointment
10: disapproval
11: disgust
12: embarrassment
13: excitement
14: fear
15: gratitude
16: grief
17: joy
18: love
19: nervousness
20: optimism
21: pride
22: realization
23: relief
24: remorse
25: sadness
26: surprise
27: neutral
```

Label to Russian label:

```yaml
admiration: восхищение
amusement: веселье
anger: злость
annoyance: раздражение
approval: одобрение
caring: забота
confusion: непонимание
curiosity: любопытство
desire: желание
disappointment: разочарование
disapproval: неодобрение
disgust: отвращение
embarrassment: смущение
excitement: возбуждение
fear: страх
gratitude: признательность
grief: горе
joy: радость
love: любовь
nervousness: нервозность
optimism: оптимизм
pride: гордость
realization: осознание
relief: облегчение
remorse: раскаяние
sadness: грусть
surprise: удивление
neutral: нейтральность
```

## Usage

```python
from transformers import pipeline

model = pipeline(model="seara/rubert-base-cased-ru-go-emotions")
# Input text means "Hi, I like you!"
model("Привет, ты мне нравишься!")
# [{'label': 'love', 'score': 0.5456761717796326}]
```

## Dataset

This model was trained on [ru_go_emotions](https://huggingface.co/datasets/seara/ru_go_emotions), a translated version of the GoEmotions dataset.
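
Because the task is multi-label, one text can carry several of the 28 labels at once, so training targets are usually 0/1 vectors and not single class ids. The sketch below illustrates that encoding. It assumes each example stores its annotations as a list of integer label ids (an assumption about the record layout, so check the dataset card); `to_multi_hot` and the sample annotation are hypothetical and not taken from the training project.

```python
# Label names in index order, copied from the label list in this card.
LABELS = [
    "admiration", "amusement", "anger", "annoyance", "approval", "caring",
    "confusion", "curiosity", "desire", "disappointment", "disapproval",
    "disgust", "embarrassment", "excitement", "fear", "gratitude", "grief",
    "joy", "love", "nervousness", "optimism", "pride", "realization",
    "relief", "remorse", "sadness", "surprise", "neutral",
]


def to_multi_hot(label_ids, num_labels=len(LABELS)):
    """Turn a list of label ids into a 0/1 target vector (hypothetical helper)."""
    vector = [0.0] * num_labels
    for idx in label_ids:
        vector[idx] = 1.0
    return vector


# Hypothetical annotation: one text tagged with both "joy" and "love".
example_ids = [LABELS.index("joy"), LABELS.index("love")]
target = to_multi_hot(example_ids)

# Read the active labels back out of the vector.
active = [LABELS[i] for i, value in enumerate(target) if value == 1.0]
print(active)  # ['joy', 'love']
```
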
An overview of the training data can be found on the [Hugging Face dataset card](https://huggingface.co/datasets/seara/ru_go_emotions) and in the [GitHub repository](https://github.com/searayeah/ru-goemotions).

## Training

Training was done in this [project](https://github.com/searayeah/bert-russian-sentiment-emotion) with the following parameters:

```yaml
tokenizer.max_length: null
batch_size: 32
optimizer: adam
lr: 0.00001
weight_decay: 0
num_epochs: 5
```

## Eval results (on test split)

|              |precision|recall|f1-score|auc-roc|support|
|--------------|---------|------|--------|-------|-------|
|admiration    |0.66     |0.66  |0.66    |0.93   |504    |
|amusement     |0.79     |0.81  |0.8     |0.97   |264    |
|anger         |0.53     |0.3   |0.39    |0.91   |198    |
|annoyance     |0.0      |0.0   |0.0     |0.82   |320    |
|approval      |0.62     |0.25  |0.36    |0.82   |351    |
|caring        |0.69     |0.13  |0.22    |0.86   |135    |
|confusion     |0.56     |0.18  |0.28    |0.92   |153    |
|curiosity     |0.52     |0.4   |0.45    |0.95   |284    |
|desire        |0.67     |0.24  |0.35    |0.89   |83     |
|disappointment|0.88     |0.05  |0.09    |0.82   |151    |
|disapproval   |0.56     |0.17  |0.26    |0.88   |267    |
|disgust       |0.83     |0.2   |0.33    |0.92   |123    |
|embarrassment |0.0      |0.0   |0.0     |0.88   |37     |
|excitement    |0.78     |0.14  |0.23    |0.9    |103    |
|fear          |0.83     |0.37  |0.51    |0.92   |78     |
|gratitude     |0.94     |0.9   |0.92    |0.99   |352    |
|grief         |0.0      |0.0   |0.0     |0.72   |6      |
|joy           |0.7      |0.4   |0.51    |0.94   |161    |
|love          |0.77     |0.81  |0.79    |0.97   |238    |
|nervousness   |0.0      |0.0   |0.0     |0.85   |23     |
|optimism      |0.66     |0.52  |0.58    |0.92   |186    |
|pride         |0.0      |0.0   |0.0     |0.76   |16     |
|realization   |0.0      |0.0   |0.0     |0.74   |145    |
|relief        |0.0      |0.0   |0.0     |0.72   |11     |
|remorse       |0.58     |0.68  |0.63    |0.99   |56     |
|sadness       |0.58     |0.44  |0.5     |0.92   |156    |
|surprise      |0.62     |0.45  |0.52    |0.91   |141    |
|neutral       |0.72     |0.47  |0.57    |0.84   |1787   |
|micro avg     |0.7      |0.42  |0.53    |0.94   |6329   |
|macro avg     |0.52     |0.31  |0.36    |0.88   |6329   |
|weighted avg  |0.63     |0.42  |0.49    |0.88   |6329   |
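
To make the summary rows easier to read, here is a minimal sketch that recomputes the macro and weighted F1 averages from the per-label rows of the table. It assumes the averages follow the usual definitions (unweighted mean over labels, and support-weighted mean), which the card does not state explicitly. The micro average is left out because it is computed from pooled prediction counts that the table does not list.

```python
# (f1-score, support) per label, copied from the eval table in this card.
PER_LABEL = {
    "admiration": (0.66, 504), "amusement": (0.8, 264), "anger": (0.39, 198),
    "annoyance": (0.0, 320), "approval": (0.36, 351), "caring": (0.22, 135),
    "confusion": (0.28, 153), "curiosity": (0.45, 284), "desire": (0.35, 83),
    "disappointment": (0.09, 151), "disapproval": (0.26, 267),
    "disgust": (0.33, 123), "embarrassment": (0.0, 37),
    "excitement": (0.23, 103), "fear": (0.51, 78), "gratitude": (0.92, 352),
    "grief": (0.0, 6), "joy": (0.51, 161), "love": (0.79, 238),
    "nervousness": (0.0, 23), "optimism": (0.58, 186), "pride": (0.0, 16),
    "realization": (0.0, 145), "relief": (0.0, 11), "remorse": (0.63, 56),
    "sadness": (0.5, 156), "surprise": (0.52, 141), "neutral": (0.57, 1787),
}

f1_scores = [f1 for f1, _ in PER_LABEL.values()]
total_support = sum(support for _, support in PER_LABEL.values())

# Macro average: every label counts equally, however rare it is.
macro_f1 = sum(f1_scores) / len(f1_scores)

# Weighted average: each label's F1 is weighted by its support.
weighted_f1 = sum(f1 * n for f1, n in PER_LABEL.values()) / total_support

# Labels the model never predicts correctly on the test split.
zero_f1_labels = [name for name, (f1, _) in PER_LABEL.items() if f1 == 0.0]

print(f"macro F1:    {macro_f1:.2f}")     # 0.36
print(f"weighted F1: {weighted_f1:.2f}")  # 0.49
print(zero_f1_labels)
```

Seven labels have an F1 of 0.0 in the table, which pulls the macro average well below the weighted one, since `neutral` alone accounts for 1787 of the 6329 support examples.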