---
language:
  - ru
license: apache-2.0
tags:
  - sentiment
  - emotion-classification
  - multilabel
  - multiclass
datasets:
  - Djacon/ru_goemotions
metrics:
  - accuracy
widget:
  - text: Очень рад тебя видеть!
  - text: Как дела?
  - text: Мне немного отвратно это делать
  - text: Я испытал мурашки от страха
  - text: Нет ничего радостного в этих горьких новостях
  - text: Ого, не ожидал тебя здесь увидеть!
  - text: Фу ну и мерзость
  - text: Мне неприятно общение с тобой
base_model: ai-forever/ruBert-base
model-index:
  - name: ruBert-base-russian-emotions-classifier-goEmotions
    results:
      - task:
          type: multilabel-text-classification
          name: Multilabel Text Classification
        dataset:
          name: ru_goemotions
          type: Djacon/ru_goemotions
          args: ru
        metrics:
          - type: roc_auc
            value: 0.924
            name: multilabel ROC AUC
---

# ruBert-base-russian-emotions-classifier-goEmotions

This model is a fine-tuned version of ai-forever/ruBert-base on the Djacon/ru_goemotions dataset. It achieves the following results on the evaluation set (2nd epoch); a usage sketch follows the results:

- Loss: 0.2088
- AUC: 0.9240
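A minimal inference sketch, assuming the model is hosted as MaxKazak/ruBert-base-russian-emotions-classifier-goEmotions (author handle plus the model name from this card; adjust if it differs). Because the head is multilabel, probabilities come from a per-label sigmoid rather than a softmax:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed repo id: author handle + model name from this card; adjust if different.
MODEL_ID = "MaxKazak/ruBert-base-russian-emotions-classifier-goEmotions"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_ID)
model.eval()

text = "Очень рад тебя видеть!"  # "Very glad to see you!"
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Multilabel classification: independent sigmoid per label, not a softmax.
probs = torch.sigmoid(logits).squeeze(0)
for idx, p in enumerate(probs.tolist()):
    print(f"{model.config.id2label[idx]}: {p:.3f}")
```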

Per-label quality of the predicted probabilities on the test set:

| label    | joy    | interest | surprise | sadness | anger  | disgust | fear   | guilt  | neutral | average |
|----------|--------|----------|----------|---------|--------|---------|--------|--------|---------|---------|
| AUC      | 0.9369 | 0.9213   | 0.9325   | 0.8791  | 0.8374 | 0.9041  | 0.9470 | 0.9758 | 0.8518  | 0.9095  |
| F1-micro | 0.9528 | 0.9157   | 0.9697   | 0.9284  | 0.8690 | 0.9658  | 0.9851 | 0.9875 | 0.7654  | 0.9266  |
| F1-macro | 0.8369 | 0.7922   | 0.7561   | 0.7392  | 0.7351 | 0.7356  | 0.8176 | 0.8247 | 0.7650  | 0.7781  |
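For reference, a sketch of how such a per-label report can be computed with scikit-learn. This is not the author's evaluation script, and the 0.5 decision threshold for the F1 scores is an assumption:

```python
import numpy as np
from sklearn.metrics import f1_score, roc_auc_score

LABELS = ["joy", "interest", "surprise", "sadness", "anger",
          "disgust", "fear", "guilt", "neutral"]

def per_label_report(y_true: np.ndarray, y_prob: np.ndarray, threshold: float = 0.5) -> None:
    """Print AUC / F1-micro / F1-macro per label.

    y_true, y_prob: arrays of shape (n_samples, n_labels) holding binary
    targets and predicted probabilities, in LABELS column order.
    """
    y_pred = (y_prob >= threshold).astype(int)  # assumed threshold
    for i, label in enumerate(LABELS):
        auc = roc_auc_score(y_true[:, i], y_prob[:, i])
        f1_mi = f1_score(y_true[:, i], y_pred[:, i], average="micro")
        f1_ma = f1_score(y_true[:, i], y_pred[:, i], average="macro")
        print(f"{label:8s}  AUC={auc:.4f}  F1-micro={f1_mi:.4f}  F1-macro={f1_ma:.4f}")
```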

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training; a matching TrainingArguments sketch follows the list:

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 3
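A sketch of these settings expressed as a transformers TrainingArguments. The output_dir and per-epoch evaluation strategy are assumptions (the results table below logs metrics once per epoch); the Adam betas and epsilon listed above are the library defaults:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ruBert-base-russian-emotions-classifier-goEmotions",  # assumed
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    evaluation_strategy="epoch",  # assumed from the per-epoch results table
)
```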

### Training results

| Training Loss | Epoch | Step | Validation Loss | AUC    |
|---------------|-------|------|-----------------|--------|
| 0.1755        | 1.0   | 1685 | 0.1717          | 0.9220 |
| 0.1391        | 2.0   | 3370 | 0.1757          | 0.9240 |
| 0.0899        | 3.0   | 5055 | 0.2088          | 0.9106 |

### Framework versions

- Transformers 4.24.0
- Pytorch 2.0.1
- Datasets 2.12.0
- Tokenizers 0.11.0