
Text Classification GoEmotions

This model is a fine-tuned version of roberta-large on the go_emotions dataset.
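For reference, the go_emotions dataset itself can be loaded with the datasets library. A minimal sketch, assuming the standard "simplified" configuration (27 emotion labels plus neutral, 28 in total):

from datasets import load_dataset

# "simplified" is the usual fine-tuning config with 28 labels.
dataset = load_dataset("go_emotions", "simplified")
print(dataset["train"][0])  # {'text': ..., 'labels': [...], 'id': ...}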

Model description

First, four epochs of training with a learning rate of 5e-5 were performed on the roberta-large model. The weights were then loaded in a new environment and a fifth epoch was trained, this time with a learning rate of 2e-5. Since performance decreased in the fifth epoch, further training was discontinued.

After the fourth epoch, the model achieved a macro-F1 score of roughly 53% on the test set; the fifth epoch reduced this. The checkpoint at commit "5b532728cef22ca9e9bacc8ff9f5687654d36bf3" attains the following scores on the test set:

  • Accuracy: 0.4271236410539893
  • Precision: 0.5101494353184485
  • Recall: 0.5763722014150806
  • macro-F1: 0.5297380709491947

Load this specific version of the model using the syntax below:

import os

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Avoid the tokenizers fork-parallelism warning.
os.environ["TOKENIZERS_PARALLELISM"] = "false"

model_name = "tasinhoque/text-classification-goemotions"
commit = "5b532728cef22ca9e9bacc8ff9f5687654d36bf3"
n_emotions = 28  # go_emotions has 27 emotion labels plus neutral

tokenizer = AutoTokenizer.from_pretrained(model_name, revision=commit)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=n_emotions,
    problem_type="multi_label_classification",
    revision=commit,
)
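With the model loaded, a minimal inference sketch follows. This is a multi-label model, so a sigmoid is applied per label; the 0.5 decision threshold and the example sentence are assumptions, not part of the original card:

import torch

text = "I'm so happy you could make it!"  # hypothetical example input
inputs = tokenizer(text, return_tensors="pt", truncation=True)

with torch.no_grad():
    logits = model(**inputs).logits

# Independent sigmoid per label; the 0.5 cutoff is illustrative.
probs = torch.sigmoid(logits).squeeze(0)
predicted_ids = (probs > 0.5).nonzero(as_tuple=True)[0].tolist()
print([model.config.id2label[i] for i in predicted_ids])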

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a replication sketch follows the list):

  • learning_rate: 5e-05 (2e-05 in the 5th epoch)
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42 (only in the 5th epoch)
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 5
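
As a rough guide, these hyperparameters map onto transformers.TrainingArguments as sketched below. This is a reconstruction, not the author's actual script; output_dir, train_ds, and eval_ds are hypothetical placeholders:

from transformers import Trainer, TrainingArguments

training_args = TrainingArguments(
    output_dir="goemotions-roberta-large",  # hypothetical
    learning_rate=5e-5,                     # 2e-5 was used for the extra 5th epoch
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    num_train_epochs=4,
    lr_scheduler_type="linear",
    seed=42,                                # reportedly set only in the 5th epoch
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_ds,  # hypothetical tokenized split
    eval_dataset=eval_ds,    # hypothetical tokenized split
)
# trainer.train()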

Training results

Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1
------------- | ----- | ---- | --------------- | -------- | --------- | ------ | ------
No log        | 1.0   | 340  | 0.0884          | 0.3782   | 0.4798    | 0.4643 | 0.4499
0.1042        | 2.0   | 680  | 0.0829          | 0.4093   | 0.4766    | 0.5272 | 0.4879
0.1042        | 3.0   | 1020 | 0.0821          | 0.4202   | 0.5103    | 0.5531 | 0.5092
0.0686        | 4.0   | 1360 | 0.0830          | 0.4327   | 0.5160    | 0.5556 | 0.5226
No log        | 5.0   | 1700 | 0.0961          | 0.4521   | 0.5190    | 0.5359 | 0.5218
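
The metric columns above are consistent with macro-averaged multi-label scores. A hedged sketch of a compute_metrics function that would produce them, assuming a per-label sigmoid with a 0.5 threshold:

import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    probs = 1 / (1 + np.exp(-logits))  # sigmoid per label
    preds = (probs > 0.5).astype(int)  # assumed 0.5 cutoff
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    # accuracy_score on multi-label indicators is exact-match (subset) accuracy
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }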

Framework versions

  • Transformers 4.20.1
  • Pytorch 1.12.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1