---
license: mit
tags:
- generated_from_trainer
datasets:
- go_emotions
metrics:
- f1
model-index:
- name: text-classification-goemotions
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: go_emotions
      type: multilabel_classification
      config: simplified
      split: test
      args: simplified
    metrics:
    - name: F1
      type: f1
      value: 0.5072
---
# Text Classification GoEmotions
This model is a fine-tuned version of [roberta-large](https://huggingface.co/roberta-large) on the [go_emotions](https://huggingface.co/datasets/go_emotions) dataset.
## Model description
The `roberta-large` model was first trained for 4 epochs with a learning rate of 5e-5.
The weights were then loaded in a new environment and trained for one more epoch, this time with a learning rate of 2e-5.
After the 4th epoch, the model reached a macro-F1 score of about 53% on the test set.
The fifth epoch lowered performance, so further training was discontinued.
The model at commit `5b532728cef22ca9e9bacc8ff9f5687654d36bf3` attains the following scores on the test set:
- Accuracy: 0.4271236410539893
- Precision: 0.5101494353184485
- Recall: 0.5763722014150806
- macro-F1: 0.5297380709491947
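
The card does not say how these scores were computed. As a rough illustration only, macro-F1 in a multi-label setting is usually the unweighted mean of the per-label F1 scores. The sketch below assumes binary indicator vectors per sample; the helper name `macro_f1` and the toy data are hypothetical and not taken from the training code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-label F1 for multi-label binary indicator rows."""
    n_labels = len(y_true[0])
    scores = []
    for j in range(n_labels):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in pairs if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in pairs if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Assumption: a label with no true and no predicted positives scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


# Toy data (3 samples, 3 labels), made up for illustration; not GoEmotions.
y_true = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
score = macro_f1(y_true, y_pred)
print(f"macro-F1: {score:.4f}")
```

Because every label counts equally, rare emotions with poor F1 pull the macro average down as much as frequent ones do.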
To load this specific version of the model, pin the revision to that commit:
```py
import os

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Disable tokenizer parallelism to avoid fork-related warnings.
os.environ["TOKENIZERS_PARALLELISM"] = "FALSE"

model_name = "tasinhoque/text-classification-goemotions"
commit = "5b532728cef22ca9e9bacc8ff9f5687654d36bf3"
# The simplified GoEmotions config has 28 labels (27 emotions plus neutral).
n_emotion = 28

tokenizer = AutoTokenizer.from_pretrained(model_name, revision=commit)
model = AutoModelForSequenceClassification.from_pretrained(
    model_name,
    num_labels=n_emotion,
    problem_type="multi_label_classification",
    revision=commit,
)
```
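
With `problem_type="multi_label_classification"`, each label is scored independently, so predictions are typically obtained by passing each logit through a sigmoid and keeping the labels above a threshold. The sketch below shows only that decoding step in plain Python. The function name `decode_logits`, the 0.5 threshold, the four-label mapping, and the logit values are all assumptions for illustration; in practice the logits would come from `model(**inputs).logits` and the mapping from `model.config.id2label`, if the checkpoint populates it.

```python
import math


def decode_logits(logits, id2label, threshold=0.5):
    """Return (label, probability) pairs whose sigmoid probability meets the threshold."""
    picked = []
    for idx, logit in enumerate(logits):
        prob = 1.0 / (1.0 + math.exp(-logit))  # independent sigmoid per label
        if prob >= threshold:
            picked.append((id2label[idx], round(prob, 3)))
    return picked


# Hypothetical 4-label mapping and logits; the real model has 28 outputs.
id2label = {0: "admiration", 1: "anger", 2: "joy", 3: "neutral"}
logits = [2.0, -3.0, 0.5, -0.2]
result = decode_logits(logits, id2label)
print(result)
```

The threshold is a tunable choice: lowering it tends to raise recall at the cost of precision, and the card does not state which value was used for the reported scores.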
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05 (2e-05 in the 5th epoch)
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42 (only in the 5th epoch)
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
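
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear warmup-then-decay rule in plain Python. It assumes the first run was scheduled over 4 epochs of 340 steps each (the step count per epoch is taken from the results table) with no warmup; neither the scheduling horizon nor the warmup setting is stated in this card, and `linear_lr` is a hypothetical helper, not the Trainer's own code.

```python
def linear_lr(step, total_steps, base_lr, warmup_steps=0):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 4 epochs x 340 steps, base learning rate 5e-5, no warmup.
total = 4 * 340
lrs = [linear_lr(s, total, 5e-5) for s in (0, 340, 680, 1020, 1360)]
for epoch, lr in enumerate(lrs):
    print(f"after epoch {epoch}: lr = {lr:.2e}")
```

Under these assumptions the rate reaches zero at the end of the 4th epoch, which would be consistent with restarting at a fresh learning rate of 2e-5 for the fifth epoch.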
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| No log | 1.0 | 340 | 0.0884 | 0.3782 | 0.4798 | 0.4643 | 0.4499 |
| 0.1042 | 2.0 | 680 | 0.0829 | 0.4093 | 0.4766 | 0.5272 | 0.4879 |
| 0.1042 | 3.0 | 1020 | 0.0821 | 0.4202 | 0.5103 | 0.5531 | 0.5092 |
| 0.0686 | 4.0 | 1360 | 0.0830 | 0.4327 | 0.5160 | 0.5556 | 0.5226 |
| No log | 5.0 | 1700 | 0.0961 | 0.4521 | 0.5190 | 0.5359 | 0.5218 |
### Framework versions
- Transformers 4.20.1
- PyTorch 1.12.0
- Datasets 2.1.0
- Tokenizers 0.12.1