distilbert-base-multilingual-cased-finetuned

This model is a fine-tuned version of distilbert-base-multilingual-cased on the Emotone Arabic dataset, a collection of tweets each labeled with one of eight classes: none, anger, joy, sadness, love, sympathy, surprise, and fear. After the final training epoch (epoch 10), it achieves the following results on the evaluation set:

Loss: 1.3099
Accuracy: 0.6632
F1: 0.6647

Model description

This model performs emotion recognition on Arabic text. Given a tweet, it predicts exactly one of the eight emotion classes.

Intended uses & limitations

This model is intended for sentiment analysis and emotion detection on Arabic tweets. It was fine-tuned only on Arabic Twitter data, so it may perform poorly on text from outside social media and on languages other than Arabic, even though the base model is multilingual.

Training and evaluation data

The model was fine-tuned on the Emotone Arabic dataset, which consists of tweets labeled with the following emotions:

none
anger
joy
sadness
love
sympathy
surprise
fear

Label Mapping

Label Name	Numeric Label
none	0
anger	1
joy	2
sadness	3
love	4
sympathy	5
surprise	6
fear	7
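The mapping above can be written as a pair of dictionaries and used to turn raw classifier outputs into emotion names. This is a minimal sketch in plain Python: the names `ID2LABEL`, `LABEL2ID`, and `decode_logits` are illustrative and do not come from the original card, and the softmax step assumes the model emits one logit per class in the order shown in the table.

```python
import math

# Label mapping copied from the table above.
ID2LABEL = {
    0: "none", 1: "anger", 2: "joy", 3: "sadness",
    4: "love", 5: "sympathy", 6: "surprise", 7: "fear",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def decode_logits(logits):
    """Return (label_name, probability) for one row of 8 raw logits."""
    if len(logits) != len(ID2LABEL):
        raise ValueError(f"expected {len(ID2LABEL)} logits, got {len(logits)}")
    # Numerically stable softmax: subtract the max before exponentiating.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Index of the most probable class.
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]
```

For example, `decode_logits([0.1, 0.2, 3.0, 0.0, 0.5, -1.0, 0.3, 0.2])` selects index 2 and therefore returns the label `joy` along with its softmax probability.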

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 32
eval_batch_size: 32
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
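To illustrate what the linear scheduler does with these settings, the sketch below computes the learning rate at a given optimizer step. The total of 2520 steps is taken from the training results table (10 epochs of 252 steps each). The card does not list a warmup setting, so zero warmup is an assumption here, and the function name `linear_lr` is hypothetical.

```python
def linear_lr(step, base_lr=2e-05, total_steps=2520, warmup_steps=0):
    """Learning rate under a linear schedule with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not state a warmup value.
    """
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Learning rate at the end of each epoch (252 steps per epoch).
for epoch in range(1, 11):
    print(epoch, linear_lr(epoch * 252))
```

Under these assumptions the rate starts at 2e-05, is halved by step 1260 (end of epoch 5), and reaches zero at step 2520.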

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1
1.0026	1.0	252	1.0417	0.6408	0.6321
0.8422	2.0	504	1.0355	0.6508	0.6425
0.7114	3.0	756	1.0611	0.6364	0.6342
0.5709	4.0	1008	1.0672	0.6692	0.6665
0.4590	5.0	1260	1.1167	0.6731	0.6693
0.3694	6.0	1512	1.1709	0.6637	0.6672
0.2975	7.0	1764	1.2094	0.6716	0.6699
0.2402	8.0	2016	1.2777	0.6642	0.6633
0.2090	9.0	2268	1.2997	0.6692	0.6685
0.1792	10.0	2520	1.3099	0.6632	0.6647
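In the table above, training loss falls every epoch while validation loss rises after epoch 2, a pattern consistent with overfitting, and the headline metrics come from epoch 10 rather than from the best-scoring epoch. The sketch below shows one way to pick an epoch by a chosen metric from these numbers. It is illustrative only: the card does not say whether any checkpoint selection was applied, and `EpochResult` and `best_by` are hypothetical names.

```python
from collections import namedtuple

EpochResult = namedtuple("EpochResult", "epoch train_loss val_loss accuracy f1")

# Rows copied from the training results table above.
RESULTS = [EpochResult(*row) for row in [
    (1, 1.0026, 1.0417, 0.6408, 0.6321),
    (2, 0.8422, 1.0355, 0.6508, 0.6425),
    (3, 0.7114, 1.0611, 0.6364, 0.6342),
    (4, 0.5709, 1.0672, 0.6692, 0.6665),
    (5, 0.4590, 1.1167, 0.6731, 0.6693),
    (6, 0.3694, 1.1709, 0.6637, 0.6672),
    (7, 0.2975, 1.2094, 0.6716, 0.6699),
    (8, 0.2402, 1.2777, 0.6642, 0.6633),
    (9, 0.2090, 1.2997, 0.6692, 0.6685),
    (10, 0.1792, 1.3099, 0.6632, 0.6647),
]]


def best_by(results, metric, higher_is_better=True):
    """Return the row with the best value of the named metric."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda r: getattr(r, metric))


best_f1 = best_by(RESULTS, "f1")
best_acc = best_by(RESULTS, "accuracy")
lowest_val = best_by(RESULTS, "val_loss", higher_is_better=False)
print(best_f1.epoch, best_acc.epoch, lowest_val.epoch)  # prints: 7 5 2
```

By these numbers, F1 peaks at epoch 7 (0.6699), accuracy at epoch 5 (0.6731), and validation loss is lowest at epoch 2 (1.0355).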

Example Outputs

Here are some example inputs and their corresponding model predictions:

Input Tweet	Predicted Emotion	Numeric Label
"أنا سعيد جدًا اليوم!"	joy	2
"هذا أمر محبط حقًا."	sadness	3
"لا أستطيع تحمل هذا بعد الآن."	anger	1
"أحب كل من يدعمني."	love	4
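Predictions like those in the table could be produced with the Transformers `pipeline` API, as in the sketch below. Several things here are assumptions rather than facts from the card: the repository id `0marr/distilbert-base-multilingual-cased-finetuned` is inferred from the page's owner and model name, the helper names `normalize_label` and `predict_emotions` are hypothetical, and the fallback from generic `LABEL_<n>` names covers the case where the published config does not carry the emotion names.

```python
# Assumed Hub repository id (owner/name); not confirmed by the card text.
MODEL_ID = "0marr/distilbert-base-multilingual-cased-finetuned"

# Label mapping from the card's "Label Mapping" table.
ID2LABEL = {
    0: "none", 1: "anger", 2: "joy", 3: "sadness",
    4: "love", 5: "sympathy", 6: "surprise", 7: "fear",
}


def normalize_label(label):
    """Map a generic 'LABEL_<n>' name to an emotion name; pass other names through."""
    if label.startswith("LABEL_"):
        return ID2LABEL[int(label.split("_", 1)[1])]
    return label


def predict_emotions(texts, classifier=None):
    """Return (text, emotion, score) for each input text."""
    texts = list(texts)
    if classifier is None:
        # Imported lazily so the helper can be exercised with a stub classifier.
        from transformers import pipeline
        classifier = pipeline("text-classification", model=MODEL_ID)
    outputs = classifier(texts)
    return [
        (text, normalize_label(out["label"]), out["score"])
        for text, out in zip(texts, outputs)
    ]
```

Calling `predict_emotions(["أنا سعيد جدًا اليوم!"])` without a `classifier` argument downloads the model on first use, so it needs network access and an installed `transformers` package. Passing a custom callable as `classifier` makes it possible to swap in another model or a stub.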

Framework versions

Transformers 4.44.2
Pytorch 2.4.1+cu121
Datasets 3.0.0
Tokenizers 0.19.1

0marr/distilbert-base-multilingual-cased-finetuned