RO-Offense
This model is a fine-tuned version of readerbench/RoBERT-base on the RO-Offense dataset. It achieves the following results on the evaluation set (these match the final training epoch, epoch 7, in the training results table):
- Loss: 0.8411
- Accuracy: 0.8232
- Precision: 0.8235
- Recall: 0.8210
- F1 Macro: 0.8207
- F1 Micro: 0.8232
- F1 Weighted: 0.8210
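The three F1 variants above differ only in how per-class scores are averaged. The following is a minimal sketch of that averaging on a toy example; the function name `f1_scores` and the toy labels are illustrative assumptions, not the evaluation code used for this model.

```python
from collections import Counter


def f1_scores(y_true, y_pred):
    """Return macro, micro, and weighted F1 for single-label predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true examples per class
    per_class = {}
    tp_total = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class[label] = 2 * tp / denom if denom else 0.0
        tp_total += tp
    n = len(y_true)
    return {
        # Unweighted mean over classes: rare classes count as much as common ones.
        "macro": sum(per_class.values()) / len(labels),
        # Pooled counts: equals accuracy when every example has exactly one label.
        "micro": tp_total / n,
        # Mean over classes weighted by class support.
        "weighted": sum(per_class[l] * support[l] for l in labels) / n,
    }


# Toy data (hypothetical), using class ids 0..3 as in the label list.
y_true = [0, 0, 0, 0, 1, 1, 2, 3]
y_pred = [0, 0, 0, 1, 1, 1, 2, 2]
scores = f1_scores(y_true, y_pred)
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(scores, accuracy)
```

This is why F1 Micro and Accuracy report the same value (0.8232) above: in single-label classification the two are the same quantity.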
Output labels:
- LABEL_0 = No offensive language
- LABEL_1 = Profanity (no directed insults)
- LABEL_2 = Insults (directed offensive language, lower level of offensiveness)
- LABEL_3 = Abuse (directed hate speech, racial slurs, sexist speech, threats of violence, death wishes, etc.)
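The label ids above can be mapped to readable names when post-processing predictions. The sketch below assumes predictions arrive as a list of `{"label", "score"}` dictionaries, which is the shape typically returned by a text-classification pipeline when all class scores are requested; the `decode` helper and the example scores are hypothetical.

```python
# Short readable names for the four output labels listed above.
LABEL_NAMES = {
    "LABEL_0": "No offensive language",
    "LABEL_1": "Profanity",
    "LABEL_2": "Insults",
    "LABEL_3": "Abuse",
}


def decode(scores):
    """Pick the highest-scoring entry from a list of {"label", "score"} dicts."""
    best = max(scores, key=lambda item: item["score"])
    return LABEL_NAMES[best["label"]], best["score"]


# Made-up scores for illustration only (not real model output).
example_scores = [
    {"label": "LABEL_0", "score": 0.05},
    {"label": "LABEL_1", "score": 0.14},
    {"label": "LABEL_2", "score": 0.71},
    {"label": "LABEL_3", "score": 0.10},
]
print(decode(example_scores))  # ('Insults', 0.71)
```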
Model description
Fine-tuned Romanian BERT model for offensive language classification.
Trained on the RO-Offense dataset.
Intended uses & limitations
Offensive language and hate speech detection for Romanian.
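A minimal usage sketch for this kind of detection, assuming the `transformers` library with a PyTorch backend is installed and the model is published under the id `readerbench/ro-offense`. The helper names `load_classifier` and `flag_offensive` are illustrative, not part of any library.

```python
def load_classifier(model_id="readerbench/ro-offense"):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of this sketch works without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def flag_offensive(texts, classifier):
    """Return (text, label) pairs for every text not predicted as LABEL_0."""
    predictions = classifier(texts)  # one {"label", "score"} dict per input text
    return [
        (text, pred["label"])
        for text, pred in zip(texts, predictions)
        if pred["label"] != "LABEL_0"
    ]


# Usage (requires network access the first time the model is loaded):
# classifier = load_classifier()
# print(flag_offensive(["Bună ziua!"], classifier))
```

Passing the classifier in as an argument keeps `flag_offensive` easy to test with a stub and lets the loaded pipeline be reused across calls.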
Training and evaluation data
Trained on the train split of the RO-Offense dataset.
Evaluated on the test split of the RO-Offense dataset.
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 4e-05
- train_batch_size: 64
- eval_batch_size: 128
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.2
- num_epochs: 10 (early stopping at epoch 7; best epoch: 4)
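To make the scheduler settings concrete, the sketch below computes the learning rate under a linear warmup followed by linear decay. It assumes the schedule was planned over the full 10 epochs at 125 steps per epoch (1250 steps, taken from the Step column of the training results), so a warmup ratio of 0.2 corresponds to 250 warmup steps. The function `linear_schedule_lr` is an illustration of the usual shape of such a schedule, not the training code itself.

```python
def linear_schedule_lr(step, base_lr=4e-5, total_steps=1250, warmup_ratio=0.2):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    # 250 warmup steps for the default values used here (assumed total of 1250 steps).
    warmup_steps = round(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp up linearly from 0 to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr to 0 over the remaining steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


for step in (125, 250, 500, 875):
    print(step, f"{linear_schedule_lr(step):.2e}")
# 125 2.00e-05
# 250 4.00e-05
# 500 3.00e-05
# 875 1.50e-05
```

Under these assumptions the peak rate of 4e-05 is reached at the end of epoch 2, and training stopped at step 875 with the rate still well above zero.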
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1 Macro | F1 Micro | F1 Weighted |
---|---|---|---|---|---|---|---|---|---|
No log | 1.0 | 125 | 0.7789 | 0.7037 | 0.6825 | 0.7000 | 0.6873 | 0.7037 | 0.7132 |
No log | 2.0 | 250 | 0.5170 | 0.8006 | 0.8066 | 0.8016 | 0.7986 | 0.8006 | 0.7971 |
No log | 3.0 | 375 | 0.5139 | 0.8096 | 0.8168 | 0.8237 | 0.8120 | 0.8096 | 0.8047 |
0.6074 | 4.0 | 500 | 0.6180 | 0.8247 | 0.8251 | 0.8187 | 0.8210 | 0.8247 | 0.8233 |
0.6074 | 5.0 | 625 | 0.7311 | 0.8096 | 0.8071 | 0.8085 | 0.8064 | 0.8096 | 0.8071 |
0.6074 | 6.0 | 750 | 0.8365 | 0.8101 | 0.8117 | 0.8191 | 0.8105 | 0.8101 | 0.8051 |
0.6074 | 7.0 | 875 | 0.8411 | 0.8232 | 0.8235 | 0.8210 | 0.8207 | 0.8232 | 0.8210 |
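The table is consistent with early stopping on a validation metric rather than on validation loss. The sketch below replays the F1 Macro column through a simple patience-based rule; the patience value of 3 and the choice of F1 Macro as the monitored metric are assumptions inferred from the numbers, not settings stated in this card.

```python
# Columns copied from the training results table (epochs 1 to 7).
F1_MACRO = [0.6873, 0.7986, 0.8120, 0.8210, 0.8064, 0.8105, 0.8207]
VAL_LOSS = [0.7789, 0.5170, 0.5139, 0.6180, 0.7311, 0.8365, 0.8411]


def simulate_early_stopping(scores, patience=3):
    """Return (best_epoch, stop_epoch) when maximizing `scores` with a patience rule."""
    best_epoch, best_score, bad_epochs = 0, float("-inf"), 0
    for epoch, score in enumerate(scores, start=1):
        if score > best_score:
            best_epoch, best_score, bad_epochs = epoch, score, 0
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                return best_epoch, epoch  # no improvement for `patience` epochs
    return best_epoch, len(scores)


print(simulate_early_stopping(F1_MACRO))  # (4, 7)

# Selecting on validation loss instead would pick a different epoch.
lowest_loss_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1
print(lowest_loss_epoch)  # 3
```

With these assumptions the rule reproduces the reported best epoch (4) and stop epoch (7). Validation loss bottoms out at epoch 3 and rises afterwards while accuracy and F1 stay roughly flat.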
Framework versions
- Transformers 4.31.0
- Pytorch 2.0.1+cu118
- Datasets 2.14.3
- Tokenizers 0.13.3
Model tree for readerbench/ro-offense
Base model
readerbench/RoBERT-base
Evaluation results
Self-reported results on the test set of the Romanian Offensive Language Dataset:
- Accuracy: 0.819
- Precision: 0.814
- Recall: 0.812
- Weighted F1: 0.819
- Macro F1: 0.819
- Macro F1: 0.813