christinacdl/XLM_RoBERTa-Multilingual-Clickbait-Detection

Text Classification

Generated from Trainer

XLM_RoBERTa-Multilingual-Clickbait-Detection

This model is a fine-tuned version of xlm-roberta-large, trained on a multilingual clickbait detection dataset (described under "Training and evaluation data"). It achieves the following results on the evaluation set:

Loss: 0.2192
Micro F1: 0.9759
Macro F1: 0.9758
Accuracy: 0.9759

Test set Macro-F1 scores (%)

Multilingual test set: 97.28
en test set: 97.83
el test set: 97.32
it test set: 97.54
es test set: 97.67
ro test set: 97.40
de test set: 97.40
fr test set: 96.90
pl test set: 96.18
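
Micro F1 and Macro F1, both reported above, differ in how per-class results are combined: micro pools the counts over all classes, while macro averages the per-class F1 scores. The following is a minimal pure-Python sketch of that difference. It assumes a binary label set (0 and 1; the actual label names are not stated in this card), and the toy labels are made up for illustration, not taken from the model's predictions.

```python
def f1_scores(y_true, y_pred, labels=(0, 1)):
    """Return (micro_f1, macro_f1) for single-label classification."""
    per_class_f1 = []
    tp_sum = fp_sum = fn_sum = 0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        per_class_f1.append(2 * tp / denom if denom else 0.0)
        tp_sum += tp
        fp_sum += fp
        fn_sum += fn
    # Micro: one F1 computed from the pooled counts of all classes
    micro_denom = 2 * tp_sum + fp_sum + fn_sum
    micro = 2 * tp_sum / micro_denom if micro_denom else 0.0
    # Macro: unweighted mean of the per-class F1 scores
    macro = sum(per_class_f1) / len(per_class_f1)
    return micro, macro


# Toy, made-up labels (hypothetical: 1 = clickbait, 0 = not clickbait)
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
micro_f1, macro_f1 = f1_scores(y_true, y_pred)
print(f"micro={micro_f1:.4f} macro={macro_f1:.4f}")
```

In single-label classification, micro F1 equals accuracy, which is consistent with the identical Micro F1 and Accuracy values reported for the evaluation set.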

Intended uses & limitations

This model is intended for use in an EU project.
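
For readers who want to try the model, here is a minimal inference sketch using the Transformers `pipeline` API. It is an illustration rather than the author's own usage code: the helper names `load_classifier` and `classify` are hypothetical, and the label strings the model returns depend on its configuration, which this card does not document.

```python
MODEL_ID = "christinacdl/XLM_RoBERTa-Multilingual-Clickbait-Detection"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads the weights on first use)."""
    # Imported lazily so the snippet can be read and imported without transformers
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(headlines, classifier):
    """Return one {'label': ..., 'score': ...} dict per headline."""
    return classifier(list(headlines))
```

A typical call would be `classify(["Some headline"], load_classifier())`. The first call downloads the full checkpoint, so loading can take a while.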

Training and evaluation data

The "clickbait_detection_dataset" was translated from English into Greek, Italian, Spanish, Romanian, French and German using Opus-MT models.
It was also translated from English into Polish using an M2M NMT model.
The NMT models were run through the "EasyNMT" library.
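
The translation setup described above could look roughly like the sketch below, which routes each target language to an EasyNMT model. This is an assumed reconstruction, not the author's script: the helper names are hypothetical, and the card says only "M2M NMT", so the specific checkpoint name `m2m_100_418M` is a guess.

```python
# Target languages named in this card, as ISO 639-1 codes
OPUS_MT_TARGETS = ("el", "it", "es", "ro", "fr", "de")
M2M_TARGETS = ("pl",)

# Assumption: the card does not say which M2M checkpoint was used
M2M_MODEL_NAME = "m2m_100_418M"


def model_name_for(target_lang):
    """Pick the EasyNMT model name for a target language, following the card."""
    if target_lang in OPUS_MT_TARGETS:
        return "opus-mt"
    if target_lang in M2M_TARGETS:
        return M2M_MODEL_NAME
    raise ValueError(f"No translation model listed for {target_lang!r}")


def translate_headlines(headlines, target_lang):
    """Translate English headlines into target_lang with EasyNMT."""
    # Imported lazily; requires `pip install easynmt` and downloads the model
    from easynmt import EasyNMT

    model = EasyNMT(model_name_for(target_lang))
    return model.translate(list(headlines), source_lang="en", target_lang=target_lang)
```

Creating one `EasyNMT` instance per call keeps the sketch short; for a full dataset it would be more efficient to build each model once and reuse it.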

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
gradient_accumulation_steps: 2
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 4
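
As a rough guide to reproducing this configuration, the values above map onto Transformers `TrainingArguments` as sketched below. Two things are assumptions: training on a single device (which is what makes 16 x 2 equal the listed total of 32), and the `output_dir` argument, which the card does not mention. The helper names are hypothetical.

```python
# Hyperparameters from the list above, keyed by TrainingArguments parameter names
HYPERPARAMS = {
    "learning_rate": 1e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "gradient_accumulation_steps": 2,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 4,
}


def total_train_batch_size(params, num_devices=1):
    """Effective batch size per optimizer step (num_devices=1 is an assumption)."""
    return (
        params["per_device_train_batch_size"]
        * params["gradient_accumulation_steps"]
        * num_devices
    )


def build_training_arguments(output_dir):
    """Create TrainingArguments from the listed values; output_dir is hypothetical."""
    # Imported lazily so the snippet can be inspected without transformers installed
    from transformers import TrainingArguments

    return TrainingArguments(output_dir=output_dir, **HYPERPARAMS)


print(total_train_batch_size(HYPERPARAMS))
```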

Framework versions

Transformers 4.36.1
Pytorch 2.1.0+cu121
Datasets 2.13.1
Tokenizers 0.15.0

Model size: 560M params (Safetensors)
Tensor type: F32
