metadata

license: mit
language:
  - am
  - ar
  - hy
  - eu
  - bn
  - bs
  - bg
  - my
  - hr
  - ca
  - cs
  - da
  - nl
  - en
  - et
  - fi
  - fr
  - ka
  - de
  - el
  - gu
  - ht
  - iw
  - hi
  - hu
  - is
  - in
  - it
  - ja
  - kn
  - km
  - ko
  - lo
  - lv
  - lt
  - ml
  - mr
  - ne
  - 'no'
  - or
  - pa
  - ps
  - fa
  - pl
  - pt
  - ro
  - ru
  - sr
  - zh
  - sd
  - si
  - sk
  - sl
  - es
  - sv
  - tl
  - ta
  - te
  - th
  - tr
  - uk
  - ur
  - ug
  - vi
  - cy
tags:
  - generated_from_trainer
model-index:
  - name: verdict-classifier-en
    results:
      - task:
          type: text-classification
          name: Verdict Classification
widget:
  - One might think that this is true, but it's taken out of context.

Multilingual Verdict Classifier

This model is a fine-tuned version of xlm-roberta-base, trained on 1,500 deduplicated multilingual verdicts from the Google Fact Check Tools API, translated into 65 languages with the Google Cloud Translation API. It achieves the following results on the evaluation set, which consists of 1,000 such verdicts; duplicates are kept in the evaluation set so that it reflects the true distribution:

Loss: 0.1856
F1 Macro: 0.8148
F1 Misinformation: 0.9764
F1 Factual: 0.9375
F1 Other: 0.5306
Precision Macro: 0.8117
Precision Misinformation: 0.9775
Precision Factual: 0.9375
Precision Other: 0.52
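
The macro scores are unweighted means of the per-class scores, which is why the weaker Other class pulls them well below the Misinformation and Factual figures. A minimal sketch that reproduces the two macro values from the per-class numbers listed above (the dictionary and function names are illustrative and not part of the model's code):

```python
# Per-class scores copied from the evaluation results above.
f1 = {"Misinformation": 0.9764, "Factual": 0.9375, "Other": 0.5306}
precision = {"Misinformation": 0.9775, "Factual": 0.9375, "Other": 0.52}


def macro_average(per_class):
    """Unweighted mean over classes: every class counts equally,
    regardless of how many examples it has."""
    return sum(per_class.values()) / len(per_class)


f1_macro = round(macro_average(f1), 4)
precision_macro = round(macro_average(precision), 4)
print("F1 Macro:", f1_macro)
print("Precision Macro:", precision_macro)
```

Because the average ignores class frequency, a small class such as Other has the same influence on the macro score as the much larger Misinformation class.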

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 4
eval_batch_size: 4
seed: 42
gradient_accumulation_steps: 8
total_train_batch_size: 32
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 30066
num_epochs: 1000
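
Two of these values depend on the others: total_train_batch_size is train_batch_size multiplied by gradient_accumulation_steps, and the linear scheduler ramps the learning rate up over the warmup steps before decaying it. The sketch below works through both. It rests on several assumptions that the card does not state: a single training device, 3,758 optimizer steps per epoch (read off the training results table), a schedule planned over all 1,000 epochs, and the common linear-warmup, linear-decay rule. The name `linear_lr` is illustrative.

```python
LEARNING_RATE = 2e-05
TRAIN_BATCH_SIZE = 4
GRAD_ACCUM_STEPS = 8
WARMUP_STEPS = 30066
NUM_EPOCHS = 1000
STEPS_PER_EPOCH = 3758  # assumption: taken from the training results table

# Gradients from 8 mini-batches of 4 are accumulated before each update.
total_train_batch_size = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS

# Assumption: the schedule was laid out for the full configured run.
total_steps = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step):
    """Learning rate at an optimizer step: linear ramp from 0 during
    warmup, then linear decay to 0 at total_steps."""
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    remaining = max(0, total_steps - step)
    return LEARNING_RATE * remaining / (total_steps - WARMUP_STEPS)


print("effective batch size:", total_train_batch_size)
print("warmup length in epochs:", round(WARMUP_STEPS / STEPS_PER_EPOCH, 2))
for epoch in (1, 2, 3, 4):
    step = epoch * STEPS_PER_EPOCH
    print(f"epoch {epoch} (step {step}): lr = {linear_lr(step):.2e}")
```

Under these assumptions the warmup covers roughly eight epochs, so all four logged epochs fall inside the warmup phase and the learning rate stays below the configured 2e-05.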

Training results

Training Loss	Epoch	Step	Validation Loss	F1 Macro	F1 Misinformation	F1 Factual	F1 Other	Precision Macro	Precision Misinformation	Precision Factual	Precision Other
0.8707	1.0	3758	0.2414	0.7832	0.9639	0.7857	0.6	0.7950	0.9683	0.9167	0.5
0.3918	2.0	7516	0.1856	0.8148	0.9764	0.9375	0.5306	0.8117	0.9775	0.9375	0.52
0.1766	3.0	11274	0.1942	0.8394	0.9809	0.9538	0.5833	0.8349	0.9820	0.9394	0.5833
0.1071	4.0	15032	0.2078	0.8676	0.9786	0.9841	0.64	0.8650	0.9797	1.0	0.6154
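
The results reported at the top of this card match the epoch 2 row, which has the lowest validation loss, even though F1 Macro keeps improving through epoch 4. A minimal sketch of that comparison, using three columns copied from the table above. That the published checkpoint was chosen by validation loss is an inference from the numbers, not something the card states.

```python
# (epoch, validation_loss, f1_macro), copied from the training results table.
rows = [
    (1, 0.2414, 0.7832),
    (2, 0.1856, 0.8148),
    (3, 0.1942, 0.8394),
    (4, 0.2078, 0.8676),
]

# The two selection criteria disagree on which epoch is "best".
best_by_loss = min(rows, key=lambda row: row[1])
best_by_f1 = max(rows, key=lambda row: row[2])

print("lowest validation loss: epoch", best_by_loss[0])
print("highest F1 Macro: epoch", best_by_f1[0])
```

The disagreement between the two criteria is worth keeping in mind when comparing this model against checkpoints selected by F1 rather than by loss.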

Framework versions

Transformers 4.11.3
PyTorch 1.9.0+cu102
Datasets 1.9.0
Tokenizers 0.10.2