ukraine-war-pov

This model is a fine-tuned version of xlm-roberta-base, trained on 30K social media posts from Ukraine (balanced, 15K per label) that were manually annotated as expressing a pro-Ukrainian or pro-Russian point of view on the war following the 2022 invasion. On a balanced test set of 2K posts, it achieves the following results:

Loss: 0.2166
Accuracy: 0.9315
F1: 0.9315
Precision: 0.9315
Recall: 0.9315
AUC: 0.9774 (self-report)
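
For readers who want to try the classifier, here is a minimal inference sketch. It assumes the model is available on the Hugging Face Hub under the repository ID YaraKyrychenko/ukraine-war-pov (taken from this page's path) and that the label names are whatever the model's config defines; this card does not list them. The helpers `classify_posts` and `label_shares` are hypothetical names introduced for illustration only.

```python
from collections import Counter

# Assumption: Hub repository ID inferred from this page's path.
MODEL_ID = "YaraKyrychenko/ukraine-war-pov"


def classify_posts(posts, model_id=MODEL_ID):
    """Classify a list of texts and return one {"label", "score"} dict per text.

    The transformers import is done lazily so that label_shares() below
    can be used without the library installed. The first call downloads
    the model weights.
    """
    from transformers import pipeline

    classifier = pipeline("text-classification", model=model_id)
    # truncation=True cuts inputs that exceed the model's maximum length.
    return classifier(posts, truncation=True)


def label_shares(predictions):
    """Return the fraction of posts assigned to each predicted label."""
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}
```

Calling `label_shares(classify_posts(list_of_posts))` would give the share of posts assigned to each label. Since the reported accuracy is about 93%, such shares should be read as estimates rather than exact counts.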

Training and evaluation data

The training and evaluation data was compiled and labeled by the Center for Content Analysis in Ukraine: Artem Zakharchenko and his team, including Yevhen Luzan, Olena Zakharchenko, Olexiy Rogalyov, Olena Zinenko, Yuliia Maksymtsova, Maryna Fursenko, Valeriia Molotsiian, and Anhelika Machula.

Training procedure

The model was trained in this notebook.

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 5e-05
train_batch_size: 16
eval_batch_size: 64
seed: 123
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
lr_scheduler_warmup_steps: 500
num_epochs: 10
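
To make the schedule concrete, the sketch below computes the learning rate at a given optimizer step. It assumes that the linear scheduler ramps from 0 to the peak rate over the warmup steps and then decays linearly to 0 at the last step, that all 30K posts are in the training split, and that training used a single device with no gradient accumulation. Under those assumptions the step counts match the 1875 steps per epoch reported in the training results. The function name `linear_lr` is hypothetical.

```python
import math

TRAIN_EXAMPLES = 30_000  # assumption: all 30K posts are in the training split
BATCH_SIZE = 16          # train_batch_size
NUM_EPOCHS = 10          # num_epochs
WARMUP_STEPS = 500       # lr_scheduler_warmup_steps
PEAK_LR = 5e-05          # learning_rate

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)  # 1875
total_steps = steps_per_epoch * NUM_EPOCHS                # 18750


def linear_lr(step, peak_lr=PEAK_LR, warmup_steps=WARMUP_STEPS, total=total_steps):
    """Learning rate at `step` for linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to the peak rate.
        return peak_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from the peak rate to 0 at the final step.
    return peak_lr * max(0, total - step) / max(1, total - warmup_steps)


for step in (0, 250, 500, 9625, 18750):
    print(step, linear_lr(step))
```

With these settings, warmup covers only about the first quarter of epoch 1, so almost all of training happens on the decaying part of the schedule.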

Training results

Training Loss	Epoch	Step	Validation Loss	Accuracy	F1	Precision	Recall
0.2840	1.0	1875	0.1850	0.9295	0.9295	0.9303	0.9295
0.2271	2.0	3750	0.1551	0.9405	0.9405	0.9414	0.9405
0.2064	3.0	5625	0.1734	0.9305	0.9305	0.9311	0.9305
0.1842	4.0	7500	0.1694	0.9315	0.9315	0.9317	0.9315
0.1628	5.0	9375	0.1838	0.9435	0.9435	0.9438	0.9435
0.1309	6.0	11250	0.2074	0.9395	0.9395	0.9395	0.9395
0.1017	7.0	13125	0.2659	0.9365	0.9365	0.9365	0.9365
0.0778	8.0	15000	0.2851	0.9400	0.9400	0.9400	0.9400
0.0664	9.0	16875	0.3238	0.9385	0.9385	0.9387	0.9385
0.0660	10.0	18750	0.3092	0.9390	0.9390	0.9390	0.9390
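
In the table above, training loss falls at every epoch while validation loss reaches its minimum early and then rises, so the choice of "best" checkpoint depends on the criterion. The sketch below picks the best epoch by validation loss and by accuracy from the values in the table. This card does not say which criterion was used or which checkpoint was published, so this is only an illustration; `best_epoch` is a hypothetical helper.

```python
# (epoch, validation_loss, accuracy), copied from the training results table.
RESULTS = [
    (1, 0.1850, 0.9295),
    (2, 0.1551, 0.9405),
    (3, 0.1734, 0.9305),
    (4, 0.1694, 0.9315),
    (5, 0.1838, 0.9435),
    (6, 0.2074, 0.9395),
    (7, 0.2659, 0.9365),
    (8, 0.2851, 0.9400),
    (9, 0.3238, 0.9385),
    (10, 0.3092, 0.9390),
]


def best_epoch(results, column, higher_is_better):
    """Return the row with the best value in the given column index."""
    pick = max if higher_is_better else min
    return pick(results, key=lambda row: row[column])


best_by_loss = best_epoch(RESULTS, column=1, higher_is_better=False)
best_by_acc = best_epoch(RESULTS, column=2, higher_is_better=True)

print("Lowest validation loss: epoch", best_by_loss[0], "loss", best_by_loss[1])
print("Highest accuracy: epoch", best_by_acc[0], "accuracy", best_by_acc[2])
```

Validation loss is lowest at epoch 2 (0.1551) and accuracy is highest at epoch 5 (0.9435). The widening gap between training and validation loss in later epochs is consistent with overfitting.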

Framework versions

Transformers 4.27.4
Pytorch 2.0.0+cu118
Tokenizers 0.13.3

YaraKyrychenko/ukraine-war-pov