ukraine-war-pov
This model is a fine-tuned version of xlm-roberta-base, trained on 30K social media posts from Ukraine (a balanced set of 15K per label) that were manually annotated as expressing a pro-Ukrainian or pro-Russian point of view on the war following the 2022 invasion. On a balanced test set of 2K posts, it achieves the following results:
- Loss: 0.2166
- Accuracy: 0.9315
- F1: 0.9315
- Precision: 0.9315
- Recall: 0.9315
- AUC: 0.9774 (self-reported)
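The card does not state how precision, recall, and F1 were averaged. Recall matching accuracy exactly is consistent with support-weighted averaging, where weighted recall always equals accuracy, so the sketch below assumes that mode. It is a minimal pure-Python illustration; the function name, label strings, and toy predictions are hypothetical and not taken from the dataset.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    n = len(y_true)
    support = Counter(y_true)  # number of gold examples per label
    precision = recall = f1 = 0.0
    for label in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        predicted = sum(1 for p in y_pred if p == label)
        p_l = tp / predicted if predicted else 0.0
        r_l = tp / support[label]
        f_l = 2 * p_l * r_l / (p_l + r_l) if (p_l + r_l) else 0.0
        weight = support[label] / n  # weight each label by its share of the data
        precision += weight * p_l
        recall += weight * r_l
        f1 += weight * f_l
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical toy data: a balanced set with one misclassified post.
y_true = ["pro_ukrainian"] * 4 + ["pro_russian"] * 4
y_pred = ["pro_ukrainian"] * 3 + ["pro_russian"] * 5
metrics = weighted_metrics(y_true, y_pred)
print(metrics)
```

On this toy input, weighted recall and accuracy coincide while weighted precision differs slightly, the same pattern visible in several rows of the training results.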
Training and evaluation data
The training and evaluation data was compiled and labeled by the Center for Content Analysis in Ukraine: Artem Zakharchenko and his team, including Yevhen Luzan, Olena Zakharchenko, Olexiy Rogalyov, Olena Zinenko, Yuliia Maksymtsova, Maryna Fursenko, Valeriia Molotsiian, and Anhelika Machula.
Training procedure
The model was trained in this notebook.
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 64
- seed: 123
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
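To make the scheduler settings concrete, here is a minimal pure-Python sketch of a linear schedule with warmup using the values listed above. It assumes the common behaviour of ramping linearly from zero to the peak learning rate over the warmup steps and then decaying linearly to zero; the function name is hypothetical, and the step count assumes 30,000 training posts on a single device without gradient accumulation, which is consistent with the 1875 steps per epoch in the training results.

```python
def linear_warmup_lr(step, peak_lr=5e-05, warmup_steps=500, total_steps=18750):
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Warmup phase: scale linearly from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: scale linearly from peak_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


# Assumed step arithmetic: 30,000 examples, batch size 16, 10 epochs.
steps_per_epoch = 30_000 // 16
total_steps = steps_per_epoch * 10

for step in (0, 250, 500, steps_per_epoch, total_steps):
    print(step, linear_warmup_lr(step, total_steps=total_steps))
```

With these settings the warmup covers only about a quarter of the first epoch, so nearly all of training runs in the decay phase.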
Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
---|---|---|---|---|---|---|---|
0.2840 | 1.0 | 1875 | 0.1850 | 0.9295 | 0.9295 | 0.9303 | 0.9295 |
0.2271 | 2.0 | 3750 | 0.1551 | 0.9405 | 0.9405 | 0.9414 | 0.9405 |
0.2064 | 3.0 | 5625 | 0.1734 | 0.9305 | 0.9305 | 0.9311 | 0.9305 |
0.1842 | 4.0 | 7500 | 0.1694 | 0.9315 | 0.9315 | 0.9317 | 0.9315 |
0.1628 | 5.0 | 9375 | 0.1838 | 0.9435 | 0.9435 | 0.9438 | 0.9435 |
0.1309 | 6.0 | 11250 | 0.2074 | 0.9395 | 0.9395 | 0.9395 | 0.9395 |
0.1017 | 7.0 | 13125 | 0.2659 | 0.9365 | 0.9365 | 0.9365 | 0.9365 |
0.0778 | 8.0 | 15000 | 0.2851 | 0.9400 | 0.9400 | 0.9400 | 0.9400 |
0.0664 | 9.0 | 16875 | 0.3238 | 0.9385 | 0.9385 | 0.9387 | 0.9385 |
0.0660 | 10.0 | 18750 | 0.3092 | 0.9390 | 0.9390 | 0.9390 | 0.9390 |
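In the table above, training loss falls at every epoch while validation loss reaches its minimum early and then rises, a pattern commonly read as overfitting. The card does not say which checkpoint was kept, so the sketch below only illustrates how the choice depends on the selection criterion. The values are copied from the table; the variable names are hypothetical.

```python
# (epoch, training_loss, validation_loss, accuracy), copied from the results table.
results = [
    (1, 0.2840, 0.1850, 0.9295),
    (2, 0.2271, 0.1551, 0.9405),
    (3, 0.2064, 0.1734, 0.9305),
    (4, 0.1842, 0.1694, 0.9315),
    (5, 0.1628, 0.1838, 0.9435),
    (6, 0.1309, 0.2074, 0.9395),
    (7, 0.1017, 0.2659, 0.9365),
    (8, 0.0778, 0.2851, 0.9400),
    (9, 0.0664, 0.3238, 0.9385),
    (10, 0.0660, 0.3092, 0.9390),
]

# Two common selection criteria can point at different checkpoints.
best_by_loss = min(results, key=lambda row: row[2])
best_by_accuracy = max(results, key=lambda row: row[3])

# Gap between validation and training loss; a growing gap suggests overfitting.
gaps = [(epoch, round(val - train, 4)) for epoch, train, val, _ in results]

print("lowest validation loss at epoch", best_by_loss[0])
print("highest accuracy at epoch", best_by_accuracy[0])
print("validation minus training loss by epoch:", gaps)
```

Selecting on validation loss favours an earlier checkpoint than selecting on accuracy, which is worth keeping in mind when comparing the test-set figures with individual rows.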
Framework versions
- Transformers 4.27.4
- Pytorch 2.0.0+cu118
- Tokenizers 0.13.3