---
license: mit
tags:
- generated_from_trainer
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: ukraine-war-pov
  results: []
widget:
- text: Росія знову скоює воєнні злочини
  example_title: proukrainian
- text: ВСУ все берет с собой — украинские «захистники» взяли стульчак из Артемовска
  example_title: prorussian
language:
- uk
- ru
---

# ukraine-war-pov

This model is a fine-tuned version of [xlm-roberta-base](https://huggingface.co/xlm-roberta-base) that classifies the point of view of a social media post on the war in Ukraine as pro-Ukrainian or pro-Russian. It was trained on 30K posts from Ukraine written after the 2022 invasion, manually annotated for point of view and balanced at 15K posts per label.

It achieves the following results on a balanced test set of 2K posts:

- Loss: 0.2166
- Accuracy: 0.9315
- F1: 0.9315
- Precision: 0.9315
- Recall: 0.9315
- AUC: 0.9774 (self-reported)
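
For quick experimentation, the checkpoint can be loaded with the `transformers` text-classification pipeline. The sketch below is illustrative: the model id is a placeholder (the actual hub path may include a namespace), and the returned label names depend on the `id2label` mapping saved with the checkpoint.

```python
from transformers import pipeline

# Load the fine-tuned classifier; "ukraine-war-pov" is a placeholder id --
# substitute the actual repository path of this checkpoint.
classifier = pipeline("text-classification", model="ukraine-war-pov")

# The model expects Ukrainian or Russian social media text about the war.
result = classifier("Росія знову скоює воєнні злочини")
print(result)  # e.g. [{'label': ..., 'score': ...}], per the saved id2label mapping
```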

## Training and evaluation data

The training and evaluation data was compiled and labeled by the Center for Content Analysis in Ukraine: Artem Zakharchenko and his team, including Yevhen Luzan, Olena Zakharchenko, Olexiy Rogalyov, Olena Zinenko, Yuliia Maksymtsova, Maryna Fursenko, Valeriia Molotsiian, and Anhelika Machula.

## Training procedure

The model was trained in this [notebook](https://drive.google.com/file/d/1RnT3fJTneFSczS_G_JLVqe4MydkTFiO0/view?usp=sharing).

### Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch reproducing them follows the list):

- learning_rate: 5e-05
- train_batch_size: 16
- eval_batch_size: 64
- seed: 123
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 10
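
As a rough guide to reproduction, these hyperparameters map onto `TrainingArguments` as sketched below. This is a minimal sketch, not the notebook's exact code: `output_dir` and the per-epoch evaluation strategy are assumptions (the latter suggested by the per-epoch rows in the results table).

```python
from transformers import TrainingArguments

# Minimal sketch matching the hyperparameters listed above.
training_args = TrainingArguments(
    output_dir="ukraine-war-pov",  # assumption: output directory not stated in the card
    learning_rate=5e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=123,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=10,
    evaluation_strategy="epoch",   # assumption: the results table reports metrics once per epoch
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 matches the Trainer's
# default optimizer, so no extra optimizer configuration is needed.
```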

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
|:-------------:|:-----:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
| 0.284         | 1.0   | 1875  | 0.1850          | 0.9295   | 0.9295 | 0.9303    | 0.9295 |
| 0.2271        | 2.0   | 3750  | 0.1551          | 0.9405   | 0.9405 | 0.9414    | 0.9405 |
| 0.2064        | 3.0   | 5625  | 0.1734          | 0.9305   | 0.9305 | 0.9311    | 0.9305 |
| 0.1842        | 4.0   | 7500  | 0.1694          | 0.9315   | 0.9315 | 0.9317    | 0.9315 |
| 0.1628        | 5.0   | 9375  | 0.1838          | 0.9435   | 0.9435 | 0.9438    | 0.9435 |
| 0.1309        | 6.0   | 11250 | 0.2074          | 0.9395   | 0.9395 | 0.9395    | 0.9395 |
| 0.1017        | 7.0   | 13125 | 0.2659          | 0.9365   | 0.9365 | 0.9365    | 0.9365 |
| 0.0778        | 8.0   | 15000 | 0.2851          | 0.9400   | 0.9400 | 0.9400    | 0.9400 |
| 0.0664        | 9.0   | 16875 | 0.3238          | 0.9385   | 0.9385 | 0.9387    | 0.9385 |
| 0.066         | 10.0  | 18750 | 0.3092          | 0.9390   | 0.9390 | 0.9390    | 0.9390 |
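
The metric columns above can be produced by a `compute_metrics` callback passed to the `Trainer`, along the lines of the sketch below; the `"weighted"` averaging mode is an assumption not stated in the card (on a balanced two-class set the common averaging modes give nearly identical numbers).

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# Sketch of a per-epoch metrics callback; the averaging mode is an assumption.
def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="weighted"
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "f1": f1,
        "precision": precision,
        "recall": recall,
    }
```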

### Framework versions

- Transformers 4.27.4
- Pytorch 2.0.0+cu118
- Tokenizers 0.13.3