Sami92 committed on
Commit
47090c7
1 Parent(s): 8b9d06f

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -63,6 +63,11 @@ The second dataset is the human-annotated dataset that is used for training part
63
  [More Information Needed]
64
 
65
  ## Evaluation
 
 
 
 
 
66
  The figure below displays the model's performance and compares it to two benchmarks ([scores as csv](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv)). The first benchmark is the agreement between the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1, and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes. The notable exception is Foreign Trade and, to a lesser extent, Defence and Law and Crime. The second benchmark is the performance of partypress/partypress-multilingual, referred to as Party Press F1; the difference between it and the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par with or stronger than the other Party Press model. Overall, it achieves an F1 score that is .06 higher.
67
  ![](./scores.png)
68
 
 
63
  [More Information Needed]
64
 
65
  ## Evaluation
66
+
67
+ | Accuracy | Precision | Recall | F1 score |
68
+ |:--------:|:---------:|:-------:|:--------:|
69
+ | 0.72 | 0.72 | 0.72 | 0.72 |
70
+
71
  The figure below displays the model's performance and compares it to two benchmarks ([scores as csv](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv)). The first benchmark is the agreement between the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1, and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes. The notable exception is Foreign Trade and, to a lesser extent, Defence and Law and Crime. The second benchmark is the performance of partypress/partypress-multilingual, referred to as Party Press F1; the difference between it and the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par with or stronger than the other Party Press model. Overall, it achieves an F1 score that is .06 higher.
72
  ![](./scores.png)
73
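The Coder Difference and Party Press Difference described in the Evaluation text can be illustrated with a minimal sketch. Everything specific here is an assumption: the column names (`class`, `model_f1`, `coder_f1`, `party_press_f1`) are hypothetical and may not match the real `scores.csv`, the sample values are placeholders rather than reported results, and the sign convention (present model minus benchmark) is inferred from the wording, not confirmed by the README.

```python
import csv
import io

# Hypothetical excerpt: the column names and values are placeholders,
# NOT the contents of the real scores.csv.
SAMPLE_CSV = """class,model_f1,coder_f1,party_press_f1
Class A,0.80,0.85,0.75
Class B,0.60,0.70,0.65
"""


def add_differences(csv_text):
    """Return one dict per class with both difference columns.

    Assumed convention: difference = present model F1 - benchmark F1,
    so a positive value means the present model scores higher.
    """
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        model = float(row["model_f1"])
        coder = float(row["coder_f1"])
        party = float(row["party_press_f1"])
        rows.append({
            "class": row["class"],
            "coder_difference": round(model - coder, 2),
            "party_press_difference": round(model - party, 2),
        })
    return rows


for result in add_differences(SAMPLE_CSV):
    print(result)
```

To apply the same idea to the published scores, the text of the downloaded CSV could be passed in place of `SAMPLE_CSV`, after adjusting the column names to whatever the file actually uses.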