Update README.md
README.md
CHANGED
@@ -63,6 +63,11 @@ The second dataset is the human-annotated dataset that is used for training part
[More Information Needed]

## Evaluation
+
+| Accuracy | Precision | Recall | F1 score |
+|:--------:|:---------:|:-------:|:--------:|
+| 0.72 | 0.72 | 0.72 | 0.72 |
+
The figure below displays the model's performance and compares it to two benchmarks ([scores as csv](https://huggingface.co/Sami92/XLM-R-Large-PartyPress/blob/main/scores.csv)). The first benchmark is the agreement between the two coders per country (for details, see [Erfort et al. (2023)](https://journals.sagepub.com/doi/10.1177/20531680231183512)). It is referred to as Coder F1, and the difference between the model performance and the coder agreement is referred to as Coder Difference. The model comes close to the agreement of human coders in almost all classes. The notable exception is Foreign Trade and, to a lesser extent, Defence and Law and Crime. The second benchmark is the performance of partypress/partypress-multilingual, referred to as Party Press F1, and its difference to the present model is referred to as Party Press Difference. Except for Foreign Trade and Law and Crime, the present model is on par with or stronger than partypress/partypress-multilingual. Overall, it achieves an F1 score that is 0.06 higher.
`![](./scores.png)
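The two difference columns described above can be sketched as a per-class subtraction. This is only an illustration: the function name, the sign convention (model minus benchmark) and the numbers are assumptions, not values or column names taken from `scores.csv`.

```python
from typing import Dict


def f1_differences(
    model_f1: Dict[str, float], benchmark_f1: Dict[str, float]
) -> Dict[str, float]:
    """Return model F1 minus benchmark F1 for every class present in both inputs.

    Assumed sign convention: a negative value means the model scores
    below the benchmark for that class.
    """
    shared_labels = model_f1.keys() & benchmark_f1.keys()
    return {
        label: round(model_f1[label] - benchmark_f1[label], 2)
        for label in sorted(shared_labels)
    }


# Placeholder scores for illustration only; these are NOT the reported results.
model = {"Defence": 0.70, "Foreign Trade": 0.50}
coder = {"Defence": 0.75, "Foreign Trade": 0.65}

diffs = f1_differences(model, coder)
for label, diff in diffs.items():
    print(f"{label}: {diff:+.2f}")
```

The same function would apply to either benchmark: passing coder agreement scores gives something like Coder Difference, and passing the scores of partypress/partypress-multilingual gives something like Party Press Difference.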