Update README.md
README.md
CHANGED

Fine-tuned model in seven languages on texts from nine countries, based on [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased).

## Model description

The PARTYPRESS multilingual model builds on [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased) but adds a supervised component: it was fine-tuned on texts labeled by human coders. The labels indicate 23 different political issue categories derived from the Comparative Agendas Project (CAP).

## Model variations

We plan to release monolingual models for each of the languages covered by this multilingual model.

## Intended uses & limitations

The model is intended mainly for text classification of press releases from political parties. It may also be useful for other political texts.

### How to use

This model can be used directly with a pipeline for text classification:

```python
>>> from transformers import pipeline
>>> partypress = pipeline("text-classification", model="cornelius/partypress-multilingual", tokenizer="cornelius/partypress-multilingual")
>>> partypress("We urgently need to fight climate change and reduce carbon emissions. This is what our party stands for.")
```
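
By default, the pipeline returns only the top predicted category. A minimal sketch for inspecting the scores of all 23 categories, assuming the `top_k` option of recent `transformers` text-classification pipelines:

```python
>>> # top_k=None asks the pipeline to return a score for every label
>>> # rather than only the highest-scoring one (assumed pipeline option).
>>> partypress(
...     "We urgently need to fight climate change and reduce carbon emissions.",
...     top_k=None,
... )
```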

### Limitations and bias

The model was trained with data from parties in nine countries. For use in other countries, the model may need further fine-tuning; without it, performance may be lower.

The model may have biased predictions. We discuss some biases by country, by party, and over time in the release paper for the PARTYPRESS database.
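
As a rough sketch of such further fine-tuning, the example below continues training on a newly labeled corpus with the `transformers` Trainer; the CSV file, its columns, and the hyperparameters are assumptions for illustration, not part of this card:

```python
# Hypothetical sketch: further fine-tuning on press releases from an
# additional country. "press_releases.csv" with "text"/"label" columns
# is a placeholder, not a real dataset.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

model_id = "cornelius/partypress-multilingual"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

dataset = load_dataset("csv", data_files="press_releases.csv")["train"]

def tokenize(batch):
    # Truncate long press releases to the model's maximum input length.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="partypress-further-finetuned",
                           num_train_epochs=3),
    train_dataset=dataset,
)
trainer.train()
```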

## Training data

The PARTYPRESS multilingual model was fine-tuned on 27,243 press releases in seven languages from 68 European parties in nine countries. The press releases were labeled by two expert human coders per country.

For the training data of the underlying model, please refer to [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased).

## Training procedure

### Preprocessing

For the preprocessing, please refer to [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased).

### Pretraining

For the pretraining, please refer to [bert-base-multilingual-cased](https://huggingface.co/bert-base-multilingual-cased).

### Fine-tuning

## Evaluation results

Fine-tuned on our downstream task, this model achieves the following results in a five-fold cross-validation:

| Accuracy | Precision | Recall | F1 score |
|:--------:|:---------:|:------:|:--------:|
|  69.52   |   67.99   |  67.60 |   66.77  |
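
For reference, a sketch of how such cross-validated scores can be computed; the averaging mode (here weighted) and the placeholder data and training callable are assumptions, not taken from this card:

```python
# Sketch of five-fold cross-validated evaluation metrics.
# `texts` and `labels` are numpy arrays standing in for the labeled corpus.
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support
from sklearn.model_selection import StratifiedKFold

def cross_validate(texts, labels, train_and_predict):
    # train_and_predict is a placeholder: it should fine-tune a fresh
    # model on the training split and return predictions for the test split.
    scores = []
    skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
    for train_idx, test_idx in skf.split(texts, labels):
        y_pred = train_and_predict(texts[train_idx], labels[train_idx],
                                   texts[test_idx])
        p, r, f1, _ = precision_recall_fscore_support(
            labels[test_idx], y_pred, average="weighted", zero_division=0)
        scores.append((accuracy_score(labels[test_idx], y_pred), p, r, f1))
    # Mean accuracy, precision, recall, and F1 over the five folds.
    return np.mean(scores, axis=0)
```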

### BibTeX entry and citation info