---
license: cc-by-sa-4.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-classification
tags:
- partypress
- political science
- parties
- press releases
widget:
 - text: 'Farmers who applied for a Force Majeure when their businesses were impacted by severe flooding and landslides on 22 and 23 August 2017 can now apply for the one-off financial payment. “The extreme flooding event meant that the farming and wider rural communities in the North West experienced significant hardship. Farm businesses lost income due to the impact on their land and the cost of removing debris and silt, as well as reseeding to restore it back to productive use,” said Minister Poots. “So I am delighted to say that this North West 2017 Flooding Income Support Scheme, worth almost £2.7 million, is now open to applications. This is a time limited scheme which will close on 12 August 2021. “The one-off grant payment, which will be capped at £106,323 per farm business, is available for farmers who applied for a Force Majeure in respect of the flooding incident. “I would urge all eligible businesses to make sure their application is submitted as soon as possible,” Minister Poots added. Eligible farm businesses will receive a letter inviting them to apply for the support package, with instructions on how to access the application form and receive help to complete it. They must complete the application form available on DAERA Online Services from 28 July 2021. Explanatory information and guidance will also be published on the DAERA website. Further information on the scheme can be found on the DAERA website www.daera-ni.gov.uk'
---

# PARTYPRESS monolingual UK


Fine-tuned model, based on [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english). Used in [Erfort et al. (2023)](https://doi.org/10.1177/20531680231183512), building on the PARTYPRESS database. For the downstream task of classifying press releases from political parties into 23 unique policy areas, we achieve a performance comparable to expert human coders.



## Model description

The PARTYPRESS monolingual model builds on [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english) but adds a supervised component: it was fine-tuned on texts labeled by humans. The labels indicate 23 different political issue categories derived from the Comparative Agendas Project (CAP):

| Code | Issue |
|--|-------|
| 1 | Macroeconomics |
| 2 | Civil Rights |
| 3 | Health |
| 4 | Agriculture |
| 5 | Labor |
| 6 | Education |
| 7 | Environment |
| 8 | Energy |
| 9 | Immigration |
| 10 | Transportation |
| 12 | Law and Crime |
| 13 | Social Welfare |
| 14 | Housing |
| 15 | Domestic Commerce |
| 16 | Defense |
| 17 | Technology |
| 18 | Foreign Trade |
| 19.1 | International Affairs |
| 19.2 | European Union |
| 20 | Government Operations |
| 23 | Culture |
| 98 | Non-thematic |
| 99 | Other |
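
When working with predictions, it can help to have the table above available as a lookup. The following is a minimal sketch, not part of the released code: the names `CAP_ISSUES` and `issue_name` are hypothetical, and whether the model emits these codes, the issue names, or some other label string is an assumption that should be checked against the `id2label` entry in the model's configuration.

```python
# Issue codes and names copied from the table above.
# Codes are stored as strings because "19.1" and "19.2" are not integers.
CAP_ISSUES = {
    "1": "Macroeconomics",
    "2": "Civil Rights",
    "3": "Health",
    "4": "Agriculture",
    "5": "Labor",
    "6": "Education",
    "7": "Environment",
    "8": "Energy",
    "9": "Immigration",
    "10": "Transportation",
    "12": "Law and Crime",
    "13": "Social Welfare",
    "14": "Housing",
    "15": "Domestic Commerce",
    "16": "Defense",
    "17": "Technology",
    "18": "Foreign Trade",
    "19.1": "International Affairs",
    "19.2": "European Union",
    "20": "Government Operations",
    "23": "Culture",
    "98": "Non-thematic",
    "99": "Other",
}


def issue_name(code):
    """Return the issue name for a code such as 4, "4", or "19.1"."""
    return CAP_ISSUES.get(str(code), "Unknown")


print(issue_name(4))       # Agriculture
print(issue_name("19.2"))  # European Union
```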

## Model variations

There are several monolingual models for different countries, and a multilingual model. The multilingual model can be easily extended to other languages, country contexts, or time periods by fine-tuning it with minimal additional labeled texts.

## Intended uses & limitations

The main use of the model is for text classification of press releases from political parties. It may also be useful for other political texts.

The classification can then be used to measure which issues parties are discussing in their communication.
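
As an illustration of that measurement step, the sketch below turns a list of predictions into the share of press releases per issue. It assumes predictions arrive as dicts with a `label` key, which is the usual shape of text-classification pipeline output; the function name `issue_shares` and the example labels are hypothetical and may not match the label strings this model returns.

```python
from collections import Counter


def issue_shares(predictions):
    """Return the share of documents assigned to each predicted label.

    `predictions` is a list of dicts that each contain a "label" key.
    """
    counts = Counter(p["label"] for p in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders labels from most to least frequent.
    return {label: n / total for label, n in counts.most_common()}


# Hypothetical predictions for four press releases from one party.
example = [
    {"label": "Agriculture", "score": 0.91},
    {"label": "Health", "score": 0.77},
    {"label": "Agriculture", "score": 0.84},
    {"label": "Environment", "score": 0.69},
]

for label, share in issue_shares(example).items():
    print(f"{label}: {share:.2f}")
```

Grouping the predictions by party or by time period before calling the function would give issue attention per party or per period.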

### How to use

This model can be used directly with a pipeline for text classification:

```python
>>> from transformers import pipeline
>>> # Pad and truncate inputs to the model's 512-token limit.
>>> tokenizer_kwargs = {'padding': True, 'truncation': True, 'max_length': 512}
>>> partypress = pipeline("text-classification", model="cornelius/partypress-monolingual-uk", tokenizer="cornelius/partypress-monolingual-uk", **tokenizer_kwargs)
>>> partypress("Your text here.")
```

### Limitations and bias

The model was trained with data from parties in the UK. For use in other countries, the model may be further fine-tuned. Without further fine-tuning, the performance of the model may be lower.

The model may have biased predictions. We discuss some biases by country, party, and over time in the release paper for the PARTYPRESS database. For example, the performance is highest for press releases from the UK (75%) and lowest for Poland (55%).

## Training data

The PARTYPRESS monolingual UK model was fine-tuned with about 3,000 press releases from parties in the UK. The press releases were labeled by two expert human coders.

For the training data of the underlying model, please refer to [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).

## Training procedure

### Preprocessing

For the preprocessing, please refer to [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).

### Pretraining

For the pretraining, please refer to [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english).

### Fine-tuning

We fine-tuned the model using about 3,000 labeled press releases from political parties in the UK. 

#### Training Hyperparameters

The batch size was 12 for training and 2 for testing, and the model was trained for four epochs. All other hyperparameters were left at the defaults of the transformers library.


#### Framework versions

- Transformers 4.28.0
- TensorFlow 2.12.0
- Datasets 2.12.0
- Tokenizers 0.13.3


## Evaluation results

Fine-tuned on our downstream task and evaluated with five-fold cross-validation, this model achieves results comparable to the performance of our expert human coders. For the detailed results, please refer to [Erfort et al. (2023)](https://doi.org/10.1177/20531680231183512).
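
For readers unfamiliar with the evaluation design, the sketch below shows how five-fold cross-validation partitions a labeled set of about 3,000 documents. It is an illustration only, not the authors' evaluation code: the function name `k_fold_indices`, the shuffling, and the seed are assumptions, and whether the original splits were stratified by issue is not stated here.

```python
import random


def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)
    # Deal indices round-robin so fold sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


# Each document is held out exactly once across the five folds.
for train_idx, test_idx in k_fold_indices(3000, k=5):
    print(len(train_idx), len(test_idx))  # 2400 600 on every fold
```

In each fold, a model would be fine-tuned on the training indices and scored on the held-out indices, and the five scores would then be averaged.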

### BibTeX entry and citation info

```bibtex
@article{erfort_partypress_2023,
  author    = {Cornelius Erfort and
               Lukas F. Stoetzer and
               Heike Klüver},
  title     = {The PARTYPRESS Database: A new comparative database of parties’ press releases},
  journal   = {Research and Politics},
  volume    = {10},
  number    = {3},
  year      = {2023},
  doi       = {10.1177/20531680231183512},
  URL       = {https://doi.org/10.1177/20531680231183512}
}
```

Erfort, C., Stoetzer, L. F., & Klüver, H. (2023). The PARTYPRESS Database: A new comparative database of parties’ press releases. Research & Politics, 10(3). [https://doi.org/10.1177/20531680231183512](https://doi.org/10.1177/20531680231183512)



### Further resources

GitHub: [cornelius-erfort/partypress](https://github.com/cornelius-erfort/partypress)

Research and Politics Dataverse: [Replication Data for: The PARTYPRESS Database: A New Comparative Database of Parties’ Press Releases](https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi%3A10.7910%2FDVN%2FOINX7Q)



## Acknowledgements

Research for this contribution is part of the Cluster of Excellence "Contestations of the Liberal Script" (EXC 2055, Project-ID: 390715649), funded by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany's Excellence Strategy. Cornelius Erfort is moreover grateful for generous funding provided by the DFG through the Research Training Group DYNAMICS (GRK 2458/1).

## Contact

Cornelius Erfort

Humboldt-Universität zu Berlin

[corneliuserfort.de](https://corneliuserfort.de)