PopBERT

PopBERT is a model for detecting populist language in German political speeches from the German Bundestag. It is based on the deepset/gbert-large model: https://huggingface.co/deepset/gbert-large

It is a multilabel model trained on a manually curated dataset of sentences from the 18th and 19th legislative periods. In addition to capturing the foundational dimensions of populism, namely "anti-elitism" and "people-centrism," the model was also fine-tuned to identify the underlying ideological orientation as either "left-wing" or "right-wing."

Prediction

The model outputs a tensor of four probabilities. The table below maps each output index to its dimension.

Index  Dimension
0      Anti-Elitism
1      People-Centrism
2      Left-Wing Host-Ideology
3      Right-Wing Host-Ideology
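
For downstream code it can be convenient to keep this mapping as a list. The name DIMENSIONS below is purely illustrative and not part of the model files:

# illustrative mapping from output index to dimension name
DIMENSIONS = [
    "Anti-Elitism",              # index 0
    "People-Centrism",           # index 1
    "Left-Wing Host-Ideology",   # index 2
    "Right-Wing Host-Ideology",  # index 3
]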

Usage Example

import torch
from transformers import AutoModelForSequenceClassification
from transformers import AutoTokenizer

# load tokenizer
tokenizer = AutoTokenizer.from_pretrained("luerhard/PopBERT")

# load model
model = AutoModelForSequenceClassification.from_pretrained("luerhard/PopBERT")

# define text to be predicted
# (roughly: "This is class struggle from above, class struggle in the interest
# of the wealthy and propertied against the majority of taxpayers on this earth.")
text = (
    "Das ist Klassenkampf von oben, das ist Klassenkampf im Interesse von "
    "Vermögenden und Besitzenden gegen die Mehrheit der Steuerzahlerinnen und "
    "Steuerzahler auf dieser Erde."
)

# encode text with tokenizer
encodings = tokenizer(text, return_tensors="pt")

# predict
with torch.inference_mode():
    out = model(**encodings)

# get probabilities
probs = torch.sigmoid(out.logits)
print(probs.detach().numpy())
# prints approximately:
# [[0.8765146  0.34838045 0.983123   0.02148379]]

Performance

To maximize performance, it is recommended to use the following per-dimension decision thresholds (in the same order as the output indices above):

[0.415961, 0.295400, 0.429109, 0.302714]
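
As a sketch of how these thresholds could be applied (the tensor name THRESHOLDS is illustrative), the sigmoid probabilities from the usage example above can be binarized per dimension:

import torch

# per-dimension decision thresholds, in the same order as the output indices
THRESHOLDS = torch.tensor([0.415961, 0.295400, 0.429109, 0.302714])

# `probs` is the (1, 4) tensor of sigmoid probabilities from the usage example
labels = (probs >= THRESHOLDS).int()
print(labels.numpy())
# for the example sentence this yields [[1 1 1 0]]: anti-elitism,
# people-centrism and a left-wing host ideology are detected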

Using these thresholds, the model achieves the following performance on the test set:

Dimension            Precision  Recall  F1
Anti-Elitism         0.81       0.88    0.84
People-Centrism      0.70       0.73    0.71
Left-Wing Ideology   0.69       0.77    0.73
Right-Wing Ideology  0.68       0.66    0.67
micro avg            0.75       0.80    0.77
macro avg            0.72       0.76    0.74
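
For orientation, a report of this form can be reproduced with scikit-learn once gold labels and thresholded predictions are available for a labelled test set. The arrays below are placeholders, since the test data is not part of this model card:

import numpy as np
from sklearn.metrics import classification_report

# index-to-dimension mapping as introduced above
DIMENSIONS = [
    "Anti-Elitism",
    "People-Centrism",
    "Left-Wing Host-Ideology",
    "Right-Wing Host-Ideology",
]

# placeholder arrays; in practice these are the gold labels and the
# thresholded model predictions for a labelled test set
y_true = np.array([[1, 0, 1, 0], [0, 1, 0, 0]])
y_pred = np.array([[1, 1, 1, 0], [0, 1, 0, 0]])

print(classification_report(y_true, y_pred, target_names=DIMENSIONS, zero_division=0))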