
Identifying and Analysing political quotes from the Danish Parliament related to climate change using NLP

KlimaBERT is a sequence classifier fine-tuned to predict whether political quotes are climate-related. For the positive class (1, "climate-related"), the model achieves an F1-score of 0.97, a precision of 0.97, and a recall of 0.97. The negative class (0) is defined as "non-climate-related".
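As a quick sanity check on the reported metrics, the F1-score is the harmonic mean of precision and recall, so equal precision and recall of 0.97 should give an F1-score of 0.97. The helper name `f1_score` below is only illustrative and is not taken from the project's code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both metrics are 0
    return 2 * precision * recall / (precision + recall)

# Values reported above for the positive class (1, "climate-related").
print(round(f1_score(0.97, 0.97), 2))  # prints 0.97
```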

KlimaBERT is fine-tuned from the pre-trained DaBERT-uncased model on a training set of 1,000 manually labelled data points. The training set contains both political quotes and summaries of bills from the Danish Parliament.

The model was created to identify political quotes related to climate change and performs best on official texts from the Danish Parliament.
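A minimal sketch of how such a classifier might be used with the Hugging Face `transformers` library is shown below. It assumes the checkpoint exposes two output logits, with index 1 meaning "climate-related" as described above. The function names and the `model_id` argument are hypothetical: pass the identifier shown on this model's page, since it is not spelled out in this text.

```python
import math

# Label mapping as described in this model card.
ID2LABEL = {0: "non-climate-related", 1: "climate-related"}

def logits_to_prediction(logits):
    """Turn one row of raw logits into (label, probability) via softmax."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]

def classify(texts, model_id):
    """Classify a list of texts with a sequence-classification checkpoint.

    model_id is a placeholder for the Hub identifier of the model.
    Requires the `torch` and `transformers` packages.
    """
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return [logits_to_prediction(row.tolist()) for row in logits]
```

Calling `classify(["..."], model_id)` would return one `(label, probability)` pair per input text. Because the model performs best on official texts from the Danish Parliament, predictions on other kinds of text should be treated with more caution.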


To fine-tune a model similar to KlimaBERT, follow the fine-tuning notebooks.


BERT: Devlin, J., M.-W. Chang, K. Lee, and K. Toutanova (2018). BERT: Pre-training of deep bidirectional transformers for language understanding. https://arxiv.org/abs/1810.04805

DaBERT: Certainly (2021). Certainly has trained the most advanced Danish BERT model to date. https://www.certainly.io/blog/danish-bert-model/


These resources were created as part of my Master's thesis, so I would like to thank my supervisors Leon Derczynski and Vedran Sekara for their great support throughout the project! And a HUGE thanks to Gustav Gyrst for great sparring and co-development of the tools you find in this repo.


For any further help, questions, comments, etc., feel free to contact the author, Jonathan Kristensen, on LinkedIn or by creating a "discussion" on this model's page.
