Papers
arxiv:2301.04253

ClimaBench: A Benchmark Dataset For Climate Change Text Understanding in English

Published on Jan 11, 2023
Authors:
,

Abstract

The topic of Climate Change (CC) has received limited attention in NLP despite its real world urgency. Activists and policy-makers need NLP tools in order to effectively process the vast and rapidly growing textual data produced on CC. Their utility, however, primarily depends on whether the current state-of-the-art models can generalize across various tasks in the CC domain. In order to address this gap, we introduce Climate Change Benchmark (ClimaBench), a benchmark collection of existing disparate datasets for evaluating model performance across a diverse set of CC NLU tasks systematically. Further, we enhance the benchmark by releasing two large-scale labelled text classification and question-answering datasets curated from publicly available environmental disclosures. Lastly, we provide an analysis of several generic and CC-oriented models answering whether fine-tuning on domain text offers any improvements across these tasks. We hope this work provides a standard assessment tool for research on CC text data.

Community

Sign up or log in to comment

Models citing this paper 0

No model linking this paper

Cite arxiv.org/abs/2301.04253 in a model README.md to link it from this page.

Datasets citing this paper 1

Spaces citing this paper 0

No Space linking this paper

Cite arxiv.org/abs/2301.04253 in a Space README.md to link it from this page.

Collections including this paper 0

No Collection including this paper

Add this paper to a collection to link it from this page.