
Monolingual Dutch Models for Zero-Shot Text Classification

This family of Dutch models was fine-tuned on combined data from the (translated) SNLI and SICK-NL datasets. The models are intended for zero-shot classification in Dutch through Hugging Face pipelines.

The Models

Base Model    Hugging Face id (fine-tuned)
BERTje        LoicDL/bert-base-dutch-cased-finetuned-snli (this model)
RobBERT V2    LoicDL/robbert-v2-dutch-finetuned-snli
RobBERTje     LoicDL/robbertje-dutch-finetuned-snli

How to use

While this family of models can be used for evaluating (monolingual) NLI datasets, its primary intended use is zero-shot text classification in Dutch. In this setting, classification tasks are recast as NLI problems. Consider the following sentence pair, which can be used to simulate a sentiment classification problem:

  • Premise: The food in this place was horrendous
  • Hypothesis: This is a negative review

For more information on using Natural Language Inference models for zero-shot text classification, we refer to this paper.
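To make the recasting concrete, the sketch below builds one premise-hypothesis pair per candidate label by filling a template, which is roughly what a zero-shot pipeline does before it calls the NLI model. The helper name `build_nli_pairs` is hypothetical and is not part of any library; the text, labels, and template are the Dutch examples used elsewhere in this card.

```python
def build_nli_pairs(text, labels, template="Het sentiment van deze review is {}"):
    """Pair the input text (premise) with one hypothesis per candidate label.

    Hypothetical helper for illustration only; not a transformers function.
    """
    return [(text, template.format(label)) for label in labels]


pairs = build_nli_pairs(
    "Het eten in dit restaurant is heel lekker.",
    ["positief", "negatief", "neutraal"],
)

# Each pair is scored by the NLI model; the label whose hypothesis is
# most strongly entailed by the premise is taken as the prediction.
for premise, hypothesis in pairs:
    print(f"{premise!r} -> {hypothesis!r}")
```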

By default, all our models are fully compatible with the Hugging Face pipeline for zero-shot classification. They can be downloaded and used with the following code:

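from transformers import pipeline

# Load the fine-tuned BERTje model into a zero-shot classification pipeline.
classifier = pipeline(
    task="zero-shot-classification",
    model="LoicDL/bert-base-dutch-cased-finetuned-snli",
)

text_piece = "Het eten in dit restaurant is heel lekker."
labels = ["positief", "negatief", "neutraal"]
# Each label is inserted at {} to form one hypothesis per label.
template = "Het sentiment van deze review is {}"

# multi_label=False treats the labels as mutually exclusive, so the scores sum to 1
# (multi_label replaces the deprecated multi_class argument).
predictions = classifier(
    text_piece,
    labels,
    multi_label=False,
    hypothesis_template=template,
)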
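The pipeline returns a dictionary with the keys `sequence`, `labels`, and `scores`, where the labels are sorted by descending score. As a rough illustration of how such a result comes about in the single-label setting, the sketch below applies a softmax to per-label entailment logits and sorts the labels. The function name `rank_labels` is hypothetical and the logit values are invented for the example; they are not outputs of any of the models above.

```python
import math


def rank_labels(text, labels, entailment_logits):
    """Turn per-label entailment logits into a pipeline-style result dict.

    Hypothetical helper: softmax over the entailment logits of all labels,
    then sort the labels from most to least likely.
    """
    peak = max(entailment_logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in entailment_logits]
    total = sum(exps)
    scores = [e / total for e in exps]
    order = sorted(range(len(labels)), key=lambda i: scores[i], reverse=True)
    return {
        "sequence": text,
        "labels": [labels[i] for i in order],
        "scores": [scores[i] for i in order],
    }


# Invented logits, one per candidate label (assumption for illustration).
example = rank_labels(
    "Het eten in dit restaurant is heel lekker.",
    ["positief", "negatief", "neutraal"],
    [3.1, -2.4, 0.2],
)

# The top prediction is the first entry of the sorted lists.
best_label, best_score = example["labels"][0], example["scores"][0]
print(best_label, round(best_score, 3))
```

The same indexing (`predictions["labels"][0]` and `predictions["scores"][0]`) reads the top prediction from the real pipeline output.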

Model Performance

Performance on NLI task

Model                                   Accuracy [%]   F1 [%]
bert-base-dutch-cased-finetuned-snli    86.21          86.42
robbert-v2-dutch-finetuned-snli         87.61          88.02
robbertje-dutch-finetuned-snli          83.28          84.11

BibTeX entry and citation info

If you would like to use or cite our paper or model, feel free to use the following BibTeX code:

@article{DeLanghe_Maladry_Vanroy_DeBruyne_Singh_Lefever_2024,
  title={Benchmarking Zero-Shot Text Classification for Dutch},
  volume={13},
  url={https://www.clinjournal.org/clinj/article/view/172},
  journal={Computational Linguistics in the Netherlands Journal},
  author={De Langhe, Loic and Maladry, Aaron and Vanroy, Bram and De Bruyne, Luna and Singh, Pranaydeep and Lefever, Els and De Clercq, Orphée},
  year={2024},
  month={Mar.},
  pages={63--90}
}
Model size: 109M params (Safetensors; tensor types I64 and F32)