# Monolingual Dutch Models for Zero-Shot Text Classification

This family of Dutch models was fine-tuned on combined data from the (translated) SNLI (cite) and SICK-NL (cite) datasets. The models are intended for zero-shot classification in Dutch through Huggingface Pipelines.

## The Models

| Base Model | Huggingface id (fine-tuned) |
|-------------------|---------------------|
| [BERTje](https://huggingface.co/GroNLP/bert-base-dutch-cased) | this model |
| [RobBERT V2](http://github.com/iPieter/robbert) | robbert-v2-dutch-snli |
| [RobBERTje](https://github.com/iPieter/robbertje) | robbertje-dutch-nli |

## How to use

While this family of models can be used for evaluating (monolingual) NLI datasets, its primary intended use is zero-shot text classification in Dutch. In this setting, classification tasks are recast as NLI problems. Consider the following sentence pair, which simulates a sentiment classification problem:

- Premise: The food in this place was horrendous
- Hypothesis: This is a negative review

For more information on using Natural Language Inference models for zero-shot text classification, we refer to this(link) paper.

By default, all our models are fully compatible with the Huggingface pipeline for zero-shot classification. They can be downloaded and accessed through the following code:

```python
from transformers import pipeline

classifier = pipeline(
    task="zero-shot-classification",
    model="robbert-v2-dutch-base-snli",
)

# "The food in this restaurant is very tasty."
text_piece = "Het eten in dit restaurant is heel lekker."
labels = ["positief", "negatief", "neutraal"]
# "The sentiment of this review is {}"
template = "Het sentiment van deze review is {}"

predictions = classifier(
    text_piece,
    labels,
    multi_label=False,  # `multi_class` is deprecated in recent transformers versions
    hypothesis_template=template,
)
```

## Model Performance

### Performance on NLI task

| Model | Accuracy [%] | F1 [%] |
|-------------------|--------------------------|--------------|
| BERTje-nli | 92.157 | 90.898 |
| RobBERT-v2-nli | 93.096 | 91.279 |
| RobBERTje-nli | **97.816** | **97.514** |

## Credits and citation

TBD
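Under the hood, the pipeline fills each candidate label into the hypothesis template, runs the resulting premise/hypothesis pair through the NLI model, and (with `multi_label=False`) softmax-normalizes the per-label entailment scores into a single distribution over labels. A minimal sketch of that final normalization step, using hypothetical logit values rather than real model output:

```python
import math

def rank_labels(entailment_logits, labels):
    """Softmax per-label entailment logits into one probability
    distribution over the candidate labels, then rank them
    (this mirrors what multi_label=False does)."""
    exps = [math.exp(z) for z in entailment_logits]
    total = sum(exps)
    return sorted(
        ((label, e / total) for label, e in zip(labels, exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )

# Hypothetical entailment logits, one per candidate label.
labels = ["positief", "negatief", "neutraal"]
logits = [3.1, -1.4, 0.2]
ranked = rank_labels(logits, labels)
print(ranked[0][0])  # highest-probability label: "positief"
```

Because the scores are normalized across labels, exactly one label wins; with `multi_label=True` the pipeline instead scores each label independently against a "not entailed" alternative.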