GroNLP/bert_dutch_base_offensive_language

Fine-tuned model for detecting instances of offensive language in Dutch tweets. The model was trained on DALC v2.0.
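
A minimal sketch of how the model might be queried, assuming the checkpoint is published on the Hugging Face Hub under the identifier given in the title and that it loads with the standard `transformers` text-classification pipeline. Neither assumption is confirmed by this card. The helper names `load_classifier` and `pair_predictions` are hypothetical, and the label strings the model returns depend on its configuration.

```python
# Assumption: the Hub identifier matches the title of this card.
MODEL_ID = "GroNLP/bert_dutch_base_offensive_language"


def load_classifier(model_id=MODEL_ID):
    """Build a text-classification pipeline (downloads weights on first use)."""
    # Imported lazily so the rest of the module works without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def pair_predictions(texts, results):
    """Pair each input text with its predicted label and rounded score.

    `results` is the list of {"label": ..., "score": ...} dicts that a
    text-classification pipeline returns for a list of input strings.
    """
    return [
        (text, result["label"], round(result["score"], 4))
        for text, result in zip(texts, results)
    ]
```

With the weights available, `classifier = load_classifier()` followed by `pair_predictions(tweets, classifier(tweets))` would give one `(text, label, score)` tuple per tweet in the list `tweets`.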

The definition of offensive language is inherited from SemEval 2019 OffensEval: "Posts containing any form of non-acceptable language (profanity) or a targeted offence, which can be veiled or direct. This includes insults, threats, and posts containing profane language or swear words." (Zampieri et al., 2019)

The model achieves the following results on several test sets:

DALC held-out test set: macro F1: 79.93; F1 Offensive: 70.34
HateCheck-NL (functional benchmark for hate speech): Accuracy: 61.40; Accuracy non-hateful tests: 47.61; Accuracy hateful tests: 68.86
OP-NL (dynamic benchmark for offensive language): macro F1: 73.56
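
For readers unfamiliar with the metric, macro F1 is conventionally the unweighted mean of the per-class F1 scores, so the minority (offensive) class counts as much as the majority class. The sketch below illustrates that convention on toy data. The label names `OFF` and `NOT` are hypothetical, and this is not the evaluation script behind the figures above.

```python
def f1_for_label(y_true, y_pred, label):
    """F1 for one class: 2*TP / (2*TP + FP + FN), or 0.0 if undefined."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels that occur."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_label(y_true, y_pred, lab) for lab in labels) / len(labels)


# Toy example with hypothetical labels: OFF = offensive, NOT = not offensive.
gold = ["OFF", "OFF", "NOT", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "NOT", "OFF"]

f1_off = f1_for_label(gold, pred, "OFF")  # 0.5
f1_not = f1_for_label(gold, pred, "NOT")  # 0.75
score = macro_f1(gold, pred)              # 0.625
```

In the toy data the offensive class scores lower than the majority class, and the macro average sits halfway between the two regardless of how many examples each class has.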

More details on the training settings and pre-processing are available here