metadata
license: apache-2.0
language:
- nl
pipeline_tag: text-classification
Fine-tuned model for detecting instances of offensive language in Dutch tweets. The model has been trained on DALC v2.0.
The definition of offensive language is inherited from SemEval 2019 OffensEval: "Posts containing any form of non-acceptable language (profanity) or a targeted offence, which can be veiled or direct. This includes insults, threats, and posts containing profane language or swear words." (Zampieri et al., 2019)
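A minimal usage sketch, assuming the model is loaded through the `transformers` text-classification pipeline (matching the `pipeline_tag` above). The repository id `MODEL_ID` is a hypothetical placeholder, and the helper names `load_classifier` and `classify_tweets` are illustrative only. The label strings the model returns are not documented here, so check the model's configuration before interpreting them.

```python
from typing import Callable, Dict, List

# Hypothetical placeholder: replace with the actual Hub repository id of this model.
MODEL_ID = "your-namespace/your-dalc-offensive-model"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (needs `transformers` and a backend such as torch)."""
    # Imported lazily so that classify_tweets can be used and tested without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify_tweets(texts: List[str], classifier: Callable) -> List[Dict]:
    """Run the classifier on a batch of tweets and pair each text with its top prediction."""
    predictions = classifier(texts)  # one {"label": ..., "score": ...} dict per input text
    return [
        {"text": text, "label": pred["label"], "score": pred["score"]}
        for text, pred in zip(texts, predictions)
    ]
```

With a real repository id in place, `classify_tweets(["Wat een mooie dag!"], load_classifier())` would return one dictionary per tweet containing the predicted label and its score.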
The model achieves the following results on multiple test sets:
- DALC held-out test set: macro F1: 79.93; F1 Offensive: 70.34
- HateCheck-NL (functional benchmark for hate speech): accuracy: 61.40; accuracy on non-hateful tests: 47.61; accuracy on hateful tests: 68.86
- OP-NL (dynamic benchmark for offensive language): macro F1: 73.56
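
For readers who want to compare their own predictions against the scores above, the following is a small sketch of how macro F1 and a per-class F1 (such as "F1 Offensive") are commonly computed. It is not the authors' evaluation script. The label strings `"OFF"` and `"NOT"` and the toy data are assumptions for illustration, and the scores in the list are reported on a 0-100 scale while this code returns values between 0 and 1.

```python
from typing import Hashable, Sequence


def f1_for_label(gold: Sequence[Hashable], pred: Sequence[Hashable], label: Hashable) -> float:
    """F1 for a single class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(gold: Sequence[Hashable], pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores over all labels that occur."""
    labels = sorted(set(gold) | set(pred))
    if not labels:
        raise ValueError("gold and pred must not be empty")
    return sum(f1_for_label(gold, pred, label) for label in labels) / len(labels)


# Toy data with assumed label names, not taken from DALC.
gold = ["OFF", "OFF", "NOT", "NOT", "NOT"]
pred = ["OFF", "NOT", "NOT", "NOT", "OFF"]
offensive_f1 = f1_for_label(gold, pred, "OFF")
overall_macro_f1 = macro_f1(gold, pred)
```

Because macro F1 averages the classes without weighting by frequency, a weaker score on the less frequent class pulls the overall figure down more than it would under plain accuracy.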
More details on the training settings and pre-processing are available here.