Hebrew Corpus

This corpus contains offensive language in Hebrew. The data includes 15,881 tweets, each labeled with one or more of five classes (abusive, hate, violence, pornographic, or non-offensive). The tweets were annotated manually by Arabic-Hebrew bilingual speakers.

https://arxiv.org/abs/2309.02724
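
Because a tweet can carry more than one of the five classes, a multi-hot vector is a common way to represent its labels for training. The sketch below is only an illustration: the class names come from the description above, but the ordering, the helper names `encode_labels` and `decode_labels`, and the assumption that labels arrive as a list of strings are hypothetical and not taken from the repository's data files.

```python
# Class names as listed in the corpus description; the order is an arbitrary
# choice made for this sketch, not something defined by the corpus.
CLASSES = ["abusive", "hate", "violence", "pornographic", "non-offensive"]


def encode_labels(labels):
    """Return a multi-hot list with a 1 for every class present in `labels`."""
    unknown = set(labels) - set(CLASSES)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1 if name in labels else 0 for name in CLASSES]


def decode_labels(vector):
    """Map a multi-hot list back to the class names it marks."""
    return [name for name, flag in zip(CLASSES, vector) if flag]


if __name__ == "__main__":
    # Hypothetical example: a tweet annotated as both hate and violence.
    vector = encode_labels(["hate", "violence"])
    print(vector)
    print(decode_labels(vector))
```

The actual label format in the data files may differ, so the parsing step would need to be adapted after inspecting the downloaded files.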

Models

AlephBERT (https://huggingface.co/imvladikon/sentence-transformers-alephbert)

GitHub Repository

git clone https://github.com/SinaLab/OffensiveHebrew

You can download the data from the following GitHub link:

https://github.com/SinaLab/OffensiveHebrew/tree/main/data
