RoBERTa | Hate detection


Overview

This is a fine-tuned version of RoBERTa base for detecting extreme hate (i.e., extreme racism, homophobia, and suicide incitement).

This model is actively being improved and used in the AetherJr Discord bot.

Training Info

This model was trained on 121753 messages for 3 epochs on an NVIDIA RTX 3080 (10 GB).

  • learning_rate: 3e-5

  • train_batch_size: 8

  • F1: 0.9423660

  • Recall: 0.9521694

  • Total Offensive/Bad: 1456

  • Total Normal: 120299
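
The class counts above imply a heavily imbalanced dataset, which is one reason F1 and recall are more informative here than plain accuracy. The short sketch below only redoes that arithmetic from the two reported counts; the variable names are illustrative and not part of the training code. Note that the two counts sum to 121755, two more than the message total quoted above.

```python
# Class counts as reported in the training info above.
offensive = 1456
normal = 120299

total = offensive + normal
offensive_share = offensive / total

# Accuracy of a trivial classifier that always predicts "normal".
# Any useful model has to be judged against this baseline.
majority_baseline_accuracy = normal / total

print(f"total messages: {total}")
print(f"offensive share: {offensive_share:.2%}")
print(f"always-normal accuracy: {majority_baseline_accuracy:.2%}")
```

Because offensive messages make up only a small fraction of the data, a model could score very high accuracy while missing every one of them, so recall on the offensive class is the number to watch.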

Dataset used

This model was trained on a collection of Discord messages obtained from various public chats across multiple servers, with the explicit permission of the server owners. The dataset was chosen for its relevance to the model's objective of detecting extreme racism, homophobia, and suicide incitement in online communications. To ensure privacy, all data has been anonymized, with personal identifiers and links removed.

The dataset used for this model will not be publicly released.

Limitations and Bias

The dataset comes primarily from English-speaking servers with intense hate content, so the model may struggle to recognize and understand hate expressed in other languages.

In addition, the hate data consists mostly of slurs and short (2-3 sentence) hateful messages, and is directed largely at Black and gay people. As a result, the model may not recognize more generic or subtle forms of hate.

Inference

The simplest way to use this model is through a pipeline.
⚠️ Make sure to pre-process your input text by removing links and IDs, because the model has a maximum input length of 512 tokens. Alternatively, you can split the input into segments of at most 512 tokens, score each segment, and combine the scores however suits your use case.

import torch
from transformers import pipeline

# Use the first GPU when available, otherwise fall back to the CPU.
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'

# Add `top_k=None` if you want the scores for all labels to be returned.
classifier = pipeline(task="text-classification", model="imaether/roberta-offensive-junior", device=device)

in_text = "this is a very cool and real example!"

model_output = classifier(in_text)
print(model_output)
# Output: [{'label': 'normal', 'score': 0.9999141693115234}]
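
The pre-processing and splitting advice above can be sketched roughly as follows. This is a minimal illustration, not the pipeline used by the bot: the regular expressions, the helper names (`clean_text`, `split_into_chunks`, `min_label_score`), and the choice to aggregate by taking the lowest `normal` score are all assumptions. The chunk size of 510 leaves room for the two special tokens RoBERTa adds around each input.

```python
import re
from typing import Dict, List

# Assumed patterns (not from the model card): URLs, Discord markup such as
# <@123>, <@!123>, <@&123>, <#123>, <:name:123>, and bare 17-20 digit IDs.
URL_RE = re.compile(r"https?://\S+")
DISCORD_TAG_RE = re.compile(r"<(?:@[!&]?|#|a?:\w+:)\d+>")
SNOWFLAKE_RE = re.compile(r"\b\d{17,20}\b")


def clean_text(text: str) -> str:
    """Strip links and IDs, then collapse the leftover whitespace."""
    # Tags are removed before bare IDs so no empty "<@ >" shells remain.
    for pattern in (URL_RE, DISCORD_TAG_RE, SNOWFLAKE_RE):
        text = pattern.sub(" ", text)
    return " ".join(text.split())


def split_into_chunks(token_ids: List[int], max_len: int = 510) -> List[List[int]]:
    """Split a token sequence into consecutive pieces of at most max_len tokens."""
    return [token_ids[i:i + max_len] for i in range(0, len(token_ids), max_len)]


def min_label_score(chunk_outputs: List[List[Dict]], label: str = "normal") -> float:
    """Return the lowest score the given label received across all chunks.

    With label="normal", a low value means at least one chunk looked
    suspicious, which is a conservative way to combine per-chunk scores.
    """
    return min(
        item["score"]
        for output in chunk_outputs
        for item in output
        if item["label"] == label
    )
```

One possible way to wire this up, again as an assumption: tokenize the cleaned text with `classifier.tokenizer(text, add_special_tokens=False)["input_ids"]`, pass the IDs through `split_into_chunks`, decode each chunk back to a string with `classifier.tokenizer.decode`, and call `classifier(chunks, top_k=None, truncation=True)` to get one list of label scores per chunk for `min_label_score`. Averaging the scores instead of taking the minimum is a reasonable alternative when a single borderline chunk should not decide the result.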