---
title: Hate Speech Text Classifier
emoji: 👁
colorFrom: green
colorTo: red
sdk: gradio
sdk_version: 4.21.0
app_file: app.py
pinned: false
---

# Monitoring Harmful Text in Online Platforms

## Overview
This repository hosts the RandomForest classifier model designed for detecting harmful text AGAINST GROUPS. 
The model classifies text into one of three categories: "Offensive or Hateful", "Neutral or Ambiguous", and "Not Hate". 
Achieving an accuracy of 92.5%, this model was developed through the combination of three distinct datasets, ensuring robustness and reliability in varied contexts. 
It was presented at the prestigious annual Gulf Coast Conference & Expo on AI.

Model Details
Model Type: RandomForest Classifier
Accuracy: 92.5%
Labels:
0: Neutral or Ambiguous
1: Not Hate
2: Offensive or Hateful
Training Data: Augmented version of [this dataset](TLeonidas/twitter-hate-speech-en-240ksamples) (279k+ rows)

## Dataset
The model was trained on a concatenated dataset from three distinct sources, curated to encompass a wide range of linguistic expressions and contexts. 
The datasets were preprocessed and balanced to ensure fair representation across all categories.

## Performance
The model achieves an accuracy of 92.5%, evaluated on a held-out test set.

## Contributing
We welcome contributions to improve the model and extend its applicability.