---
license: afl-3.0
language:
- en
- te
metrics:
- accuracy
pipeline_tag: text-classification
library_name: transformers
tags:
- toxic-comment-classification
- roberta
- text-classification
---
# Toxic Comment Classification Using RoBERTa

## Overview

This project provides a toxic comment classification model based on RoBERTa (Robustly Optimized BERT Pretraining Approach). The model classifies comments as toxic or non-toxic and is intended to help moderate online discussions and improve community interactions.

## Model Details

- **Model Name**: RoBERTa for Toxic Comment Classification
- **Architecture**: RoBERTa
- **Fine-tuning Task**: Binary classification (toxic vs. non-toxic)
- **Evaluation Metrics**:
  - Accuracy
  - F1 Score
  - Precision
  - Recall

## Files

- `pytorch_model.bin`: The trained model weights.
- `config.json`: Model configuration file.
- `merges.txt`: BPE tokenizer merges file.
- `model.safetensors`: Model weights in safetensors format.
- `special_tokens_map.json`: Tokenizer special tokens mapping.
- `tokenizer_config.json`: Tokenizer configuration file.
- `vocab.json`: Tokenizer vocabulary file.
- `roberta-toxic-comment-classifier.pkl`: Serialized best model state dictionary (for PyTorch).
- `README.md`: This documentation file.

## Model Performance

- **Accuracy**: 0.9599
- **F1 Score**: 0.9615
- **Precision**: 0.9646
- **Recall**: 0.9599
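
As a rough illustration of how the four metrics above relate to each other, here is a minimal sketch that computes them from binary labels in plain Python. The labels are toy values and do not come from this model's evaluation set. The function name `binary_metrics` and the convention that `1` means toxic are assumptions made for the example. The averaging mode behind the reported numbers is not stated, so this sketch will not necessarily reproduce them.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels.

    `positive` is the label treated as the toxic class (an assumption here).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = len(pairs)
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy data for illustration only: 1 = toxic, 0 = non-toxic.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In practice, `sklearn.metrics.accuracy_score` and `sklearn.metrics.precision_recall_fscore_support` cover the same ground and support weighted or macro averaging.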

## Load the Model
```python
from transformers import pipeline

# Load the model and tokenizer
model_name = "prabhaskenche/pk-toxic-comment-classification-using-RoBERTa"
classifier = pipeline("text-classification", model=model_name)

# Example usage
text = "You're the worst person I've ever met."
result = classifier(text)
print(result)

```
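
A text-classification pipeline returns a list of dictionaries, each with a `label` and a `score`. The sketch below shows one possible way to turn such a result into a readable decision. The mapping from `LABEL_0`/`LABEL_1` to class names is an assumption (check `id2label` in `config.json` for the actual names), and the helper `interpret`, its `threshold` parameter, and the sample scores are hypothetical.

```python
# Hypothetical label mapping; verify against `id2label` in config.json.
LABEL_NAMES = {"LABEL_0": "non-toxic", "LABEL_1": "toxic"}


def interpret(prediction, threshold=0.5, label_names=LABEL_NAMES):
    """Convert one pipeline prediction dict into a readable decision.

    Unknown labels are passed through unchanged so nothing is silently
    relabeled.
    """
    label = label_names.get(prediction["label"], prediction["label"])
    score = prediction["score"]
    return {"label": label, "score": score, "confident": score >= threshold}


# Made-up sample in the shape the pipeline returns; not real model output.
sample_result = [{"label": "LABEL_1", "score": 0.98}]
decisions = [interpret(p) for p in sample_result]
print(decisions)
```

Raising `threshold` makes the helper flag fewer predictions as confident, which can be useful if low-confidence cases should go to human review.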

## Usage

### Installation

Install the required packages:

```bash
pip install torch transformers scikit-learn