---
license: afl-3.0
language:
  - en
  - te
metrics:
  - accuracy
pipeline_tag: text-classification
library_name: transformers
tags:
  - toxic-comment-classification
  - roberta
  - text-classification
---

# Toxic Comment Classification Using RoBERTa

## Overview

This project provides a toxic comment classification model based on RoBERTa (a Robustly Optimized BERT Pretraining Approach). The model classifies comments as toxic or non-toxic, helping to moderate online discussions and improve community interactions.

## Model Details

- **Model Name:** RoBERTa for Toxic Comment Classification
- **Architecture:** RoBERTa
- **Fine-tuning Task:** Binary classification (toxic vs. non-toxic); see the label-setup sketch after this list
- **Evaluation Metrics:**
  - Accuracy
  - F1 Score
  - Precision
  - Recall
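
As a concrete illustration of the binary setup described above, the sketch below configures a RoBERTa sequence-classification head. The `roberta-base` starting checkpoint and the id-to-label mapping are assumptions for illustration only; the repository's `config.json` holds the actual values.

```python
# A minimal sketch of the label setup implied above. The id2label mapping
# and the roberta-base starting checkpoint are assumptions, not values
# read from this repository's config.json.
from transformers import RobertaConfig, RobertaForSequenceClassification

config = RobertaConfig.from_pretrained(
    "roberta-base",
    num_labels=2,
    id2label={0: "non-toxic", 1: "toxic"},  # assumed mapping
    label2id={"non-toxic": 0, "toxic": 1},
)
model = RobertaForSequenceClassification(config)  # randomly initialized head, for illustration
```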

## Files

- `pytorch_model.bin`: The trained model weights.
- `config.json`: Model configuration file.
- `merges.txt`: BPE tokenizer merges file.
- `model.safetensors`: Model weights in safetensors format.
- `special_tokens_map.json`: Tokenizer special-tokens mapping.
- `tokenizer_config.json`: Tokenizer configuration file.
- `vocab.json`: Tokenizer vocabulary file.
- `roberta-toxic-comment-classifier.pkl`: Serialized best-model state dictionary (PyTorch); see the loading sketch after this list.
- `README.md`: This documentation file.
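
With a local copy of these files, the `.pkl` state dictionary can be restored as sketched below. This assumes the file was saved with `torch.save(model.state_dict(), ...)` and that the bundled `config.json` sits in the current directory; adjust paths as needed.

```python
# A minimal sketch, assuming roberta-toxic-comment-classifier.pkl holds a
# state dict saved via torch.save(model.state_dict(), ...). Paths are
# placeholders for a local copy of this repository.
import torch
from transformers import RobertaConfig, RobertaForSequenceClassification

config = RobertaConfig.from_pretrained(".")        # bundled config.json
model = RobertaForSequenceClassification(config)   # architecture only, no weights yet
state_dict = torch.load("roberta-toxic-comment-classifier.pkl", map_location="cpu")
model.load_state_dict(state_dict)
model.eval()
```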

## Model Performance

- **Accuracy:** 0.9599
- **F1 Score:** 0.9615
- **Precision:** 0.9646
- **Recall:** 0.9599
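
For reference, the sketch below shows how scores like those above are typically computed with scikit-learn. The toy labels and the weighted averaging are assumptions for illustration, not the actual evaluation code.

```python
# A toy sketch of computing the metrics above with scikit-learn.
# y_true/y_pred are hypothetical; weighted averaging is an assumption
# about how the reported F1/precision/recall were aggregated.
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score

y_true = [1, 0, 1, 1, 0]  # hypothetical gold labels (1 = toxic)
y_pred = [1, 0, 1, 0, 0]  # hypothetical model predictions

print("Accuracy :", accuracy_score(y_true, y_pred))
print("F1       :", f1_score(y_true, y_pred, average="weighted"))
print("Precision:", precision_score(y_true, y_pred, average="weighted"))
print("Recall   :", recall_score(y_true, y_pred, average="weighted"))
```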

## Load the model

```python
from transformers import pipeline

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub
model_name = "prabhaskenche/pk-toxic-comment-classification-using-RoBERTa"
classifier = pipeline("text-classification", model=model_name)

# Example usage
text = "You're the worst person I've ever met."
result = classifier(text)
print(result)
```
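
If you need raw class probabilities rather than the pipeline's top label, an explicit forward pass looks like the sketch below. The label names come from the model's own `id2label` config; if that mapping was not set during training, they may appear as the generic `LABEL_0`/`LABEL_1`.

```python
# A minimal sketch of an explicit forward pass with softmax probabilities.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "prabhaskenche/pk-toxic-comment-classification-using-RoBERTa"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()

inputs = tokenizer("You're the worst person I've ever met.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1).squeeze().tolist()
print({model.config.id2label[i]: round(p, 4) for i, p in enumerate(probs)})
```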

## Usage

### Installation

Install the required packages:

```bash
pip install torch transformers scikit-learn
```