Multi-label classification model trained on the Toxic Comment dataset from the Kaggle Jigsaw Toxic Comment Classification Challenge (https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge/data).
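
The Kaggle training file is a CSV with a `comment_text` column and one binary column per label. A minimal loading sketch (the `train.csv` path is assumed from the Kaggle download, not shipped with this repo):

```python
import pandas as pd

# "train.csv" comes from the Kaggle download; the path is an assumption.
df = pd.read_csv("train.csv")
label_cols = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]
texts = df["comment_text"].tolist()
targets = df[label_cols].values.astype("float32")  # multi-hot targets, one 0/1 flag per label
```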

Fine-tuned from DistilBERT.
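
The training code is not included here; below is a minimal sketch of how a multi-label head can be attached to DistilBERT with the `transformers` library. The base checkpoint (`distilbert-base-uncased`) and the example batch are assumptions, not the exact recipe used for this model.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# Assumed base checkpoint; problem_type makes the model use BCEWithLogitsLoss,
# so each label is scored independently (multi-label, not softmax multi-class).
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",
    num_labels=len(labels),
    problem_type="multi_label_classification",
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# One illustrative training step: targets are float multi-hot vectors.
batch = tokenizer(["example comment"], truncation=True, padding=True, return_tensors="pt")
targets = torch.tensor([[1.0, 0.0, 0.0, 0.0, 0.0, 0.0]])
loss = model(**batch, labels=targets).loss
loss.backward()
```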

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and tokenizer from the Hub.
model = AutoModelForSequenceClassification.from_pretrained("ac8736/toxic-tweets-fine-tuned-distilbert")
tokenizer = AutoTokenizer.from_pretrained("ac8736/toxic-tweets-fine-tuned-distilbert")

X_train = ["Why is Owen's retirement from football not mentioned? He hasn't played a game since 2005."]
batch = tokenizer(X_train, truncation=True, padding="max_length", return_tensors="pt")
labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

with torch.no_grad():
    outputs = model(**batch)
    # Sigmoid maps each logit to an independent probability; scale to percent.
    probs = (torch.sigmoid(outputs.logits) * 100)[0].tolist()
    for label, prob in zip(labels, probs):
        print(f"{label}: {round(prob, 3)}%")
```

Expected output:

```
toxic: 0.676%
severe_toxic: 0.001%
obscene: 0.098%
threat: 0.007%
insult: 0.021%
identity_hate: 0.004%
```
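
Because the labels are scored independently, turning probabilities into label predictions means thresholding each score separately. A minimal sketch continuing from the snippet above (reuses `model`, `tokenizer`, and `labels`); the 0.5 cutoff is an assumption, not a threshold tuned for this model:

```python
import torch

# 0.5 per label is a conventional default; per-label thresholds can be
# tuned on a validation set for better F1.
with torch.no_grad():
    logits = model(**tokenizer(["you are an idiot"], return_tensors="pt")).logits
probs = torch.sigmoid(logits)[0]
predicted = [label for label, p in zip(labels, probs) if p.item() > 0.5]
print(predicted if predicted else "no labels above threshold")
```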