Multilabel classification model trained on the Toxic Comments dataset from Kaggle (https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge/data).
Fine-tuned from DistilBERT, the model predicts each of the six toxicity labels independently.
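The training script is not part of this card. As a minimal sketch of how such a multilabel head can be fine-tuned from DistilBERT, the snippet below uses `problem_type="multi_label_classification"` so that `transformers` applies a binary cross-entropy loss per label; the base checkpoint `distilbert-base-uncased`, the optimizer settings, and the toy batch are assumptions, not the actual training configuration.

```python
# Assumed fine-tuning sketch -- not the original training script.
import torch
from torch.optim import AdamW
from transformers import AutoModelForSequenceClassification, AutoTokenizer

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

# problem_type="multi_label_classification" makes the model use BCEWithLogitsLoss,
# so each label gets an independent sigmoid probability.
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased",  # assumed base checkpoint
    num_labels=len(labels),
    problem_type="multi_label_classification",
)
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")

# Toy training batch: one comment with its multi-hot label vector (floats required).
texts = ["You are a wonderful person."]
targets = torch.tensor([[0., 0., 0., 0., 0., 0.]])  # no toxicity flags

batch = tokenizer(texts, truncation=True, padding=True, return_tensors="pt")
optimizer = AdamW(model.parameters(), lr=5e-5)  # assumed learning rate

model.train()
outputs = model(**batch, labels=targets)  # loss is BCE over the 6 labels
outputs.loss.backward()
optimizer.step()
```

Inference with the fine-tuned model then looks like this: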
```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

# Load the fine-tuned model and its tokenizer from local directories.
model = AutoModelForSequenceClassification.from_pretrained("pretrained_model")
tokenizer = AutoTokenizer.from_pretrained("model_tokenizer")

X_train = ["Why is Owen's retirement from football not mentioned? He hasn't played a game since 2005."]
batch = tokenizer(X_train, truncation=True, padding='max_length', return_tensors="pt")

labels = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]

with torch.no_grad():
    outputs = model(**batch)
    # Multilabel: apply a sigmoid to each logit independently and express it as a percentage.
    predictions = torch.sigmoid(outputs.logits) * 100
    probs = predictions[0].tolist()

for label, prob in zip(labels, probs):
    print(f"{label}: {round(prob, 3)}%")
```
Expected output:

```text
toxic: 0.676%
severe_toxic: 0.001%
obscene: 0.098%
threat: 0.007%
insult: 0.021%
identity_hate: 0.004%
```
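Because the six scores come from independent sigmoids, they do not sum to 100%; a comment is flagged for every label whose score clears a decision threshold. Continuing from the inference snippet above (reusing `labels` and `probs`), here is a small sketch with an assumed 50% cutoff, which you would normally tune per label on validation data:

```python
# The 50% cutoff is an assumption, not part of the model;
# tune it per label against validation data.
THRESHOLD = 50.0

flagged = [label for label, prob in zip(labels, probs) if prob >= THRESHOLD]
print(flagged if flagged else "no toxicity labels predicted")
```

For the example comment above, no label clears the threshold, so it would be treated as clean.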