metadata

license: creativeml-openrail-m
language: en
tags:
  - distilroberta
  - sentiment
  - NSFW
  - inappropriate
  - spam
  - twitter
  - reddit
widget:
  - text: I like you. You remind me of me when I was young and stupid.
  - text: I see you’ve set aside this special time to humiliate yourself in public.
  - text: Have a great weekend! See you next week!

Fine-tuned DistilBERT for NSFW Inappropriate Text Classification

Model Description

DistilBERT is a distilled transformer language model that can be fine-tuned for text classification tasks such as sentiment analysis. I fine-tuned it on Reddit posts to classify not safe for work (NSFW) content, specifically text that is considered inappropriate and unprofessional. The model predicts one of two classes: NSFW or safe for work (SFW).

The model was fine-tuned on 19,604 Reddit posts drawn from the [Comprehensive Abusiveness Detection Dataset](https://aclanthology.org/2021.conll-1.43/).

How to Use

from transformers import pipeline
classifier = pipeline("sentiment-analysis", model="michellejieli/inappropriate_text_classifier")
classifier("I see you’ve set aside this special time to humiliate yourself in public.")

Output:
[{'label': 'NSFW', 'score': 0.9684491753578186}]
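
To screen several texts at once, the pipeline output can be post-processed with a small helper. The following is a minimal sketch, not part of the released model: the names `flag_nsfw` and `demo` and the 0.9 score threshold are illustrative assumptions, and it relies on the pipeline returning one `{'label', 'score'}` dictionary per input text, as in the output shown above.

```python
from typing import Any, Callable, Dict, List, Sequence

Prediction = Dict[str, Any]


def flag_nsfw(
    texts: Sequence[str],
    classifier: Callable[[List[str]], List[Prediction]],
    threshold: float = 0.9,  # assumed cut-off; tune it for your own use case
) -> List[bool]:
    """Return True for each text labelled NSFW with a score >= threshold."""
    predictions = classifier(list(texts))
    return [
        p["label"] == "NSFW" and float(p["score"]) >= threshold
        for p in predictions
    ]


def demo() -> None:
    """Run the helper against the real model (downloads weights on first use)."""
    # Imported here so flag_nsfw can be used and tested without transformers.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="michellejieli/inappropriate_text_classifier",
    )
    texts = [
        "I see you’ve set aside this special time to humiliate yourself in public.",
        "Have a great weekend! See you next week!",
    ]
    for text, flagged in zip(texts, flag_nsfw(texts, classifier)):
        print("NSFW" if flagged else "ok", "|", text)
```

Calling `demo()` loads the model and prints one line per text. Raising the threshold flags fewer texts and lowering it flags more, so the value is worth checking against your own data.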

Contact

Please reach out to michelle.li851@duke.edu if you have any questions or feedback.

Reference

Hoyun Song, Soo Hyun Ryu, Huije Lee, and Jong Park. 2021. A Large-scale Comprehensive Abusiveness Detection Dataset with Multifaceted Labels from Reddit. In Proceedings of the 25th Conference on Computational Natural Language Learning, pages 552–561, Online. Association for Computational Linguistics.