metadata

language:
  - en
license: mit
datasets:
  - cardiffnlp/x_sensitive
metrics:
  - f1
widget:
  - text: Call me today to earn some money mofos!
pipeline_tag: text-classification

twitter-roberta-large-sensitive-multilabel

This is a RoBERTa-large model pretrained on 154M tweets posted up to the end of December 2022 and fine-tuned to detect sensitive content (multilabel classification) on the X-Sensitive dataset. The original Twitter-based RoBERTa model can be found here.

Labels

"id2label": {
  "0": "conflictual",
  "1": "profanity",
  "2": "sex",
  "3": "drugs",
  "4": "selfharm",
  "5": "spam"
}
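
Because the task is multilabel, each label receives its own score, so several labels can be active for the same tweet and the scores need not sum to 1. The following is a minimal sketch of how one logit per class could be mapped to per-label scores. It assumes a per-label sigmoid, which is the usual choice for multilabel heads but is not stated in this card, and the logits are made up for illustration.

```python
import math

# Label mapping copied from the "id2label" block above.
ID2LABEL = {
    0: "conflictual",
    1: "profanity",
    2: "sex",
    3: "drugs",
    4: "selfharm",
    5: "spam",
}


def sigmoid(x: float) -> float:
    """Logistic function, applied to each logit independently."""
    return 1.0 / (1.0 + math.exp(-x))


def scores_from_logits(logits):
    """Map one logit per class to a {label: score} dictionary."""
    if len(logits) != len(ID2LABEL):
        raise ValueError("expected one logit per label")
    return {ID2LABEL[i]: sigmoid(z) for i, z in enumerate(logits)}


# Hypothetical logits for illustration only (not produced by the model).
example_logits = [-5.5, 7.4, -5.0, -5.9, -6.2, 4.9]
example_scores = scores_from_logits(example_logits)
```

Since every label is scored on its own, two labels (here the hypothetical "profanity" and "spam" logits) can both end up close to 1.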

Full classification example

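from transformers import pipeline

# Load the fine-tuned multilabel classifier from the Hugging Face Hub.
pipe = pipeline('text-classification', model='cardiffnlp/twitter-roberta-large-sensitive-multilabel')
text = "Call me today to earn some money mofos!"

# Score the text against the sensitive-content labels.
pipe(text)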

Output:

[[{'label': 'conflictual', 'score': 0.004052792210131884},
  {'label': 'profanity', 'score': 0.9994163513183594},
  {'label': 'sex', 'score': 0.0066294302232563496},
  {'label': 'drugs', 'score': 0.0027938704006373882},
  {'label': 'selfharm', 'score': 0.002117963507771492},
  {'label': 'spam', 'score': 0.992584228515625}]]
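
To turn scores like the ones above into a set of predicted labels, one option is to keep every label whose score reaches a cutoff. The sketch below is only an illustration: the helper name `active_labels` and the 0.5 threshold are assumptions, not part of the model or of the `transformers` API, and the cutoff may need tuning on validation data.

```python
def active_labels(pipeline_output, threshold=0.5):
    """Return the labels scoring at or above `threshold` for the first input.

    `pipeline_output` is expected to have the nested shape shown above:
    a list with one entry per input text, each entry being a list of
    {'label': ..., 'score': ...} dictionaries.
    """
    scores = pipeline_output[0]
    return [item["label"] for item in scores if item["score"] >= threshold]


# Output copied from the example above.
example_output = [[
    {'label': 'conflictual', 'score': 0.004052792210131884},
    {'label': 'profanity', 'score': 0.9994163513183594},
    {'label': 'sex', 'score': 0.0066294302232563496},
    {'label': 'drugs', 'score': 0.0027938704006373882},
    {'label': 'selfharm', 'score': 0.002117963507771492},
    {'label': 'spam', 'score': 0.992584228515625},
]]

predicted = active_labels(example_output)
print(predicted)  # ['profanity', 'spam']
```

A higher threshold trades recall for precision; with the scores above, any cutoff between roughly 0.01 and 0.99 gives the same two labels.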

BibTeX entry and citation info

TBA