---
language:
  - en
metrics:
  - accuracy
  - f1
widget:
  - text: >-
      Every woman wants to be a model. It's codeword for 'I get everything for
      free and people want me'
pipeline_tag: text-classification
---

# BERTweet-large-sexism-detector

This is a fine-tuned BERTweet-large model trained on the Explainable Detection of Online Sexism (EDOS) dataset. It is intended to be used as a binary classifier that labels tweets as not sexist (0) or sexist (1).
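
The exact label names returned at inference time come from the model configuration. As a minimal sketch (assuming the checkpoint ships an `id2label` mapping in its config), you can inspect that mapping like this:

```python
from transformers import AutoConfig

# Inspect the label mapping stored in the model config;
# class 0 is expected to correspond to 'not sexist' and class 1 to 'sexist'
config = AutoConfig.from_pretrained('NLP-LTU/bertweet-large-sexism-detector')
print(config.id2label)
```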

More information about the original pre-trained model can be found here.

On the test set, our model achieved an accuracy of 89.72 and a macro F1-score of 86.13.

Classification examples:

| Prediction | Tweet |
|------------|-------|
| sexist | Every woman wants to be a model. It's codeword for "I get everything for free and people want me" |
| not sexist | basically I placed more value on her than I should then? |

## More Details

For more details about the datasets and evaluation results, see our paper (we will update this page with the paper link).

## How to use

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub
model = AutoModelForSequenceClassification.from_pretrained('NLP-LTU/bertweet-large-sexism-detector')
tokenizer = AutoTokenizer.from_pretrained('NLP-LTU/bertweet-large-sexism-detector')

# Build a text-classification pipeline; it returns a label and a confidence score
classifier = pipeline("text-classification", model=model, tokenizer=tokenizer)

prediction = classifier("Every woman wants to be a model. It's codeword for 'I get everything for free and people want me'")
# prediction is a list with one dict, e.g. [{'label': ..., 'score': ...}],
# where the label corresponds to class 0 (not sexist) or class 1 (sexist)
print(prediction)
```
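
As a usage example (a minimal sketch, not part of the original card), the same checkpoint can also be loaded by model id and applied to a batch of tweets:

```python
from transformers import pipeline

# Build the pipeline directly from the Hub checkpoint and classify several tweets at once
classifier = pipeline("text-classification", model="NLP-LTU/bertweet-large-sexism-detector")

tweets = [
    "Every woman wants to be a model. It's codeword for 'I get everything for free and people want me'",
    "basically I placed more value on her than I should then?",
]
for tweet, result in zip(tweets, classifier(tweets)):
    print(result["label"], "|", round(result["score"], 4), "|", tweet)
```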

Our system ranked 10th out of 84 teams, and these were our results on the test set:

```
              precision    recall  f1-score   support

  not sexist     0.9355    0.9284    0.9319      3030
      sexist     0.7815    0.8000    0.7906       970

    accuracy                         0.8972      4000
   macro avg     0.8585    0.8642    0.8613      4000
weighted avg     0.8981    0.8972    0.8977      4000

Confusion matrix: tn = 2813, fp = 217, fn = 194, tp = 776
```
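
The report above follows the layout of `sklearn.metrics.classification_report`. As a hedged sketch (the `y_true` and `y_pred` lists below are placeholders, not our actual test-set predictions), an equivalent report and confusion matrix can be produced like this:

```python
from sklearn.metrics import classification_report, confusion_matrix

# y_true / y_pred stand in for the gold labels and the model's predictions
# (0 = not sexist, 1 = sexist) on the test tweets
y_true = [0, 1, 0, 1, 1, 0]
y_pred = [0, 1, 1, 1, 0, 0]

print(classification_report(y_true, y_pred, target_names=["not sexist", "sexist"], digits=4))
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("tn", tn, "fp", fp, "fn", fn, "tp", tp)
```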