metadata

license: apache-2.0
tags:
  - null
datasets:
  - EXIST Dataset
  - MeTwo Machismo and Sexism Twitter Identification dataset
metrics:
  - accuracy
model-index:
  - name: twitter_sexismo-finetuned-exist2021
    results:
      - task:
          name: Text Classification
          type: text-classification
        dataset:
          name: EXIST Dataset
          type: EXIST Dataset
          args: es
        metrics:
          - name: Accuracy
            type: accuracy
            value: 0.83

twitter_sexismo-finetuned-exist2021

This model is a fine-tuned version of pysentimiento/robertuito-hate-speech, trained on the EXIST dataset and the MeTwo (Machismo and Sexism Twitter Identification) dataset (https://github.com/franciscorodriguez92/MeTwo). It achieves the following results on the evaluation set:

Loss: 0.54
Accuracy: 0.83

Model description

Model built for the 'Somos NLP' Hackathon to detect sexism in Spanish-language tweets. Created by:

medardodt
MariaIsabel
ManRo
lucel172
robertou2

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

my_learning_rate = 5E-5
my_adam_epsilon = 1E-8
my_number_of_epochs = 8
my_warmup = 3
my_mini_batch_size = 32
optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 8
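
To illustrate what a linear scheduler does with the learning rate listed above, here is a minimal pure-Python sketch. It assumes that my_warmup = 3 counts warmup steps and uses a hypothetical total of 100 optimizer steps; this card states neither, and linear_lr is an illustrative helper, not the training code.

```python
def linear_lr(step, base_lr=5e-5, warmup_steps=3, total_steps=100):
    """Linear warmup from 0 to base_lr, then linear decay back to 0.

    warmup_steps and total_steps are assumptions made for illustration.
    """
    if step < warmup_steps:
        # Warmup phase: scale the rate up proportionally to the step index.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: scale the rate down with the share of steps remaining.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 3, 50, 100):
    print(step, linear_lr(step))
```

The rate peaks at the end of warmup and reaches zero at the final step; with a different warmup unit (for example epochs) only the two step counts change.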

Training results

Epoch	Training Loss	Validation Loss	Accuracy	F1	Precision	Recall
1	0.389900	0.397857	0.827133	0.699620	0.786325	0.630137
2	0.064400	0.544625	0.831510	0.707224	0.794872	0.636986
3	0.004800	0.837723	0.818381	0.704626	0.733333	0.678082
4	0.000500	1.045066	0.820569	0.702899	0.746154	0.664384
5	0.000200	1.172727	0.805252	0.669145	0.731707	0.616438
6	0.000200	1.202422	0.827133	0.720848	0.744526	0.698630
7	0.000000	1.195012	0.827133	0.718861	0.748148	0.691781
8	0.000100	1.215515	0.824945	0.705882	0.761905	0.657534
9	0.000100	1.233099	0.827133	0.710623	0.763780	0.664384
10	0.000100	1.237268	0.829322	0.713235	0.769841	0.664384
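
The last two columns of the table can be cross-checked against the F1 column. The sketch below treats the second-to-last column as precision and the last column as recall, and recomputes F1 as their harmonic mean for the first two epochs; f1_from is a hypothetical helper name, and the values are copied from the table.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, reported F1, precision, recall) taken from the table above
rows = [
    (1, 0.699620, 0.786325, 0.630137),
    (2, 0.707224, 0.794872, 0.636986),
]

for epoch, reported_f1, precision, recall in rows:
    computed = f1_from(precision, recall)
    print(epoch, round(computed, 4), round(reported_f1, 4))
```

The recomputed values agree with the reported F1 to four decimal places, which is consistent with reading the final column as recall.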

Framework versions

Transformers 4.17.0
Pytorch 1.10.0+cu111
Tokenizers 0.11.6

Model in Action

Fast usage with pipelines:

# Required library (in a notebook, run the command with a leading "!")
# pip install transformers
from transformers import pipeline

# Load the fine-tuned checkpoint into a text-classification pipeline
model_checkpoint = "hackathon-pln-es/twitter_sexismo-finetuned-exist2021-metwo"
pipeline_nlp = pipeline("text-classification", model=model_checkpoint)

# Classify a single tweet
pipeline_nlp("mujer al volante peligro!")
# Other inputs to try: a neutral tweet, or a list of tweets
# pipeline_nlp("¡me encanta el ipad!")
# pipeline_nlp(["mujer al volante peligro!", "Los hombre tienen más manias que las mujeres", "me encanta el ipad!"])

# Model output
# LABEL_0 means "NON SEXISM" and LABEL_1 means "SEXISM"; score is the probability the model assigns to the predicted label.

# [{'label': 'LABEL_1', 'score': 0.9967633485794067}]
# [{'label': 'LABEL_0', 'score': 0.9934417009353638}]

# [{'label': 'LABEL_1', 'score': 0.9967633485794067},
# {'label': 'LABEL_1', 'score': 0.9755664467811584},
# {'label': 'LABEL_0', 'score': 0.9955045580863953}]
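
The raw labels can be turned into readable names with a small helper. This is a minimal sketch that relies on the LABEL_0 / LABEL_1 meaning stated above; LABEL_NAMES and readable are hypothetical names, not part of the model or of transformers. It operates on the list-of-dicts output shown above, so it runs without downloading the model.

```python
# Hypothetical mapping, based on the label meanings stated in this card.
LABEL_NAMES = {"LABEL_0": "NON SEXISM", "LABEL_1": "SEXISM"}


def readable(predictions):
    """Return (name, rounded score) pairs for a list of prediction dicts."""
    return [
        (LABEL_NAMES.get(p["label"], p["label"]), round(p["score"], 4))
        for p in predictions
    ]


# Sample predictions copied from the example output above
sample = [
    {"label": "LABEL_1", "score": 0.9967633485794067},
    {"label": "LABEL_1", "score": 0.9755664467811584},
    {"label": "LABEL_0", "score": 0.9955045580863953},
]

print(readable(sample))
# [('SEXISM', 0.9968), ('SEXISM', 0.9756), ('NON SEXISM', 0.9955)]
```

Unknown labels are passed through unchanged, so the helper does not fail if the checkpoint's label set differs from the one assumed here.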