---
license: apache-2.0
tags:
-
datasets:
- EXIST Dataset
- MeTwo Machismo and Sexism Twitter Identification dataset
metrics:
- accuracy
model-index:
- name: twitter_sexismo-finetuned-exist2021
  results:
  - task:
      name: Text Classification
      type: text-classification
    dataset:
      name: EXIST Dataset
      type: EXIST Dataset
      args: es
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.83
---

# twitter_sexismo-finetuned-exist2021

This model is a fine-tuned version of [pysentimiento/robertuito-hate-speech](https://huggingface.co/pysentimiento/robertuito-hate-speech) on the EXIST dataset and the [MeTwo: Machismo and Sexism Twitter Identification dataset](https://github.com/franciscorodriguez92/MeTwo).
It achieves the following results on the evaluation set:
- Loss: 0.54
- Accuracy: 0.83

## Model description

Model built for the Somos NLP Hackathon to detect sexism in Spanish-language tweets. Created by:

- medardodt
- MariaIsabel
- ManRo
- lucel172
- robertou2

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- my_learning_rate = 5E-5
- my_adam_epsilon = 1E-8
- my_number_of_epochs = 8
- my_warmup = 3
- my_mini_batch_size = 32
- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 8
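
To illustrate what a linear scheduler with warmup does to the learning rate, here is a minimal pure-Python sketch. It assumes that `my_warmup = 3` counts optimizer steps (the card does not say whether it means steps or epochs) and uses a hypothetical total of 100 training steps; the function name `linear_schedule_lr` is also made up for this example and is not taken from the training script.

```python
def linear_schedule_lr(step, base_lr=5e-5, warmup_steps=3, total_steps=100):
    """Learning rate at `step` for linear warmup followed by linear decay.

    base_lr and warmup_steps echo the hyperparameter list; total_steps is a
    hypothetical value, since the card does not report the number of steps.
    """
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Decay: fall linearly from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 1, 3, 50, 100):
        print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at the configured 5E-5 once warmup ends and reaches zero on the last step.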

### Training results

| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
|:-----:|:-------------:|:---------------:|:--------:|:--------:|:---------:|:--------:|
| 1 | 0.389900 | 0.397857 | 0.827133 | 0.699620 | 0.786325 | 0.630137 |
| 2 | 0.064400 | 0.544625 | 0.831510 | 0.707224 | 0.794872 | 0.636986 |
| 3 | 0.004800 | 0.837723 | 0.818381 | 0.704626 | 0.733333 | 0.678082 |
| 4 | 0.000500 | 1.045066 | 0.820569 | 0.702899 | 0.746154 | 0.664384 |
| 5 | 0.000200 | 1.172727 | 0.805252 | 0.669145 | 0.731707 | 0.616438 |
| 6 | 0.000200 | 1.202422 | 0.827133 | 0.720848 | 0.744526 | 0.698630 |
| 7 | 0.000000 | 1.195012 | 0.827133 | 0.718861 | 0.748148 | 0.691781 |
| 8 | 0.000100 | 1.215515 | 0.824945 | 0.705882 | 0.761905 | 0.657534 |
| 9 | 0.000100 | 1.233099 | 0.827133 | 0.710623 | 0.763780 | 0.664384 |
| 10 | 0.000100 | 1.237268 | 0.829322 | 0.713235 | 0.769841 | 0.664384 |
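
The F1 column is the harmonic mean of precision and recall, so the reported values can be sanity-checked from the other two columns. The sketch below does this for three rows copied from the training results above; the helper names `f1_score` and `max_f1_gap` are illustrative and not part of the original training code.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1), copied from the training results
ROWS = [
    (1, 0.786325, 0.630137, 0.699620),
    (2, 0.794872, 0.636986, 0.707224),
    (6, 0.744526, 0.698630, 0.720848),
]


def max_f1_gap(rows):
    """Largest absolute difference between recomputed and reported F1."""
    return max(abs(f1_score(p, r) - reported) for _, p, r, reported in rows)


if __name__ == "__main__":
    for epoch, p, r, reported in ROWS:
        print(epoch, round(f1_score(p, r), 6), reported)
```

The recomputed values agree with the reported ones up to rounding in the last printed digit.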

### Framework versions

- Transformers 4.17.0
- Pytorch 1.10.0+cu111
- Tokenizers 0.11.6

## Model in Action

Quick usage with pipelines:

    # Required library: pip install transformers
    from transformers import pipeline

    # Load the fine-tuned checkpoint into a text-classification pipeline
    model_checkpoint = "hackathon-pln-es/twitter_sexismo-finetuned-exist2021-metwo"
    pipeline_nlp = pipeline("text-classification", model=model_checkpoint)

    # Classify a single tweet
    pipeline_nlp("mujer al volante peligro!")
    # [{'label': 'LABEL_1', 'score': 0.9967633485794067}]

    # pipeline_nlp("¡me encanta el ipad!")
    # [{'label': 'LABEL_0', 'score': 0.9934417009353638}]

    # Classify several tweets in one call
    # pipeline_nlp(["mujer al volante peligro!", "Los hombre tienen más manias que las mujeres", "me encanta el ipad!"])
    # [{'label': 'LABEL_1', 'score': 0.9967633485794067},
    #  {'label': 'LABEL_1', 'score': 0.9755664467811584},
    #  {'label': 'LABEL_0', 'score': 0.9955045580863953}]

    # Model output
    # Labels: LABEL_0 = "NON SEXISM", LABEL_1 = "SEXISM"
    # score: probability the model assigns to the predicted label
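
Because the pipeline returns the generic names `LABEL_0` and `LABEL_1`, a small post-processing step can make the output easier to read. The following is a minimal sketch that only relies on the output format and label meanings shown above; `LABEL_NAMES` and `readable` are hypothetical helpers, not part of the model or of Transformers.

```python
# Label meanings as documented in this card
LABEL_NAMES = {"LABEL_0": "NON SEXISM", "LABEL_1": "SEXISM"}


def readable(predictions):
    """Replace generic pipeline labels with their documented meanings.

    `predictions` is a list of {'label': ..., 'score': ...} dicts, the shape
    returned by a text-classification pipeline. Unknown labels pass through.
    """
    return [
        {"label": LABEL_NAMES.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]


if __name__ == "__main__":
    sample = [{"label": "LABEL_1", "score": 0.9967633485794067}]
    print(readable(sample))
```

With a loaded pipeline this would be called as `readable(pipeline_nlp("mujer al volante peligro!"))`.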