	---
	license: apache-2.0
	tags:
	-
	datasets:
	- EXIST Dataset
	- MeTwo Machismo and Sexism Twitter Identification dataset

	widget:
	- text: "manejas muy bien para ser mujer"
	- text: "En temas políticos hombres y mujeres son iguales"
	- text: "Los ipad son unos equipos electrónicos"

	metrics:
	- accuracy
	model-index:
	- name: twitter_sexismo-finetuned-exist2021
	  results:
	  - task:
	      name: Text Classification
	      type: text-classification
	    dataset:
	      name: EXIST Dataset
	      type: EXIST Dataset
	      args: es
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.83
	---

	# twitter_sexismo-finetuned-exist2021

	This model is a fine-tuned version of [pysentimiento/robertuito-hate-speech](https://huggingface.co/pysentimiento/robertuito-hate-speech), trained on the EXIST dataset and the [MeTwo: Machismo and Sexism Twitter Identification dataset](https://github.com/franciscorodriguez92/MeTwo).
	It achieves the following results on the evaluation set:
	- Loss: 0.54
	- Accuracy: 0.83

	## Model description
	Model built for the 'Somos NLP' Hackathon to detect sexism in Spanish-language tweets. Created by:
	- medardodt
	- MariaIsabel
	- ManRo
	- lucel172
	- robertou2

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- my_learning_rate = 5E-5
	- my_adam_epsilon = 1E-8
	- my_number_of_epochs = 8
	- my_warmup = 3
	- my_mini_batch_size = 32
	- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 8
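
To illustrate what a linear learning-rate schedule with warmup does to the base rate of 5E-5, here is a minimal pure-Python sketch. It is not the authors' training code. The card does not state the unit of `my_warmup = 3`, so the sketch assumes it means 3 warmup steps, and `total_steps = 100` is a hypothetical value chosen only for illustration.

```python
def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup: ramp linearly from 0 up to base_lr
        return base_lr * step / max(1, warmup_steps)
    # Decay: ramp linearly from base_lr down to 0 at total_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


base_lr = 5e-5      # my_learning_rate from the list above
warmup_steps = 3    # ASSUMPTION: my_warmup is measured in steps
total_steps = 100   # HYPOTHETICAL: not stated in the model card

for step in (0, 1, 3, 50, 100):
    print(step, linear_schedule_lr(step, base_lr, warmup_steps, total_steps))
```

The rate peaks at the base value when warmup ends and reaches zero at the final step.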

	### Training results

	|Epoch|Training Loss|Validation Loss|Accuracy|F1|Precision|Recall|
	|----|-------|-------|-------|-------|-------|-------|
	|1|0.389900|0.397857|0.827133|0.699620|0.786325|0.630137|
	|2|0.064400|0.544625|0.831510|0.707224|0.794872|0.636986|
	|3|0.004800|0.837723|0.818381|0.704626|0.733333|0.678082|
	|4|0.000500|1.045066|0.820569|0.702899|0.746154|0.664384|
	|5|0.000200|1.172727|0.805252|0.669145|0.731707|0.616438|
	|6|0.000200|1.202422|0.827133|0.720848|0.744526|0.698630|
	|7|0.000000|1.195012|0.827133|0.718861|0.748148|0.691781|
	|8|0.000100|1.215515|0.824945|0.705882|0.761905|0.657534|
	|9|0.000100|1.233099|0.827133|0.710623|0.763780|0.664384|
	|10|0.000100|1.237268|0.829322|0.713235|0.769841|0.664384|
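
The final column of the table is consistent with recall: F1 is the harmonic mean of precision and recall, and recomputing it from the last two columns reproduces the F1 column. A minimal check using three rows copied from the table (the helper name `f1_from` is illustrative, not part of any library):

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# (epoch, reported F1, second-to-last column, last column) copied from the table
rows = [
    (1, 0.699620, 0.786325, 0.630137),
    (2, 0.707224, 0.794872, 0.636986),
    (6, 0.720848, 0.744526, 0.698630),
]

for epoch, reported_f1, precision, recall in rows:
    recomputed = f1_from(precision, recall)
    # The difference should be at the level of the table's rounding
    print(epoch, reported_f1, round(recomputed, 6), abs(recomputed - reported_f1) < 1e-5)
```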

	### Framework versions

	- Transformers 4.17.0
	- Pytorch 1.10.0+cu111
	- Tokenizers 0.11.6


	## Model in Action
	Quick usage with pipelines:
	``` python
	# Required library (run in a shell, or prefix with "!" in a notebook):
	#   pip install transformers
	from transformers import pipeline

	# Load the fine-tuned checkpoint into a text-classification pipeline
	model_checkpoint = "hackathon-pln-es/twitter_sexismo-finetuned-exist2021-metwo"
	pipeline_nlp = pipeline("text-classification", model=model_checkpoint)

	# Classify a single tweet
	print(pipeline_nlp("mujer al volante peligro!"))
	# Other inputs to try:
	# pipeline_nlp("¡me encanta el ipad!")
	# pipeline_nlp(["mujer al volante peligro!", "Los hombre tienen más manias que las mujeres", "me encanta el ipad!"])

	# Model output
	# LABEL_0 means "NON SEXISM" and LABEL_1 means "SEXISM"; score is the model's confidence in that label.

	# [{'label': 'LABEL_1', 'score': 0.9967633485794067}]
	# [{'label': 'LABEL_0', 'score': 0.9934417009353638}]

	# [{'label': 'LABEL_1', 'score': 0.9967633485794067},
	#  {'label': 'LABEL_1', 'score': 0.9755664467811584},
	#  {'label': 'LABEL_0', 'score': 0.9955045580863953}]
	```
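
Because the pipeline returns generic `LABEL_0` / `LABEL_1` identifiers, it can help to map them to readable names before showing results. The sketch below works on the output format shown above and does not load the model. The names `LABEL_NAMES` and `readable` are illustrative helpers, not part of `transformers`.

```python
# Mapping taken from the output description above
LABEL_NAMES = {"LABEL_0": "NON SEXISM", "LABEL_1": "SEXISM"}


def readable(predictions):
    """Replace generic label ids with readable names, keeping the scores."""
    return [
        {"label": LABEL_NAMES.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]


# Example pipeline output copied from the usage block above
raw = [
    {"label": "LABEL_1", "score": 0.9967633485794067},
    {"label": "LABEL_1", "score": 0.9755664467811584},
    {"label": "LABEL_0", "score": 0.9955045580863953},
]

for item in readable(raw):
    print(f"{item['label']}: {item['score']:.4f}")
```

Unknown labels are passed through unchanged, so the helper stays safe if the label set ever differs from the two documented here.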