
	---
	license: apache-2.0
	tags:
	-
	datasets:
	- EXIST Dataset
	- MeTwo Machismo and Sexism Twitter Identification dataset

	metrics:
	- accuracy
	model-index:
	- name: twitter_sexismo-finetuned-exist2021
	  results:
	  - task:
	      name: Text Classification
	      type: text-classification
	    dataset:
	      name: EXIST Dataset
	      type: EXIST Dataset
	      args: es
	    metrics:
	    - name: Accuracy
	      type: accuracy
	      value: 0.83
	---

	# twitter_sexismo-finetuned-exist2021

	This model is a fine-tuned version of [pysentimiento/robertuito-hate-speech](https://huggingface.co/pysentimiento/robertuito-hate-speech), trained on the EXIST dataset and the [MeTwo: Machismo and Sexism Twitter Identification dataset](https://github.com/franciscorodriguez92/MeTwo).
	It achieves the following results on the evaluation set:
	- Loss: 0.54
	- Accuracy: 0.83

	## Model description

	Model built for the Somos NLP Hackathon to detect sexism in Spanish-language tweets. Created by:

	- medardodt
	- MariaIsabel
	- ManRo
	- lucel172
	- robertou2

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- my_learning_rate = 5E-5
	- my_adam_epsilon = 1E-8
	- my_number_of_epochs = 8
	- my_warmup = 3
	- my_mini_batch_size = 32
	- optimizer: AdamW with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- num_epochs: 8
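
As a rough illustration of what `lr_scheduler_type: linear` combined with a warmup implies, the sketch below computes the learning rate at a given optimizer step in plain Python. It follows the common shape of a linear warmup followed by a linear decay to zero. Two things here are assumptions made only for illustration, since the card does not state them: that `my_warmup = 3` counts warmup steps, and that training runs for a total of 1000 steps (the hypothetical `TOTAL_STEPS`).

```python
def linear_lr(step: int, total_steps: int, warmup_steps: int, base_lr: float = 5e-5) -> float:
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


TOTAL_STEPS = 1000  # hypothetical value, not taken from the card
WARMUP_STEPS = 3    # assumption: my_warmup is interpreted as a step count

for step in (0, 3, 500, 1000):
    print(step, linear_lr(step, TOTAL_STEPS, WARMUP_STEPS))
```

The peak rate of `5e-5` is reached at the end of warmup and falls to zero at the final step; with a different reading of `my_warmup` (for example, epochs) only the position of the peak changes.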

	### Training results

	| Epoch | Training Loss | Validation Loss | Accuracy | F1 | Precision | Recall |
	|-------|---------------|-----------------|----------|----------|-----------|----------|
	| 1 | 0.389900 | 0.397857 | 0.827133 | 0.699620 | 0.786325 | 0.630137 |
	| 2 | 0.064400 | 0.544625 | 0.831510 | 0.707224 | 0.794872 | 0.636986 |
	| 3 | 0.004800 | 0.837723 | 0.818381 | 0.704626 | 0.733333 | 0.678082 |
	| 4 | 0.000500 | 1.045066 | 0.820569 | 0.702899 | 0.746154 | 0.664384 |
	| 5 | 0.000200 | 1.172727 | 0.805252 | 0.669145 | 0.731707 | 0.616438 |
	| 6 | 0.000200 | 1.202422 | 0.827133 | 0.720848 | 0.744526 | 0.698630 |
	| 7 | 0.000000 | 1.195012 | 0.827133 | 0.718861 | 0.748148 | 0.691781 |
	| 8 | 0.000100 | 1.215515 | 0.824945 | 0.705882 | 0.761905 | 0.657534 |
	| 9 | 0.000100 | 1.233099 | 0.827133 | 0.710623 | 0.763780 | 0.664384 |
	| 10 | 0.000100 | 1.237268 | 0.829322 | 0.713235 | 0.769841 | 0.664384 |
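
The F1 column in the results above can be cross-checked from the Precision and Recall columns, since F1 is their harmonic mean. The short sketch below recomputes it for a few rows; the helper name `f1_score` is introduced here for illustration and is not part of the original training code.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (epoch, precision, recall, reported F1), copied from the training results.
rows = [
    (1, 0.786325, 0.630137, 0.699620),
    (2, 0.794872, 0.636986, 0.707224),
    (6, 0.744526, 0.698630, 0.720848),
]

for epoch, precision, recall, reported in rows:
    # Print the recomputed value next to the reported one for comparison.
    print(epoch, round(f1_score(precision, recall), 6), reported)
```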



	### Framework versions

	- Transformers 4.17.0
	- Pytorch 1.10.0+cu111
	- Tokenizers 0.11.6


	## Model in Action
	Quick usage with the `pipeline` API:

	```python
	# Libraries required (install first with: pip install transformers)
	from transformers import pipeline

	# Load the fine-tuned checkpoint into a text-classification pipeline
	model_checkpoint = "hackathon-pln-es/twitter_sexismo-finetuned-exist2021-metwo"
	pipeline_nlp = pipeline("text-classification", model=model_checkpoint)

	# Classify a single tweet
	pipeline_nlp("mujer al volante peligro!")
	# pipeline_nlp("¡me encanta el ipad!")
	# Or classify several tweets in one call
	# pipeline_nlp(["mujer al volante peligro!", "Los hombre tienen más manias que las mujeres", "me encanta el ipad!"])

	# MODEL OUTPUT
	# LABEL_0: "NON SEXISM", LABEL_1: "SEXISM"; score: the model's probability for the predicted label

	# [{'label': 'LABEL_1', 'score': 0.9967633485794067}]
	# [{'label': 'LABEL_0', 'score': 0.9934417009353638}]

	# [{'label': 'LABEL_1', 'score': 0.9967633485794067},
	#  {'label': 'LABEL_1', 'score': 0.9755664467811584},
	#  {'label': 'LABEL_0', 'score': 0.9955045580863953}]
	```
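
Because the pipeline returns the generic ids `LABEL_0` and `LABEL_1`, a small post-processing step can make the output easier to read. The sketch below maps the ids to the names given above; `LABEL_NAMES` and `readable` are hypothetical helpers introduced for illustration, and the sample predictions are copied from the example output rather than produced by running the model.

```python
# Mapping taken from the card: LABEL_0 is "NON SEXISM", LABEL_1 is "SEXISM".
LABEL_NAMES = {"LABEL_0": "NON SEXISM", "LABEL_1": "SEXISM"}


def readable(predictions):
    """Replace raw label ids with readable names, keeping each score unchanged."""
    return [
        {"label": LABEL_NAMES.get(p["label"], p["label"]), "score": p["score"]}
        for p in predictions
    ]


# Sample predictions copied from the example output (not a live model call).
sample = [
    {"label": "LABEL_1", "score": 0.9967633485794067},
    {"label": "LABEL_0", "score": 0.9955045580863953},
]

for prediction in readable(sample):
    print(prediction["label"], prediction["score"])
```

In practice the same function could be applied directly to the list returned by `pipeline_nlp(...)`, since it has the same shape as `sample`.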