EIStakovskii/french_toxicity_classifier_plus

---
language: fr
widget:
- text: "J'aime ta coiffure"
- text: "Va te faire foutre"
- text: "Quel mauvais temps, n'est-ce pas ?"
- text: "J'espère que tu vas mourir, connard !"
- text: "j'aime beaucoup ta veste"
license: other
---
This model was trained for toxicity classification of French text. Label_1 means TOXIC and Label_0 means NOT_TOXIC.
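As a minimal sketch of how the labels above could be read in practice, the snippet below wraps the Hugging Face `transformers` text-classification pipeline and maps the raw label ids to readable names. The helper names `readable` and `classify` are hypothetical, and the upper-case spelling `LABEL_0` / `LABEL_1` is an assumption about how the pipeline reports the labels; the helper falls back to the raw label if the spelling differs.

```python
from typing import Dict, List

# Mapping taken from the model card: Label_1 is TOXIC, Label_0 is NOT_TOXIC.
# The upper-case keys are an assumption about how the pipeline spells them.
LABEL_NAMES = {"LABEL_0": "NOT_TOXIC", "LABEL_1": "TOXIC"}


def readable(prediction: Dict) -> Dict:
    """Replace the raw label id in one pipeline prediction with a readable name."""
    raw = str(prediction["label"])
    name = LABEL_NAMES.get(raw.upper(), raw)  # keep the raw label if it is unknown
    return {"label": name, "score": prediction["score"]}


def classify(texts: List[str]) -> List[Dict]:
    """Classify French sentences. Calling this downloads the model weights."""
    # Imported lazily so the label helper above works without transformers installed.
    from transformers import pipeline

    clf = pipeline(
        "text-classification",
        model="EIStakovskii/french_toxicity_classifier_plus",
    )
    return [readable(p) for p in clf(texts)]


# Example call (left commented out because it downloads the model):
# print(classify(["J'aime ta coiffure", "Va te faire foutre"]))
```

Each returned item carries a `label` and a `score`; the score is the pipeline's confidence for the predicted label, not a calibrated toxicity probability.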

The model was fine-tuned from the CamemBERT language model (https://huggingface.co/camembert-base).

Accuracy is 93% on the test split used during training, and 79% on a manually picked (and thus harder) sample of 200 sentences (100 with label 1, 100 with label 0) evaluated at the end of training.
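To make the second figure concrete: on a balanced sample of 200 sentences, 79% accuracy corresponds to 158 correct predictions. The sketch below computes accuracy from predicted and gold labels. The toy `predicted` list is fabricated purely to reproduce that count; it is not the author's evaluation data, and the card does not say how the errors were split between the two classes.

```python
from typing import Sequence


def accuracy(predicted: Sequence[int], gold: Sequence[int]) -> float:
    """Fraction of positions where the predicted label equals the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if len(gold) == 0:
        raise ValueError("cannot compute accuracy on an empty sample")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Balanced sample as described in the card: 100 sentences per label.
gold = [1] * 100 + [0] * 100

# Hypothetical predictions: the first 158 are right, the last 42 are flipped.
predicted = gold[:158] + [1 - g for g in gold[158:]]

print(accuracy(predicted, gold))  # 158 / 200
```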

The model was fine-tuned on 32k sentences. The training data consisted of the English data from https://github.com/s-nlp/multilingual_detox (around 30k sentences), translated into French with https://huggingface.co/Helsinki-NLP/opus-mt-en-fr, plus data from the Jigsaw dataset on Kaggle (https://www.kaggle.com/competitions/jigsaw-multilingual-toxic-comment-classification/data).
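The translation step could look roughly like the sketch below, which runs English sentences through the `Helsinki-NLP/opus-mt-en-fr` model in batches using the `transformers` translation pipeline. This is an illustration, not the author's actual preprocessing script: the function names and the batch size are hypothetical, and the card does not describe how labels were carried over or how the data was filtered.

```python
from typing import Callable, Dict, List

# A translator takes a batch of strings and returns one dict per string,
# each with a "translation_text" key (the shape the translation pipeline returns).
Translator = Callable[[List[str]], List[Dict[str, str]]]


def translate_sentences(
    sentences: List[str], translator: Translator, batch_size: int = 16
) -> List[str]:
    """Translate sentences batch by batch, preserving the input order."""
    translated: List[str] = []
    for start in range(0, len(sentences), batch_size):
        batch = sentences[start:start + batch_size]
        translated.extend(item["translation_text"] for item in translator(batch))
    return translated


def load_en_fr_translator():
    """Build the English-to-French pipeline. Calling this downloads the model."""
    from transformers import pipeline  # lazy import: only needed for real translation

    return pipeline("translation", model="Helsinki-NLP/opus-mt-en-fr")


# Example call (left commented out because it downloads the model):
# french = translate_sentences(["I like your jacket"], load_en_fr_translator())
```

Keeping the translator as a plain callable makes the batching logic easy to check with a stub before spending time on a full run over roughly 30k sentences.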