minuva/MiniLMv2-toxic-jigsaw

Text Classification

offensive language

Ngit committed on Dec 15, 2023

Commit 037dca5 • 1 Parent(s): c1846f4

Create README.md

Files changed (1)

README.md +43 -0

README.md ADDED

	@@ -0,0 +1,43 @@

+---
+datasets:
+- go_emotions
+language:
+- en
+---
+# Text Classification GoEmotions
+This model is a fine-tuned version of [nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large](https://huggingface.co/nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large) on the dataset from the first Jigsaw Kaggle competition, the [Toxic Comment Classification Challenge](https://www.kaggle.com/competitions/jigsaw-toxic-comment-classification-challenge), using [tasinhoque/text-classification-goemotions](https://huggingface.co/tasinhoque/text-classification-goemotions) as the teacher model.
+# Load the Model
+```py
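+from transformers import pipeline
+
+# Load the classifier from the Hugging Face Hub as a text-classification pipeline
+pipe = pipeline(model='Ngit/MiniLM-L6-toxic-all-labels', task='text-classification')
+
+# Classify a single sentence; by default the pipeline returns the top-scoring label
+pipe("This is pure trash")
+# [{'label': 'toxic', 'score': 0.9383478164672852}]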
+```
+# Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 6e-05
+- train_batch_size: 48
+- eval_batch_size: 48
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 10
+- warmup_ratio: 0.1
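To make the interaction between the linear scheduler and the warmup ratio concrete, here is a minimal sketch of a linear warmup followed by a linear decay to zero, using the learning rate (6e-05) and warmup ratio (0.1) listed above. The function name `linear_warmup_lr` and the `total_steps` value are hypothetical and chosen only for illustration; the actual number of optimizer steps depends on the dataset size, the batch size of 48, and the 10 epochs, and is not stated in the README. This is not the training code itself.

```python
import math


def linear_warmup_lr(step, total_steps, base_lr=6e-05, warmup_ratio=0.1):
    """Learning rate at `step`: linear warmup to base_lr, then linear decay to 0."""
    # Number of warmup steps derived from the ratio (assumption: rounded up).
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / max(1, warmup_steps)
    # Decay phase: ramp linearly from base_lr down to 0 at total_steps.
    remaining = total_steps - step
    return base_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical run length, for illustration only.
total_steps = 1000
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_lr(step, total_steps))
```

Under these assumptions the learning rate peaks at the end of the first 10% of training and then falls steadily, which is the usual motivation for pairing a short warmup with a linear schedule.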
+# Metrics (comparison with teacher model)
+| Teacher (params) | Student (params) | Set (metric) | Score (teacher) | Score (student) |
+|---|---|---|---|---|
+| unitary/toxic-bert (110M) | MiniLMv2-L6-H384-goemotions-v2 (33M) | Test (ROC_AUC) | 0.98636 | 0.98600 |
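For readers unfamiliar with the ROC_AUC metric in the table, the sketch below computes it for a single binary label as the fraction of (positive, negative) pairs in which the positive example gets the higher score, with ties counted as half. The function name `roc_auc` and the toy labels and scores are hypothetical; they are not the model's outputs. The README does not state how scores for the individual toxicity labels were combined into the single reported number.

```python
def roc_auc(labels, scores):
    """ROC AUC for one binary label via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy data, made up for illustration (1 = toxic, 0 = not toxic).
labels = [1, 0, 1, 0, 1, 0]
scores = [0.94, 0.10, 0.80, 0.85, 0.60, 0.30]
print(roc_auc(labels, scores))
```

The pairwise form is quadratic in the number of examples, so it suits small illustrations; for a full test set a library implementation such as scikit-learn's `roc_auc_score` is the practical choice.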
+# Training Code, Evaluation & Deployment
+Check