metadata

language:
  - en

Text Classification GoEmotions

This model is a fine-tuned version of nreimers/MiniLMv2-L6-H384-distilled-from-BERT-Large on the Jigsaw 1st Kaggle competition dataset, using unitary/toxic-bert as the teacher model.
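The card does not say how the teacher's predictions were used during training. One common way to distill a multi-label classifier is to train the student with binary cross-entropy against the teacher's per-label probabilities (soft targets). The sketch below illustrates that idea only; the function name `soft_bce` and the sample values are hypothetical and are not taken from the training code.

```python
import math


def sigmoid(x):
    """Map a logit to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_bce(student_logits, teacher_probs, eps=1e-12):
    """Mean binary cross-entropy of student logits against soft teacher targets.

    Assumption: one logit and one teacher probability per label.
    """
    total = 0.0
    for z, t in zip(student_logits, teacher_probs):
        p = sigmoid(z)
        total += -(t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps))
    return total / len(student_logits)


# Made-up values: a student that agrees with the teacher gets a lower loss.
teacher = [0.9, 0.1]
agreeing = soft_bce([2.0, -2.0], teacher)
disagreeing = soft_bce([-2.0, 2.0], teacher)
print(agreeing < disagreeing)
```

Soft targets carry more information than hard 0/1 labels, which is one reason a much smaller student can track its teacher closely.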

Load the Model

from transformers import pipeline

pipe = pipeline(model='Ngit/MiniLM-L6-toxic-all-labels', task='text-classification')
pipe("This is pure trash")
# [{'label': 'toxic', 'score': 0.9383478164672852}]
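By default the pipeline returns only the highest-scoring label. In recent transformers versions, calling `pipe(text, top_k=None)` should return one score per label instead. The sketch below assumes that output shape (a flat list of `{'label', 'score'}` dicts) and filters it by a threshold. The helper name `labels_above`, the 0.5 threshold, and the sample scores are illustrative assumptions, not values produced by this model.

```python
def labels_above(scores, threshold=0.5):
    """Return (label, score) pairs at or above the threshold, highest first."""
    kept = [(s["label"], s["score"]) for s in scores if s["score"] >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up scores in the shape assumed for pipe(text, top_k=None);
# real label names and values come from the model itself.
sample = [
    {"label": "insult", "score": 0.61},
    {"label": "toxic", "score": 0.94},
    {"label": "threat", "score": 0.02},
]
flagged = labels_above(sample, threshold=0.5)
print(flagged)
```

A single threshold for all labels is the simplest choice; per-label thresholds tuned on a validation set may work better for rare labels.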

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 6e-05
train_batch_size: 48
eval_batch_size: 48
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 10
warmup_ratio: 0.1
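To make the schedule concrete: with a linear scheduler and a warmup ratio of 0.1, the learning rate usually rises linearly from 0 to the peak (6e-05 here) over the first 10% of training steps, then decays linearly back to 0. The sketch below reproduces that shape in plain Python. The function name `linear_lr` and the total of 1000 steps are hypothetical, since the card does not report the number of training steps.

```python
def linear_lr(step, total_steps, peak_lr=6e-05, warmup_ratio=0.1):
    """Learning rate at a given step for linear warmup followed by linear decay."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Warmup phase: ramp from 0 up to peak_lr.
        return peak_lr * step / max(1, warmup_steps)
    # Decay phase: fall from peak_lr to 0 at the last step.
    remaining = total_steps - step
    return peak_lr * max(0.0, remaining / max(1, total_steps - warmup_steps))


# Hypothetical run of 1000 steps: print the rate at a few checkpoints.
for step in (0, 50, 100, 550, 1000):
    print(step, linear_lr(step, total_steps=1000))
```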

Metrics (comparison with teacher model)

Teacher (params)	Student (params)	Set (metric)	Score (teacher)	Score (student)
unitary/toxic-bert (110M)	MiniLMv2-L6-H384-goemotions-v2 (23M)	Test (ROC_AUC)	0.98636	0.98600
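For readers unfamiliar with the metric: ROC AUC is the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below computes it for a single label by comparing every positive/negative pair. The toy labels and scores are made up, and the card does not state how per-label scores were aggregated into the single reported number.

```python
def roc_auc(labels, scores):
    """ROC AUC for one binary label via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data: three of the four positive/negative pairs are ranked correctly.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(toy_labels, toy_scores)
print(auc)  # 0.75
```

The pairwise loop is quadratic in the number of examples, so it suits illustration rather than a full test set; library implementations use a sort-based method instead.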

Deployment

Check this repository to see how to deploy in a serverless environment with onnxruntime.