
minilm-finetuned-movie

This model is a fine-tuned version of microsoft/MiniLM-L12-H384-uncased on the sasingh192/movie-review dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0451
  • F1: 0.9856

Model description

This model can be used to categorize a movie review into one of the following categories:

  • 0 - negative
  • 1 - somewhat negative
  • 2 - neutral
  • 3 - somewhat positive
  • 4 - positive
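
A minimal inference sketch using the Transformers pipeline API; the repo id below is hypothetical, so substitute the path of the actual published checkpoint:

```python
from transformers import pipeline

# "sasingh192/minilm-finetuned-movie" is a hypothetical repo id;
# point this at wherever the fine-tuned checkpoint is published.
classifier = pipeline("text-classification", model="sasingh192/minilm-finetuned-movie")

print(classifier("A thoughtful, beautifully acted film."))
# e.g. [{'label': 'LABEL_4', 'score': 0.98}]  # LABEL_4 -> 4 = positive
```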

Intended uses & limitations

The model is based on fine-tuning MiniLM, developed by Wang et al.:

@misc{wang2020minilm,
  title={MiniLM: Deep Self-Attention Distillation for Task-Agnostic Compression of Pre-Trained Transformers},
  author={Wenhui Wang and Furu Wei and Li Dong and Hangbo Bao and Nan Yang and Ming Zhou},
  year={2020},
  eprint={2002.10957},
  archivePrefix={arXiv},
  primaryClass={cs.CL}
}

Training and evaluation data

The sasingh192/movie-review dataset contains a column 'TrainValTest' whose values are 'Train', 'Val', and 'Test'. The dataset can be filtered to the 'Train' rows to train the model, and evaluation can be performed on the rows marked 'Val'. 'Test' is used as a blind test set for Kaggle.
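
A sketch of that split logic with the datasets library, assuming the dataset ships as a single split containing the 'TrainValTest' column described above:

```python
from datasets import load_dataset

# Assumption: the dataset is published as one split named "train".
ds = load_dataset("sasingh192/movie-review", split="train")

train_ds = ds.filter(lambda row: row["TrainValTest"] == "Train")
val_ds = ds.filter(lambda row: row["TrainValTest"] == "Val")
test_ds = ds.filter(lambda row: row["TrainValTest"] == "Test")  # blind Kaggle test
```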

Training procedure

Training details are listed below.

Training hyperparameters

The following hyperparameters were used during training (an equivalent TrainingArguments configuration is sketched after the list):

  • learning_rate: 2e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
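
As a sketch, the list above maps onto Hugging Face TrainingArguments roughly as follows; output_dir and the evaluation cadence are assumptions, while the remaining values come from the card (the stated Adam betas and epsilon match the Trainer defaults):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="minilm-finetuned-movie",  # assumption, not stated in the card
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=64,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    fp16=True,                    # Native AMP mixed precision
    evaluation_strategy="epoch",  # assumption: per-epoch eval, matching the results table
)
```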

Training results

| Training Loss | Epoch | Step  | Validation Loss | F1     |
|---------------|-------|-------|-----------------|--------|
| 0.9623        | 1.0   | 1946  | 0.7742          | 0.6985 |
| 0.7969        | 2.0   | 3892  | 0.7289          | 0.7094 |
| 0.74          | 3.0   | 5838  | 0.6479          | 0.7476 |
| 0.7012        | 4.0   | 7784  | 0.6263          | 0.7550 |
| 0.6689        | 5.0   | 9730  | 0.5823          | 0.7762 |
| 0.6416        | 6.0   | 11676 | 0.5796          | 0.7673 |
| 0.6149        | 7.0   | 13622 | 0.5324          | 0.7912 |
| 0.5939        | 8.0   | 15568 | 0.5189          | 0.7986 |
| 0.5714        | 9.0   | 17514 | 0.4793          | 0.8184 |
| 0.5495        | 10.0  | 19460 | 0.4566          | 0.8249 |
| 0.5297        | 11.0  | 21406 | 0.4155          | 0.8475 |
| 0.5101        | 12.0  | 23352 | 0.4063          | 0.8494 |
| 0.4924        | 13.0  | 25298 | 0.3829          | 0.8571 |
| 0.4719        | 14.0  | 27244 | 0.4032          | 0.8449 |
| 0.4552        | 15.0  | 29190 | 0.3447          | 0.8720 |
| 0.4382        | 16.0  | 31136 | 0.3581          | 0.8610 |
| 0.421         | 17.0  | 33082 | 0.3095          | 0.8835 |
| 0.4038        | 18.0  | 35028 | 0.2764          | 0.9002 |
| 0.3883        | 19.0  | 36974 | 0.2610          | 0.9051 |
| 0.3745        | 20.0  | 38920 | 0.2533          | 0.9064 |
| 0.3616        | 21.0  | 40866 | 0.2601          | 0.9005 |
| 0.345         | 22.0  | 42812 | 0.2085          | 0.9267 |
| 0.3314        | 23.0  | 44758 | 0.2421          | 0.9069 |
| 0.3178        | 24.0  | 46704 | 0.2006          | 0.9268 |
| 0.3085        | 25.0  | 48650 | 0.1846          | 0.9326 |
| 0.2964        | 26.0  | 50596 | 0.1492          | 0.9490 |
| 0.2855        | 27.0  | 52542 | 0.1664          | 0.9376 |
| 0.2737        | 28.0  | 54488 | 0.1309          | 0.9560 |
| 0.2641        | 29.0  | 56434 | 0.1318          | 0.9562 |
| 0.2541        | 30.0  | 58380 | 0.1490          | 0.9440 |
| 0.2462        | 31.0  | 60326 | 0.1195          | 0.9575 |
| 0.234         | 32.0  | 62272 | 0.1054          | 0.9640 |
| 0.2273        | 33.0  | 64218 | 0.1054          | 0.9631 |
| 0.2184        | 34.0  | 66164 | 0.0971          | 0.9662 |
| 0.214         | 35.0  | 68110 | 0.0902          | 0.9689 |
| 0.2026        | 36.0  | 70056 | 0.0846          | 0.9699 |
| 0.1973        | 37.0  | 72002 | 0.0819          | 0.9705 |
| 0.1934        | 38.0  | 73948 | 0.0810          | 0.9716 |
| 0.1884        | 39.0  | 75894 | 0.0724          | 0.9746 |
| 0.1796        | 40.0  | 77840 | 0.0737          | 0.9743 |
| 0.1779        | 41.0  | 79786 | 0.0665          | 0.9773 |
| 0.1703        | 42.0  | 81732 | 0.0568          | 0.9811 |
| 0.1638        | 43.0  | 83678 | 0.0513          | 0.9843 |
| 0.1601        | 44.0  | 85624 | 0.0575          | 0.9802 |
| 0.1593        | 45.0  | 87570 | 0.0513          | 0.9835 |
| 0.1559        | 46.0  | 89516 | 0.0474          | 0.9851 |
| 0.1514        | 47.0  | 91462 | 0.0477          | 0.9847 |
| 0.1473        | 48.0  | 93408 | 0.0444          | 0.9858 |
| 0.1462        | 49.0  | 95354 | 0.0449          | 0.9855 |
| 0.1458        | 50.0  | 97300 | 0.0451          | 0.9856 |
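
The F1 values above would be reported by a compute_metrics callback passed to the Trainer; a minimal sketch, assuming weighted averaging over the five classes (the averaging mode is not stated in the card):

```python
import numpy as np
from sklearn.metrics import f1_score

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    # "weighted" averaging is an assumption, not confirmed by the card.
    return {"f1": f1_score(labels, preds, average="weighted")}
```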

Framework versions

  • Transformers 4.29.2
  • Pytorch 2.0.1
  • Datasets 2.12.0
  • Tokenizers 0.13.2