# distilbert_agnews_padding80model
This model is a fine-tuned version of distilbert-base-uncased on the ag_news dataset. It achieves the following results on the evaluation set:
- Loss: 0.6430
- Accuracy: 0.9454
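
Below is a minimal inference sketch using the 🤗 Transformers `pipeline` API. The checkpoint identifier is an assumption based on the model name; substitute the actual local path or Hub repo id. Label names depend on how `id2label` was configured for this checkpoint.

```python
from transformers import pipeline

# Assumed checkpoint location; replace with the real path or Hub id of this model.
classifier = pipeline(
    "text-classification",
    model="distilbert_agnews_padding80model",
)

# AG News has four classes: World, Sports, Business, Sci/Tech.
print(classifier("Stocks rallied after the central bank held interest rates steady."))
# e.g. [{'label': 'Business', 'score': 0.98}] -- or 'LABEL_2' if id2label was left at the default
```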
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged training sketch using these values follows the list):
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20
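
A sketch of how these hyperparameters might be expressed with the `Trainer` API. The dataset preprocessing (the 80-token padding length implied by the model name), output path, and evaluation strategy are assumptions, not taken from this card; the Adam betas, epsilon, and linear schedule listed above match the `Trainer` defaults.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed preprocessing: the model name suggests padding/truncating to 80 tokens,
# but the exact max_length is not stated in this card.
tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
dataset = load_dataset("ag_news")
dataset = dataset.map(
    lambda batch: tokenizer(
        batch["text"], padding="max_length", truncation=True, max_length=80
    ),
    batched=True,
)

model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=4
)

# Hyperparameters from the list above.
args = TrainingArguments(
    output_dir="distilbert_agnews_padding80model",  # assumed output path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=20,
    evaluation_strategy="epoch",  # assumed; the card reports per-epoch validation
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=dataset["train"],
    eval_dataset=dataset["test"],
    tokenizer=tokenizer,
)
trainer.train()
```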
### Training results

| Training Loss | Epoch | Step   | Validation Loss | Accuracy |
|:-------------:|:-----:|:------:|:---------------:|:--------:|
| 0.1898        | 1.0   | 7500   | 0.1891          | 0.9404   |
| 0.142         | 2.0   | 15000  | 0.1977          | 0.9436   |
| 0.1261        | 3.0   | 22500  | 0.2154          | 0.9470   |
| 0.0881        | 4.0   | 30000  | 0.2706          | 0.9462   |
| 0.0656        | 5.0   | 37500  | 0.2971          | 0.9397   |
| 0.045         | 6.0   | 45000  | 0.3819          | 0.9393   |
| 0.0377        | 7.0   | 52500  | 0.4230          | 0.9380   |
| 0.0269        | 8.0   | 60000  | 0.4504          | 0.9412   |
| 0.0284        | 9.0   | 67500  | 0.4358          | 0.9432   |
| 0.0155        | 10.0  | 75000  | 0.4849          | 0.9412   |
| 0.0159        | 11.0  | 82500  | 0.5002          | 0.9430   |
| 0.0112        | 12.0  | 90000  | 0.5023          | 0.9418   |
| 0.007         | 13.0  | 97500  | 0.4904          | 0.9425   |
| 0.0082        | 14.0  | 105000 | 0.5366          | 0.9457   |
| 0.005         | 15.0  | 112500 | 0.5462          | 0.9438   |
| 0.0046        | 16.0  | 120000 | 0.5607          | 0.9439   |
| 0.001         | 17.0  | 127500 | 0.5958          | 0.9441   |
| 0.0038        | 18.0  | 135000 | 0.6113          | 0.9459   |
| 0.0009        | 19.0  | 142500 | 0.6338          | 0.9463   |
| 0.0002        | 20.0  | 150000 | 0.6430          | 0.9454   |
### Framework versions
- Transformers 4.33.2
- Pytorch 2.0.1+cu117
- Datasets 2.14.5
- Tokenizers 0.13.3