model_v1_complete_training_wt_init_48_tiny

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 3.6497
  • Accuracy: 0.3896
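
Assuming the reported loss is the standard per-token cross-entropy in nats (the Transformers default for language modeling), the corresponding evaluation perplexity would be exp(3.6497) ≈ 38.5:

```python
import math

eval_loss = 3.6497  # evaluation loss reported above
# Assumption: per-token cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.1f}")  # ≈ 38.5
```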

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • num_epochs: 50
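
For orientation, here is a minimal sketch of how these settings would map onto `TrainingArguments` in Transformers 4.30. The `output_dir` is a placeholder, and the mapping assumes `train_batch_size` is per device; neither detail is confirmed by the card:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="model_v1_complete_training_wt_init_48_tiny",  # hypothetical path
    learning_rate=1e-5,
    per_device_train_batch_size=64,   # assumes the card's batch size is per device
    per_device_eval_batch_size=64,
    seed=10,
    adam_beta1=0.9,                   # the card says "Adam"; Trainer uses AdamW
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    num_train_epochs=50,
)
```

The `distributed_type: multi-GPU` entry is handled by the launch command (e.g. `torchrun --nproc_per_node=<n> train.py`) rather than by these arguments.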

Training results

| Training Loss | Epoch | Step    | Validation Loss | Accuracy |
|:-------------:|:-----:|:-------:|:---------------:|:--------:|
| 6.0224        | 0.33  | 30000   | 5.9447          | 0.1517   |
| 5.1853        | 0.66  | 60000   | 4.9635          | 0.2615   |
| 4.9483        | 0.98  | 90000   | 4.7016          | 0.2830   |
| 4.7679        | 1.31  | 120000  | 4.5154          | 0.2992   |
| 4.6448        | 1.64  | 150000  | 4.3884          | 0.3100   |
| 4.5688        | 1.97  | 180000  | 4.3095          | 0.3175   |
| 4.5102        | 2.29  | 210000  | 4.2511          | 0.3236   |
| 4.4662        | 2.62  | 240000  | 4.2038          | 0.3294   |
| 4.4269        | 2.95  | 270000  | 4.1677          | 0.3336   |
| 4.3982        | 3.28  | 300000  | 4.1367          | 0.3370   |
| 4.3714        | 3.60  | 330000  | 4.1103          | 0.3399   |
| 4.3493        | 3.93  | 360000  | 4.0869          | 0.3423   |
| 4.3303        | 4.26  | 390000  | 4.0680          | 0.3439   |
| 4.3131        | 4.59  | 420000  | 4.0467          | 0.3461   |
| 4.2875        | 4.92  | 450000  | 4.0292          | 0.3477   |
| 4.2629        | 5.24  | 480000  | 4.0109          | 0.3497   |
| 4.2413        | 5.57  | 510000  | 3.9931          | 0.3515   |
| 4.2282        | 5.90  | 540000  | 3.9759          | 0.3536   |
| 4.2003        | 6.23  | 570000  | 3.9608          | 0.3551   |
| 4.1867        | 6.55  | 600000  | 3.9445          | 0.3571   |
| 4.1607        | 6.88  | 630000  | 3.9273          | 0.3590   |
| 4.1511        | 7.21  | 660000  | 3.9130          | 0.3606   |
| 4.1335        | 7.54  | 690000  | 3.8971          | 0.3622   |
| 4.1158        | 7.87  | 720000  | 3.8798          | 0.3642   |
| 4.0970        | 8.19  | 750000  | 3.8635          | 0.3663   |
| 4.0831        | 8.52  | 780000  | 3.8494          | 0.3679   |
| 4.0756        | 8.85  | 810000  | 3.8334          | 0.3696   |
| 4.0533        | 9.18  | 840000  | 3.8201          | 0.3712   |
| 4.0517        | 9.50  | 870000  | 3.8080          | 0.3724   |
| 4.0325        | 9.83  | 900000  | 3.7975          | 0.3734   |
| 4.0142        | 10.16 | 930000  | 3.7872          | 0.3748   |
| 4.0124        | 10.49 | 960000  | 3.7788          | 0.3759   |
| 4.0076        | 10.81 | 990000  | 3.7679          | 0.3767   |
| 3.9919        | 11.14 | 1020000 | 3.7609          | 0.3775   |
| 3.9888        | 11.47 | 1050000 | 3.7550          | 0.3783   |
| 3.9796        | 11.80 | 1080000 | 3.7481          | 0.3789   |
| 3.9742        | 12.13 | 1110000 | 3.7414          | 0.3796   |
| 3.9667        | 12.45 | 1140000 | 3.7370          | 0.3802   |
| 3.9652        | 12.78 | 1170000 | 3.7289          | 0.3810   |
| 3.9548        | 13.11 | 1200000 | 3.7278          | 0.3812   |
| 3.9556        | 13.44 | 1230000 | 3.7213          | 0.3817   |
| 3.9444        | 13.76 | 1260000 | 3.7152          | 0.3825   |
| 3.9428        | 14.09 | 1290000 | 3.7120          | 0.3827   |
| 3.9424        | 14.42 | 1320000 | 3.7072          | 0.3834   |
| 3.9389        | 14.75 | 1350000 | 3.7047          | 0.3836   |
| 3.9360        | 15.07 | 1380000 | 3.6998          | 0.3844   |
| 3.9246        | 15.40 | 1410000 | 3.6968          | 0.3847   |
| 3.9281        | 15.73 | 1440000 | 3.6925          | 0.3851   |
| 3.9177        | 16.06 | 1470000 | 3.6916          | 0.3849   |
| 3.9216        | 16.39 | 1500000 | 3.6870          | 0.3855   |
| 3.9141        | 16.71 | 1530000 | 3.6822          | 0.3863   |
| 3.9154        | 17.04 | 1560000 | 3.6804          | 0.3864   |
| 3.9145        | 17.37 | 1590000 | 3.6795          | 0.3863   |
| 3.9103        | 17.70 | 1620000 | 3.6734          | 0.3869   |
| 3.9079        | 18.02 | 1650000 | 3.6724          | 0.3873   |
| 3.9010        | 18.35 | 1680000 | 3.6707          | 0.3872   |
| 3.9015        | 18.68 | 1710000 | 3.6695          | 0.3873   |
| 3.8987        | 19.01 | 1740000 | 3.6672          | 0.3877   |
| 3.8929        | 19.33 | 1770000 | 3.6647          | 0.3878   |
| 3.8920        | 19.66 | 1800000 | 3.6609          | 0.3884   |
| 3.8906        | 19.99 | 1830000 | 3.6595          | 0.3886   |
| 3.8923        | 20.32 | 1860000 | 3.6594          | 0.3885   |
| 3.8901        | 20.65 | 1890000 | 3.6541          | 0.3893   |
| 3.8853        | 20.97 | 1920000 | 3.6539          | 0.3891   |
| 3.8808        | 21.30 | 1950000 | 3.6527          | 0.3894   |
| 3.8835        | 21.63 | 1980000 | 3.6497          | 0.3896   |

Framework versions

  • Transformers 4.30.2
  • PyTorch 1.14.0a0+410ce96
  • Datasets 2.13.0
  • Tokenizers 0.13.3
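
The card gives no usage example; the following loading sketch is hypothetical, assuming a standard Transformers checkpoint layout. The repo id is a placeholder, and `AutoModelForCausalLM` is an assumption (the card does not state whether this is a causal or masked language model):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id -- substitute the actual Hub path of this model.
repo_id = "your-org/model_v1_complete_training_wt_init_48_tiny"

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)  # architecture assumed
```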