gpt2-ear_1-hs_cn_decay

This model is a fine-tuned version of gpt2-medium on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.5369
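If the reported evaluation loss is a mean token-level cross-entropy, it corresponds to a perplexity of roughly exp(0.5369) ≈ 1.71. This is only indicative: the negative training-loss values in the table below suggest the training objective includes an additional regularization term, so the evaluation loss may not be a plain cross-entropy.

```python
import math

# Hedged sketch: convert the reported evaluation loss to perplexity,
# ASSUMING it is a mean token-level cross-entropy (not confirmed by the
# card; the negative training losses hint at an extra regularizer).
eval_loss = 0.5369
perplexity = math.exp(eval_loss)
print(f"perplexity ≈ {perplexity:.2f}")  # ≈ 1.71
```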

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 8
  • eval_batch_size: 4
  • seed: 21
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 3.0
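As a sketch only, the hyperparameters above would map onto a `transformers.TrainingArguments` configuration roughly as follows. The `output_dir` value is a placeholder, and the actual training script used for this model is not given in the card.

```python
from transformers import TrainingArguments

# Sketch of a TrainingArguments configuration mirroring the listed
# hyperparameters. "output_dir" is a placeholder, not taken from the card.
training_args = TrainingArguments(
    output_dir="gpt2-ear_1-hs_cn_decay",  # placeholder path
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=4,
    seed=21,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    num_train_epochs=3.0,
)
```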

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 74.0475 | 0.02 | 10 | 72.6562 |
| 45.5005 | 0.04 | 20 | 34.4060 |
| 10.921 | 0.06 | 30 | 11.2525 |
| 2.7976 | 0.08 | 40 | 4.4890 |
| 0.0358 | 0.1 | 50 | 2.1808 |
| -1.4128 | 0.12 | 60 | 1.1029 |
| -1.7651 | 0.14 | 70 | 0.9684 |
| -1.9542 | 0.16 | 80 | 0.7785 |
| -2.1013 | 0.18 | 90 | 0.6533 |
| -2.14 | 0.2 | 100 | 0.6666 |
| -2.1001 | 0.22 | 110 | 0.6334 |
| -2.1169 | 0.24 | 120 | 0.5926 |
| -2.1216 | 0.26 | 130 | 0.5903 |
| -2.1191 | 0.28 | 140 | 0.5741 |
| -2.1319 | 0.3 | 150 | 0.5702 |
| -2.122 | 0.32 | 160 | 0.5679 |
| -2.0754 | 0.34 | 170 | 0.5671 |
| -2.06 | 0.36 | 180 | 0.5630 |
| -2.0477 | 0.38 | 190 | 0.5591 |
| -2.0569 | 0.4 | 200 | 0.5546 |
| -1.9666 | 0.42 | 210 | 0.5513 |
| -1.9673 | 0.44 | 220 | 0.5558 |
| -2.0044 | 0.46 | 230 | 0.5560 |
| -1.9923 | 0.48 | 240 | 0.5507 |
| -1.9056 | 0.5 | 250 | 0.5494 |
| -1.9658 | 0.52 | 260 | 0.5498 |
| -1.9104 | 0.54 | 270 | 0.5474 |
| -1.8967 | 0.56 | 280 | 0.5459 |
| -1.8759 | 0.58 | 290 | 0.5458 |
| -1.8432 | 0.6 | 300 | 0.5477 |
| -1.835 | 0.62 | 310 | 0.5446 |
| -1.7823 | 0.64 | 320 | 0.5448 |
| -1.8058 | 0.66 | 330 | 0.5412 |
| -1.8138 | 0.68 | 340 | 0.5375 |
| -1.7656 | 0.7 | 350 | 0.5385 |
| -1.7174 | 0.72 | 360 | 0.5376 |
| -1.7461 | 0.74 | 370 | 0.5365 |
| -1.7145 | 0.76 | 380 | 0.5342 |
| -1.6737 | 0.78 | 390 | 0.5326 |
| -1.7162 | 0.8 | 400 | 0.5342 |
| -1.6792 | 0.82 | 410 | 0.5361 |
| -1.6783 | 0.84 | 420 | 0.5369 |
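A few back-of-the-envelope figures can be inferred from the log above: the epoch counter advances by 0.02 every 10 steps, which implies 500 steps per epoch, 1500 total steps over the 3 planned epochs, and 150 warmup steps at `warmup_ratio = 0.1`. These are inferences from the logged ratios, not values stated in the card.

```python
# Figures inferred from the logged epoch/step ratio
# (epoch advances by 0.02 every 10 steps => 500 steps per epoch).
steps_per_epoch = round(10 / 0.02)          # 500
total_steps = steps_per_epoch * 3           # 1500, from num_epochs = 3.0
warmup_steps = round(0.1 * total_steps)     # 150, from warmup_ratio = 0.1
train_examples = steps_per_epoch * 8        # ~4000, from train_batch_size = 8
print(steps_per_epoch, total_steps, warmup_steps, train_examples)
```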

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.6.1
  • Tokenizers 0.12.1