# polynomial_1450_7e-4_32b_w0.2
This model is a fine-tuned version of [gpt2](https://huggingface.co/gpt2) on an unspecified dataset. It achieves the following results on the evaluation set:
- Loss: 2.8711
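For context, assuming the reported loss is the mean per-token cross-entropy in nats (the standard `transformers` causal-LM objective), it corresponds to a perplexity of exp(2.8711) ≈ 17.7. A one-line sketch:

```python
import math

# Perplexity from mean cross-entropy loss (assumes loss is in nats).
eval_loss = 2.8711
print(f"perplexity = {math.exp(eval_loss):.2f}")  # ~17.66
```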
## Model description
More information needed
## Intended uses & limitations
More information needed
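The card does not document usage, so the following is only a minimal sketch of loading the checkpoint for text generation with the `transformers` causal-LM API; the repo id is a placeholder, not a confirmed Hub path.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo id: replace with the actual Hub path of this checkpoint.
model_id = "polynomial_1450_7e-4_32b_w0.2"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    pad_token_id=tokenizer.eos_token_id,  # GPT-2 has no dedicated pad token
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```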
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):
- learning_rate: 0.0007
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- gradient_accumulation_steps: 10
- total_train_batch_size: 320
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: polynomial
- lr_scheduler_warmup_steps: 250
- training_steps: 1450
- mixed_precision_training: Native AMP
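As referenced above, a minimal sketch of how these values map onto `transformers.TrainingArguments`; the output directory is a placeholder and the model/dataset wiring is omitted. Note the effective train batch size of 320 is 32 per device × 10 gradient-accumulation steps (single device assumed).

```python
from transformers import TrainingArguments

# Sketch only: output_dir is a placeholder; Trainer/model/dataset setup omitted.
training_args = TrainingArguments(
    output_dir="polynomial_1450_7e-4_32b_w0.2",
    learning_rate=7e-4,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    seed=42,
    gradient_accumulation_steps=10,  # 32 * 10 = 320 effective train batch size
    lr_scheduler_type="polynomial",
    warmup_steps=250,
    max_steps=1450,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    fp16=True,  # Native AMP mixed-precision training
)
```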
### Training results
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 9.0501 | 0.2058 | 50 | 7.2516 |
| 6.6334 | 0.4117 | 100 | 6.1191 |
| 5.8403 | 0.6175 | 150 | 5.5171 |
| 5.347 | 0.8234 | 200 | 5.0809 |
| 4.9621 | 1.0292 | 250 | 4.7655 |
| 4.5909 | 1.2351 | 300 | 4.4418 |
| 4.3142 | 1.4409 | 350 | 4.1684 |
| 4.0577 | 1.6468 | 400 | 3.8857 |
| 3.7934 | 1.8526 | 450 | 3.6317 |
| 3.5603 | 2.0585 | 500 | 3.4786 |
| 3.3743 | 2.2643 | 550 | 3.3722 |
| 3.3003 | 2.4702 | 600 | 3.2932 |
| 3.2338 | 2.6760 | 650 | 3.2353 |
| 3.1788 | 2.8818 | 700 | 3.1763 |
| 3.0774 | 3.0877 | 750 | 3.1289 |
| 2.9735 | 3.2935 | 800 | 3.0953 |
| 2.9351 | 3.4994 | 850 | 3.0626 |
| 2.9367 | 3.7052 | 900 | 3.0310 |
| 2.9088 | 3.9111 | 950 | 3.0032 |
| 2.7944 | 4.1169 | 1000 | 2.9830 |
| 2.7402 | 4.3228 | 1050 | 2.9669 |
| 2.7293 | 4.5286 | 1100 | 2.9475 |
| 2.7184 | 4.7345 | 1150 | 2.9275 |
| 2.7029 | 4.9403 | 1200 | 2.9098 |
| 2.6065 | 5.1462 | 1250 | 2.9024 |
| 2.5699 | 5.3520 | 1300 | 2.8938 |
| 2.5511 | 5.5578 | 1350 | 2.8836 |
| 2.5503 | 5.7637 | 1400 | 2.8756 |
| 2.5435 | 5.9695 | 1450 | 2.8711 |
## Framework versions
- Transformers 4.40.1
- PyTorch 2.3.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1