metadata

license: mit
base_model: gpt2
tags:
  - generated_from_trainer
model-index:
  - name: polynomial_1450_7e-4_16b_w0.075
    results: []

polynomial_1450_7e-4_16b_w0.075

This model is a fine-tuned version of gpt2 on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 3.0229 (validation loss at the final logged step, 1450)

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

More information needed

Training results

Training Loss	Epoch	Step	Validation Loss
9.0635	0.1029	50	7.2772
6.7176	0.2058	100	6.2557
6.0147	0.3088	150	5.7199
5.5539	0.4117	200	5.3464
5.2292	0.5146	250	5.0625
4.9374	0.6175	300	4.7728
4.6985	0.7205	350	4.5613
4.4993	0.8234	400	4.3770
4.3227	0.9263	450	4.1914
4.1342	1.0292	500	4.0022
3.8927	1.1322	550	3.8166
3.757	1.2351	600	3.6654
3.6277	1.3380	650	3.5614
3.5379	1.4409	700	3.4772
3.4642	1.5438	750	3.4076
3.395	1.6468	800	3.3542
3.3287	1.7497	850	3.3034
3.2872	1.8526	900	3.2609
3.2545	1.9555	950	3.2268
3.1229	2.0585	1000	3.1925
3.0573	2.1614	1050	3.1616
3.0339	2.2643	1100	3.1372
3.0204	2.3672	1150	3.1133
2.99	2.4702	1200	3.0949
2.9809	2.5731	1250	3.0720
2.9524	2.6760	1300	3.0536
2.9267	2.7789	1350	3.0392
2.9453	2.8818	1400	3.0295
2.93	2.9848	1450	3.0229
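
The validation losses above are easier to compare when converted to perplexity. The following is a minimal sketch, assuming the reported loss is the mean token-level cross-entropy in nats (the usual convention for causal language model fine-tuning, but not stated in this card). The `rows` list and the `perplexity` helper are illustrative names, and only a few rows of the table are copied in.

```python
import math

# (step, epoch, validation_loss) rows copied from the training results table.
rows = [
    (50, 0.1029, 7.2772),
    (500, 1.0292, 4.0022),
    (1000, 2.0585, 3.1925),
    (1450, 2.9848, 3.0229),
]


def perplexity(loss: float) -> float:
    """Perplexity for a mean cross-entropy loss, assumed to be in nats per token."""
    return math.exp(loss)


for step, epoch, val_loss in rows:
    print(f"step {step:>4}  epoch {epoch:.4f}  val_loss {val_loss:.4f}  ppl {perplexity(val_loss):.2f}")

# Derived quantities for the final logged checkpoint.
final_step, final_epoch, final_loss = rows[-1]
final_ppl = perplexity(final_loss)

# Optimizer steps per epoch implied by the step and epoch columns.
steps_per_epoch = final_step / final_epoch
```

Under that assumption, the final validation loss of 3.0229 corresponds to a perplexity of about 20.6, and the step and epoch columns imply roughly 486 optimizer steps per epoch.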