gpt2-wikitext2

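This model is a fine-tuned version of openai-community/gpt2. The training dataset is not recorded in this card (the model name suggests WikiText-2, but that is not confirmed here). On the evaluation set it reaches a validation loss of 5.6547 after the final epoch (epoch 60).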

Model description

More information needed

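The hyperparameters used during training are not listed in this card. The results table below shows 60 epochs of 55 optimizer steps each (3,300 steps in total); the training loss column changes only after each 500-step boundary, which is consistent with logging every 500 steps.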

Training Loss	Epoch	Step	Validation Loss
No log	1.0	55	9.1458
No log	2.0	110	8.3471
No log	3.0	165	7.7884
No log	4.0	220	7.3751
No log	5.0	275	7.0487
No log	6.0	330	6.7857
No log	7.0	385	6.5840
No log	8.0	440	6.4196
No log	9.0	495	6.2584
7.3272	10.0	550	6.1628
7.3272	11.0	605	6.0521
7.3272	12.0	660	5.9861
7.3272	13.0	715	5.9223
7.3272	14.0	770	5.8760
7.3272	15.0	825	5.8246
7.3272	16.0	880	5.7813
7.3272	17.0	935	5.7663
7.3272	18.0	990	5.7275
5.2638	19.0	1045	5.7022
5.2638	20.0	1100	5.6905
5.2638	21.0	1155	5.6803
5.2638	22.0	1210	5.6740
5.2638	23.0	1265	5.6631
5.2638	24.0	1320	5.6461
5.2638	25.0	1375	5.6326
5.2638	26.0	1430	5.6280
5.2638	27.0	1485	5.6408
4.5099	28.0	1540	5.6194
4.5099	29.0	1595	5.6255
4.5099	30.0	1650	5.6218
4.5099	31.0	1705	5.6127
4.5099	32.0	1760	5.6140
4.5099	33.0	1815	5.6281
4.5099	34.0	1870	5.6305
4.5099	35.0	1925	5.6139
4.5099	36.0	1980	5.6331
4.0571	37.0	2035	5.6323
4.0571	38.0	2090	5.6137
4.0571	39.0	2145	5.6258
4.0571	40.0	2200	5.6322
4.0571	41.0	2255	5.6392
4.0571	42.0	2310	5.6308
4.0571	43.0	2365	5.6329
4.0571	44.0	2420	5.6373
4.0571	45.0	2475	5.6407
3.7638	46.0	2530	5.6489
3.7638	47.0	2585	5.6489
3.7638	48.0	2640	5.6445
3.7638	49.0	2695	5.6428
3.7638	50.0	2750	5.6425
3.7638	51.0	2805	5.6450
3.7638	52.0	2860	5.6566
3.7638	53.0	2915	5.6504
3.7638	54.0	2970	5.6494
3.5759	55.0	3025	5.6538
3.5759	56.0	3080	5.6555
3.5759	57.0	3135	5.6529
3.5759	58.0	3190	5.6567
3.5759	59.0	3245	5.6551
3.5759	60.0	3300	5.6547
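
The table can be summarized programmatically. The snippet below is a minimal sketch, not part of the original training code: it copies the validation losses from the table, finds the epoch with the lowest value, and converts losses to perplexity. It assumes the reported loss is the mean per-token cross-entropy in nats, which is how a causal language modeling loss is usually reported; the names `VAL_LOSS`, `best_epoch`, and `perplexity` are illustrative.

```python
import math

# Steps per epoch, read off the table (step 55 at epoch 1, 110 at epoch 2, ...).
STEPS_PER_EPOCH = 55

# Validation loss per epoch, copied from the table (epochs 1 through 60).
VAL_LOSS = [
    9.1458, 8.3471, 7.7884, 7.3751, 7.0487, 6.7857, 6.5840, 6.4196, 6.2584, 6.1628,
    6.0521, 5.9861, 5.9223, 5.8760, 5.8246, 5.7813, 5.7663, 5.7275, 5.7022, 5.6905,
    5.6803, 5.6740, 5.6631, 5.6461, 5.6326, 5.6280, 5.6408, 5.6194, 5.6255, 5.6218,
    5.6127, 5.6140, 5.6281, 5.6305, 5.6139, 5.6331, 5.6323, 5.6137, 5.6258, 5.6322,
    5.6392, 5.6308, 5.6329, 5.6373, 5.6407, 5.6489, 5.6489, 5.6445, 5.6428, 5.6425,
    5.6450, 5.6566, 5.6504, 5.6494, 5.6538, 5.6555, 5.6529, 5.6567, 5.6551, 5.6547,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def perplexity(loss):
    """Perplexity, assuming `loss` is mean cross-entropy in nats per token."""
    return math.exp(loss)


epoch, loss = best_epoch(VAL_LOSS)
print(f"best epoch: {epoch} (step {epoch * STEPS_PER_EPOCH}), val loss {loss:.4f}")
print(f"perplexity at best epoch:  {perplexity(loss):.1f}")
print(f"perplexity at final epoch: {perplexity(VAL_LOSS[-1]):.1f}")
```

On this table the minimum falls at epoch 31 (step 1705, validation loss 5.6127). From there the validation loss drifts slightly upward while the training loss keeps falling from 4.5099 to 3.5759, which suggests the later epochs mostly overfit.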

Safetensors
Model size: 124M params
Tensor type: F32
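
As a rough check on what the parameter count and tensor type listed above mean for storage, the sketch below estimates the size of the raw weights. It is an approximation under stated assumptions: 124M is a rounded count, file metadata is ignored, and `weight_size_mb` is a hypothetical helper, not part of any library.

```python
# Bytes per element for a few common tensor types.
BYTES_PER_DTYPE = {"F32": 4, "F16": 2, "BF16": 2}


def weight_size_mb(n_params, dtype="F32"):
    """Approximate size of the raw weights in megabytes (10**6 bytes)."""
    return n_params * BYTES_PER_DTYPE[dtype] / 1e6


# 124M parameters stored as F32, as listed in this card.
print(f"{weight_size_mb(124_000_000, 'F32'):.0f} MB")
```

Under the same assumptions, storing the weights in a 16-bit type would halve that figure.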
