
47163343_0

This model is a fine-tuned version of openai-community/gpt2 on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6417
  • Accuracy: 0.0001

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1.41e-05
  • train_batch_size: 32
  • eval_batch_size: 4
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 2
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 256
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • training_steps: 2000
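
The batch-size figures above are internally consistent: the total train batch size is the per-device batch size times the gradient-accumulation steps times the number of devices, and the total eval batch size is the per-device eval batch size times the number of devices (no gradient accumulation at eval time). A minimal sketch of that arithmetic:

```python
# Reproduce the effective batch sizes listed in the hyperparameters.
train_batch_size = 32            # per device
eval_batch_size = 4              # per device
gradient_accumulation_steps = 4
num_devices = 2

total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices
total_eval_batch_size = eval_batch_size * num_devices

print(total_train_batch_size)  # 256, matching total_train_batch_size above
print(total_eval_batch_size)   # 8, matching total_eval_batch_size above
```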

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 1.0667 | 0.48 | 25 | 0.9624 | 0.0001 |
| 0.854 | 0.97 | 50 | 0.8076 | 0.0001 |
| 0.7965 | 1.45 | 75 | 0.7675 | 0.0002 |
| 0.7817 | 1.93 | 100 | 0.7461 | 0.0001 |
| 0.7668 | 2.42 | 125 | 0.7326 | 0.0001 |
| 0.7533 | 2.9 | 150 | 0.7225 | 0.0001 |
| 0.7479 | 3.38 | 175 | 0.7150 | 0.0001 |
| 0.7325 | 3.86 | 200 | 0.7104 | 0.0001 |
| 0.7397 | 4.35 | 225 | 0.7055 | 0.0001 |
| 0.7255 | 4.83 | 250 | 0.7020 | 0.0001 |
| 0.7182 | 5.31 | 275 | 0.6983 | 0.0001 |
| 0.7161 | 5.8 | 300 | 0.6955 | 0.0001 |
| 0.7162 | 6.28 | 325 | 0.6925 | 0.0001 |
| 0.7043 | 6.76 | 350 | 0.6901 | 0.0001 |
| 0.7076 | 7.25 | 375 | 0.6868 | 0.0001 |
| 0.7042 | 7.73 | 400 | 0.6844 | 0.0001 |
| 0.6975 | 8.21 | 425 | 0.6820 | 0.0001 |
| 0.7021 | 8.7 | 450 | 0.6799 | 0.0001 |
| 0.6955 | 9.18 | 475 | 0.6778 | 0.0001 |
| 0.6868 | 9.66 | 500 | 0.6763 | 0.0001 |
| 0.6866 | 10.14 | 525 | 0.6744 | 0.0001 |
| 0.6903 | 10.63 | 550 | 0.6723 | 0.0001 |
| 0.6786 | 11.11 | 575 | 0.6709 | 0.0001 |
| 0.6843 | 11.59 | 600 | 0.6695 | 0.0001 |
| 0.6835 | 12.08 | 625 | 0.6680 | 0.0001 |
| 0.6819 | 12.56 | 650 | 0.6669 | 0.0001 |
| 0.6804 | 13.04 | 675 | 0.6656 | 0.0001 |
| 0.6748 | 13.53 | 700 | 0.6642 | 0.0001 |
| 0.674 | 14.01 | 725 | 0.6637 | 0.0001 |
| 0.6731 | 14.49 | 750 | 0.6624 | 0.0001 |
| 0.681 | 14.98 | 775 | 0.6611 | 0.0001 |
| 0.6763 | 15.46 | 800 | 0.6602 | 0.0001 |
| 0.677 | 15.94 | 825 | 0.6597 | 0.0001 |
| 0.6725 | 16.43 | 850 | 0.6583 | 0.0001 |
| 0.6669 | 16.91 | 875 | 0.6574 | 0.0001 |
| 0.6682 | 17.39 | 900 | 0.6567 | 0.0001 |
| 0.669 | 17.87 | 925 | 0.6559 | 0.0001 |
| 0.6647 | 18.36 | 950 | 0.6554 | 0.0001 |
| 0.664 | 18.84 | 975 | 0.6549 | 0.0001 |
| 0.6563 | 19.32 | 1000 | 0.6542 | 0.0001 |
| 0.6656 | 19.81 | 1025 | 0.6533 | 0.0001 |
| 0.6634 | 20.29 | 1050 | 0.6530 | 0.0001 |
| 0.6592 | 20.77 | 1075 | 0.6521 | 0.0001 |
| 0.6558 | 21.26 | 1100 | 0.6514 | 0.0001 |
| 0.6664 | 21.74 | 1125 | 0.6511 | 0.0001 |
| 0.6561 | 22.22 | 1150 | 0.6504 | 0.0001 |
| 0.6634 | 22.71 | 1175 | 0.6499 | 0.0001 |
| 0.6679 | 23.19 | 1200 | 0.6491 | 0.0001 |
| 0.6625 | 23.67 | 1225 | 0.6489 | 0.0001 |
| 0.6619 | 24.15 | 1250 | 0.6483 | 0.0001 |
| 0.6495 | 24.64 | 1275 | 0.6479 | 0.0001 |
| 0.6547 | 25.12 | 1300 | 0.6474 | 0.0001 |
| 0.6649 | 25.6 | 1325 | 0.6469 | 0.0001 |
| 0.6551 | 26.09 | 1350 | 0.6466 | 0.0001 |
| 0.6547 | 26.57 | 1375 | 0.6463 | 0.0001 |
| 0.6546 | 27.05 | 1400 | 0.6458 | 0.0001 |
| 0.6576 | 27.54 | 1425 | 0.6456 | 0.0001 |
| 0.6576 | 28.02 | 1450 | 0.6452 | 0.0001 |
| 0.6568 | 28.5 | 1475 | 0.6448 | 0.0001 |
| 0.6596 | 28.99 | 1500 | 0.6446 | 0.0001 |
| 0.6538 | 29.47 | 1525 | 0.6443 | 0.0001 |
| 0.6488 | 29.95 | 1550 | 0.6440 | 0.0001 |
| 0.6433 | 30.43 | 1575 | 0.6437 | 0.0001 |
| 0.6583 | 30.92 | 1600 | 0.6435 | 0.0001 |
| 0.6575 | 31.4 | 1625 | 0.6432 | 0.0001 |
| 0.6465 | 31.88 | 1650 | 0.6430 | 0.0001 |
| 0.6495 | 32.37 | 1675 | 0.6429 | 0.0001 |
| 0.6487 | 32.85 | 1700 | 0.6427 | 0.0001 |
| 0.6571 | 33.33 | 1725 | 0.6426 | 0.0001 |
| 0.6463 | 33.82 | 1750 | 0.6425 | 0.0001 |
| 0.648 | 34.3 | 1775 | 0.6423 | 0.0001 |
| 0.6537 | 34.78 | 1800 | 0.6422 | 0.0001 |
| 0.6564 | 35.27 | 1825 | 0.6420 | 0.0001 |
| 0.6491 | 35.75 | 1850 | 0.6420 | 0.0001 |
| 0.6549 | 36.23 | 1875 | 0.6419 | 0.0001 |
| 0.6524 | 36.71 | 1900 | 0.6418 | 0.0001 |
| 0.6522 | 37.2 | 1925 | 0.6418 | 0.0001 |
| 0.655 | 37.68 | 1950 | 0.6417 | 0.0001 |
| 0.6614 | 38.16 | 1975 | 0.6417 | 0.0001 |
| 0.6451 | 38.65 | 2000 | 0.6417 | 0.0001 |
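
With `lr_scheduler_type: linear` over 2000 training steps, the learning rate decays linearly from its initial value toward zero. A rough sketch of that schedule, assuming zero warmup steps (the card does not list a warmup setting, so this is an assumption):

```python
# Sketch of a linear LR schedule with no warmup (warmup not stated in the card).
base_lr = 1.41e-5
training_steps = 2000

def lr_at(step):
    """Learning rate after `step` optimizer steps under linear decay to zero."""
    return base_lr * max(0.0, 1.0 - step / training_steps)

print(lr_at(0))     # 1.41e-05 (initial learning rate)
print(lr_at(1000))  # 7.05e-06 (half of the base LR at the midpoint)
print(lr_at(2000))  # 0.0 at the final step
```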

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.0.0+cu118
  • Datasets 2.16.1
  • Tokenizers 0.15.1