
125m-dalio-book-handwritten-io-constant-1e-6-v2

This model is a fine-tuned version of facebook/opt-125m on the AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0859
  • Accuracy: 0.2336
  • Perplexity: 21.8880
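
The reported perplexity is (up to rounding) just the exponential of the evaluation cross-entropy loss, which is a quick way to sanity-check the numbers above:

```python
import math

# Perplexity is the exponential of the cross-entropy loss:
# perplexity = exp(loss). Checking against the reported metrics:
eval_loss = 3.0859
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 21.89, matching the reported 21.8880
```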

Model description

More information needed

Intended uses & limitations

More information needed
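
A minimal usage sketch (an assumption, not an official example from the author): loading the checkpoint by its Hub repository id with the `transformers` Auto classes and generating a continuation.

```python
# Repository id as it appears on the Hugging Face Hub.
model_id = "AlekseyKorshuk/125m-dalio-book-handwritten-io-constant-1e-6-v2"


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a continuation with the fine-tuned OPT-125m checkpoint.

    Requires `pip install transformers torch`; the import is deferred so the
    module can be loaded without those dependencies installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```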

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 1.0
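
A sketch (assumed, not the author's exact launch script) of how these values map onto Hugging Face `TrainingArguments` keywords. Note that `train_batch_size: 1` above is per device; with 8 GPUs and no gradient accumulation, the effective batch size is 1 × 8 = 8, matching `total_train_batch_size`.

```python
# Hypothetical mapping of the listed hyperparameters onto
# `transformers.TrainingArguments` keyword names (kept as a plain dict here).
training_args = dict(
    learning_rate=1e-6,
    per_device_train_batch_size=1,  # "train_batch_size" above is per device
    per_device_eval_batch_size=1,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=1.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Effective (total) batch size = per-device batch size x number of devices.
num_devices = 8
total_train_batch_size = training_args["per_device_train_batch_size"] * num_devices
print(total_train_batch_size)  # 8, as listed above
```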

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Perplexity |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|
| 3.3352 | 0.01 | 1 | 3.1738 | 0.2305 | 23.8988 |
| 3.3091 | 0.03 | 2 | 3.1738 | 0.2305 | 23.8988 |
| 3.3347 | 0.04 | 3 | 3.1738 | 0.2305 | 23.8988 |
| 3.1445 | 0.05 | 4 | 3.1738 | 0.2305 | 23.8988 |
| 2.8918 | 0.07 | 5 | 3.1738 | 0.2305 | 23.8988 |
| 3.2068 | 0.08 | 6 | 3.1738 | 0.2305 | 23.8988 |
| 3.6245 | 0.09 | 7 | 3.1719 | 0.2305 | 23.8522 |
| 3.2256 | 0.11 | 8 | 3.1719 | 0.2305 | 23.8522 |
| 2.9991 | 0.12 | 9 | 3.1699 | 0.2305 | 23.8056 |
| 3.3257 | 0.13 | 10 | 3.1680 | 0.2306 | 23.7592 |
| 3.1199 | 0.15 | 11 | 3.1660 | 0.2306 | 23.7128 |
| 3.3735 | 0.16 | 12 | 3.1660 | 0.2306 | 23.7128 |
| 3.0051 | 0.17 | 13 | 3.1641 | 0.2307 | 23.6665 |
| 3.2695 | 0.19 | 14 | 3.1621 | 0.2308 | 23.6204 |
| 3.2004 | 0.2 | 15 | 3.1602 | 0.2309 | 23.5743 |
| 3.2075 | 0.21 | 16 | 3.1582 | 0.2308 | 23.5283 |
| 3.321 | 0.23 | 17 | 3.1562 | 0.2308 | 23.4824 |
| 3.4026 | 0.24 | 18 | 3.1543 | 0.2309 | 23.4366 |
| 3.0383 | 0.25 | 19 | 3.1523 | 0.2309 | 23.3908 |
| 3.166 | 0.27 | 20 | 3.1504 | 0.2309 | 23.3452 |
| 3.144 | 0.28 | 21 | 3.1484 | 0.2310 | 23.2996 |
| 3.1624 | 0.29 | 22 | 3.1484 | 0.2310 | 23.2996 |
| 3.0332 | 0.31 | 23 | 3.1465 | 0.2310 | 23.2542 |
| 3.3745 | 0.32 | 24 | 3.1445 | 0.2311 | 23.2088 |
| 3.0823 | 0.33 | 25 | 3.1426 | 0.2312 | 23.1635 |
| 3.6021 | 0.35 | 26 | 3.1406 | 0.2312 | 23.1183 |
| 3.1125 | 0.36 | 27 | 3.1387 | 0.2313 | 23.0732 |
| 3.1406 | 0.37 | 28 | 3.1387 | 0.2314 | 23.0732 |
| 3.1736 | 0.39 | 29 | 3.1367 | 0.2314 | 23.0282 |
| 3.1104 | 0.4 | 30 | 3.1348 | 0.2315 | 22.9832 |
| 3.1301 | 0.41 | 31 | 3.1328 | 0.2316 | 22.9384 |
| 3.3376 | 0.43 | 32 | 3.1309 | 0.2315 | 22.8936 |
| 3.218 | 0.44 | 33 | 3.1309 | 0.2316 | 22.8936 |
| 3.0786 | 0.45 | 34 | 3.1289 | 0.2316 | 22.8490 |
| 3.0125 | 0.47 | 35 | 3.1270 | 0.2317 | 22.8044 |
| 3.2634 | 0.48 | 36 | 3.1270 | 0.2317 | 22.8044 |
| 2.9888 | 0.49 | 37 | 3.125 | 0.2318 | 22.7599 |
| 3.1624 | 0.51 | 38 | 3.1230 | 0.2318 | 22.7155 |
| 2.9807 | 0.52 | 39 | 3.1211 | 0.2319 | 22.6712 |
| 3.446 | 0.53 | 40 | 3.1211 | 0.2319 | 22.6712 |
| 3.1338 | 0.55 | 41 | 3.1191 | 0.2320 | 22.6269 |
| 3.1841 | 0.56 | 42 | 3.1191 | 0.2320 | 22.6269 |
| 3.1079 | 0.57 | 43 | 3.1172 | 0.2320 | 22.5828 |
| 3.0918 | 0.59 | 44 | 3.1152 | 0.2321 | 22.5387 |
| 3.0302 | 0.6 | 45 | 3.1152 | 0.2322 | 22.5387 |
| 3.1123 | 0.61 | 46 | 3.1133 | 0.2323 | 22.4947 |
| 2.9985 | 0.63 | 47 | 3.1113 | 0.2324 | 22.4508 |
| 3.3816 | 0.64 | 48 | 3.1113 | 0.2324 | 22.4508 |
| 3.0813 | 0.65 | 49 | 3.1094 | 0.2324 | 22.4070 |
| 3.2024 | 0.67 | 50 | 3.1094 | 0.2325 | 22.4070 |
| 3.0178 | 0.68 | 51 | 3.1074 | 0.2325 | 22.3633 |
| 3.1646 | 0.69 | 52 | 3.1074 | 0.2326 | 22.3633 |
| 3.0046 | 0.71 | 53 | 3.1055 | 0.2327 | 22.3197 |
| 3.0266 | 0.72 | 54 | 3.1055 | 0.2327 | 22.3197 |
| 3.3857 | 0.73 | 55 | 3.1035 | 0.2327 | 22.2761 |
| 3.064 | 0.75 | 56 | 3.1035 | 0.2328 | 22.2761 |
| 3.176 | 0.76 | 57 | 3.1016 | 0.2328 | 22.2327 |
| 3.1851 | 0.77 | 58 | 3.1016 | 0.2329 | 22.2327 |
| 3.0811 | 0.79 | 59 | 3.0996 | 0.2329 | 22.1893 |
| 3.0205 | 0.8 | 60 | 3.0996 | 0.2330 | 22.1893 |
| 3.26 | 0.81 | 61 | 3.0977 | 0.2330 | 22.1460 |
| 3.2922 | 0.83 | 62 | 3.0977 | 0.2331 | 22.1460 |
| 3.5349 | 0.84 | 63 | 3.0957 | 0.2331 | 22.1028 |
| 3.3525 | 0.85 | 64 | 3.0957 | 0.2331 | 22.1028 |
| 3.135 | 0.87 | 65 | 3.0938 | 0.2331 | 22.0596 |
| 3.1707 | 0.88 | 66 | 3.0938 | 0.2332 | 22.0596 |
| 3.0127 | 0.89 | 67 | 3.0918 | 0.2332 | 22.0166 |
| 3.0952 | 0.91 | 68 | 3.0918 | 0.2332 | 22.0166 |
| 3.1023 | 0.92 | 69 | 3.0898 | 0.2334 | 21.9736 |
| 3.3821 | 0.93 | 70 | 3.0898 | 0.2334 | 21.9736 |
| 3.1118 | 0.95 | 71 | 3.0879 | 0.2334 | 21.9308 |
| 3.1143 | 0.96 | 72 | 3.0879 | 0.2335 | 21.9308 |
| 3.1118 | 0.97 | 73 | 3.0879 | 0.2335 | 21.9308 |
| 3.0596 | 0.99 | 74 | 3.0859 | 0.2336 | 21.8880 |
| 3.1033 | 1.0 | 75 | 3.0859 | 0.2336 | 21.8880 |

Framework versions

  • Transformers 4.25.0.dev0
  • Pytorch 1.12.1+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1