calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

Loss: 0.0317

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 200
mixed_precision_training: Native AMP
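
As a rough sketch of how the values above might map onto Hugging Face `TrainingArguments` keyword arguments (parameter names as in Transformers 4.38.x). The `output_dir` value, the single-device reading of the batch sizes, the per-epoch evaluation and logging strategies, and the reading of "Native AMP" as `fp16=True` are assumptions inferred from this card, not settings confirmed by the author.

```python
# Hyperparameters copied from the list above, keyed by the matching
# transformers.TrainingArguments parameter names (Transformers 4.38.x).
training_kwargs = {
    "output_dir": "calculator_model_test",  # assumption: hypothetical local path
    "learning_rate": 1e-3,
    "per_device_train_batch_size": 512,     # assumption: single device
    "per_device_eval_batch_size": 512,      # assumption: single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 200,
    "fp16": True,                           # assumption: "Native AMP" means fp16
    "evaluation_strategy": "epoch",         # assumption: the table reports one eval per epoch
    "logging_strategy": "epoch",            # assumption: one training-loss entry per epoch
}


def build_training_arguments(kwargs=training_kwargs):
    """Create a TrainingArguments object from the dict above.

    The import is deferred so the dict can be inspected without
    transformers installed. fp16=True needs a supported GPU.
    """
    from transformers import TrainingArguments

    return TrainingArguments(**kwargs)
```

Calling `build_training_arguments()` on a CUDA machine should yield an object that can be passed to `Trainer`; the model, tokenizer, and dataset are not described in this card and would have to be supplied separately.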

Training results

Training Loss	Epoch	Step	Validation Loss
3.6147	1.0	6	3.0415
2.585	2.0	12	2.0622
1.899	3.0	18	1.6836
1.6095	4.0	24	1.6248
1.5731	5.0	30	1.5728
1.5407	6.0	36	1.5594
1.5353	7.0	42	1.5163
1.4873	8.0	48	1.4470
1.4322	9.0	54	1.4274
1.3743	10.0	60	1.3462
1.3009	11.0	66	1.2124
1.1918	12.0	72	1.1226
1.1449	13.0	78	1.1215
1.0914	14.0	84	1.0471
1.0285	15.0	90	0.9795
1.0093	16.0	96	1.0062
0.9957	17.0	102	0.9296
0.944	18.0	108	0.9268
0.9194	19.0	114	0.9417
0.9014	20.0	120	0.8415
0.8789	21.0	126	0.7764
0.8155	22.0	132	0.7639
0.8593	23.0	138	0.8127
0.836	24.0	144	0.7397
0.7613	25.0	150	0.7067
0.7818	26.0	156	0.7217
0.7702	27.0	162	0.7128
0.7376	28.0	168	0.7242
0.8006	29.0	174	0.7117
0.7561	30.0	180	0.6767
0.7185	31.0	186	0.6828
0.7055	32.0	192	0.6215
0.6967	33.0	198	0.6766
0.7193	34.0	204	0.6238
0.6791	35.0	210	0.5900
0.6741	36.0	216	0.6307
0.663	37.0	222	0.6012
0.6326	38.0	228	0.5944
0.6041	39.0	234	0.5459
0.617	40.0	240	0.5786
0.6369	41.0	246	0.5896
0.6243	42.0	252	0.5446
0.5921	43.0	258	0.4864
0.5529	44.0	264	0.5561
0.5757	45.0	270	0.5783
0.5919	46.0	276	0.5235
0.5509	47.0	282	0.4525
0.5229	48.0	288	0.5007
0.5871	49.0	294	0.5009
0.5793	50.0	300	0.5431
0.5922	51.0	306	0.5404
0.5539	52.0	312	0.5386
0.5785	53.0	318	0.4697
0.5528	54.0	324	0.5061
0.5047	55.0	330	0.4249
0.475	56.0	336	0.4206
0.5236	57.0	342	0.5689
0.576	58.0	348	0.4258
0.4862	59.0	354	0.4070
0.4946	60.0	360	0.4136
0.4527	61.0	366	0.3848
0.4522	62.0	372	0.4288
0.5087	63.0	378	0.5660
0.5559	64.0	384	0.5371
0.5153	65.0	390	0.4595
0.4503	66.0	396	0.3648
0.4191	67.0	402	0.3787
0.4522	68.0	408	0.3469
0.4096	69.0	414	0.3622
0.4502	70.0	420	0.3613
0.4138	71.0	426	0.3700
0.3896	72.0	432	0.3920
0.4271	73.0	438	0.3354
0.4107	74.0	444	0.3193
0.391	75.0	450	0.3352
0.373	76.0	456	0.3818
0.4296	77.0	462	0.3238
0.3812	78.0	468	0.3337
0.3756	79.0	474	0.3105
0.3579	80.0	480	0.3433
0.4325	81.0	486	0.3103
0.356	82.0	492	0.3060
0.3467	83.0	498	0.3780
0.3922	84.0	504	0.2863
0.3457	85.0	510	0.2865
0.3755	86.0	516	0.3041
0.3319	87.0	522	0.2777
0.3359	88.0	528	0.3803
0.4192	89.0	534	0.3473
0.3941	90.0	540	0.3745
0.3991	91.0	546	0.3331
0.3489	92.0	552	0.3579
0.3352	93.0	558	0.2947
0.3202	94.0	564	0.2416
0.3339	95.0	570	0.3635
0.4108	96.0	576	0.2779
0.3827	97.0	582	0.2846
0.3559	98.0	588	0.2754
0.2985	99.0	594	0.2107
0.264	100.0	600	0.1958
0.2807	101.0	606	0.2028
0.2861	102.0	612	0.2034
0.2661	103.0	618	0.1979
0.264	104.0	624	0.2134
0.2747	105.0	630	0.1754
0.2785	106.0	636	0.2329
0.2656	107.0	642	0.1934
0.2505	108.0	648	0.2213
0.2572	109.0	654	0.2313
0.2929	110.0	660	0.2308
0.2419	111.0	666	0.1780
0.239	112.0	672	0.1694
0.2279	113.0	678	0.1580
0.2528	114.0	684	0.3002
0.3297	115.0	690	0.2676
0.3147	116.0	696	0.3287
0.2826	117.0	702	0.1475
0.2033	118.0	708	0.1359
0.1938	119.0	714	0.1592
0.2105	120.0	720	0.1696
0.2196	121.0	726	0.1532
0.2102	122.0	732	0.1157
0.2014	123.0	738	0.1835
0.2505	124.0	744	0.1851
0.2411	125.0	750	0.2881
0.2353	126.0	756	0.1911
0.2268	127.0	762	0.1874
0.2024	128.0	768	0.1613
0.2046	129.0	774	0.1938
0.199	130.0	780	0.1129
0.1703	131.0	786	0.1511
0.1924	132.0	792	0.1744
0.1854	133.0	798	0.1238
0.1632	134.0	804	0.1050
0.1589	135.0	810	0.1316
0.1787	136.0	816	0.0895
0.1658	137.0	822	0.0836
0.141	138.0	828	0.1087
0.1671	139.0	834	0.1068
0.1557	140.0	840	0.0800
0.1488	141.0	846	0.1277
0.1709	142.0	852	0.1126
0.1499	143.0	858	0.0913
0.1597	144.0	864	0.0829
0.1314	145.0	870	0.0762
0.1501	146.0	876	0.0897
0.156	147.0	882	0.0902
0.1482	148.0	888	0.0903
0.1401	149.0	894	0.0749
0.1322	150.0	900	0.0781
0.1309	151.0	906	0.0719
0.1326	152.0	912	0.0691
0.1311	153.0	918	0.0701
0.1202	154.0	924	0.0742
0.1258	155.0	930	0.0728
0.1183	156.0	936	0.0566
0.1181	157.0	942	0.0541
0.1137	158.0	948	0.0662
0.1061	159.0	954	0.0662
0.1121	160.0	960	0.0628
0.1038	161.0	966	0.0609
0.1135	162.0	972	0.0728
0.1317	163.0	978	0.0785
0.1149	164.0	984	0.0753
0.1111	165.0	990	0.0647
0.0926	166.0	996	0.0592
0.0931	167.0	1002	0.0554
0.0865	168.0	1008	0.0480
0.0881	169.0	1014	0.0498
0.0932	170.0	1020	0.0524
0.0934	171.0	1026	0.0629
0.1054	172.0	1032	0.0561
0.0933	173.0	1038	0.0422
0.0812	174.0	1044	0.0605
0.0953	175.0	1050	0.0485
0.0963	176.0	1056	0.0394
0.0731	177.0	1062	0.0378
0.0758	178.0	1068	0.0394
0.0703	179.0	1074	0.0406
0.0756	180.0	1080	0.0427
0.0812	181.0	1086	0.0538
0.0842	182.0	1092	0.0434
0.0773	183.0	1098	0.0439
0.073	184.0	1104	0.0379
0.0707	185.0	1110	0.0422
0.0749	186.0	1116	0.0420
0.0746	187.0	1122	0.0388
0.068	188.0	1128	0.0386
0.0654	189.0	1134	0.0378
0.0647	190.0	1140	0.0335
0.0629	191.0	1146	0.0402
0.0642	192.0	1152	0.0344
0.063	193.0	1158	0.0374
0.0631	194.0	1164	0.0321
0.0605	195.0	1170	0.0356
0.065	196.0	1176	0.0334
0.0591	197.0	1182	0.0321
0.0558	198.0	1188	0.0317
0.0596	199.0	1194	0.0316
0.06	200.0	1200	0.0317
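
The table implies a few quantities that the card does not state directly. The sketch below derives them under explicit assumptions: a single device, no gradient accumulation, no dropped last batch, and zero warmup steps (none is listed among the hyperparameters). The helper `linear_lr` is a hypothetical name; it mirrors the usual linear-decay rule rather than quoting the training script.

```python
import math

TOTAL_STEPS = 1200   # last value in the Step column
NUM_EPOCHS = 200     # num_epochs from the hyperparameters
BATCH_SIZE = 512     # train_batch_size from the hyperparameters
PEAK_LR = 1e-3       # learning_rate from the hyperparameters

# Optimizer steps per epoch, as read from the table (6, 12, 18, ...).
steps_per_epoch = TOTAL_STEPS // NUM_EPOCHS

# Assumption: one device, no gradient accumulation, last partial batch kept.
# Then ceil(n_train / BATCH_SIZE) == steps_per_epoch bounds the training set size.
min_examples = (steps_per_epoch - 1) * BATCH_SIZE + 1
max_examples = steps_per_epoch * BATCH_SIZE


def linear_lr(step, peak_lr=PEAK_LR, total_steps=TOTAL_STEPS, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumption: warmup_steps defaults to 0 because the card lists no warmup.
    """
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("steps per epoch:", steps_per_epoch)
    print("training examples between", min_examples, "and", max_examples)
    for epoch in (1, 50, 100, 150, 200):
        print(epoch, linear_lr(epoch * steps_per_epoch))
```

Under these assumptions the learning rate has already fallen to half its peak by epoch 100 and reaches zero at step 1200, which is consistent with the validation loss flattening out around 0.032 over the last few epochs.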

Framework versions

Transformers 4.38.1
PyTorch 2.1.0+cu121
Datasets 2.18.0
Tokenizers 0.15.2
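
To check whether a local environment matches the versions listed above, something along these lines could be used. It relies only on the standard library; the PyPI distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual ones, and `compare_versions` is a hypothetical helper name.

```python
from importlib import metadata

# Versions listed in this card, keyed by PyPI distribution name.
PINNED = {
    "transformers": "4.38.1",
    "torch": "2.1.0+cu121",
    "datasets": "2.18.0",
    "tokenizers": "0.15.2",
}


def compare_versions(pinned=PINNED):
    """Return {package: (pinned_version, installed_version_or_None)}."""
    report = {}
    for name, wanted in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (wanted, installed)
    return report


if __name__ == "__main__":
    for name, (wanted, installed) in compare_versions().items():
        status = "ok" if installed == wanted else "differs"
        print(f"{name}: card={wanted} installed={installed} ({status})")
```

A mismatch does not necessarily prevent the model from loading; it only signals that results may differ from those reported here.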

Kielak2/calculator_model_test
