calculator_model_test_second_version

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1239

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training; a sketch of how they map onto code follows the list:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
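
As a rough guide, these settings correspond to a transformers.TrainingArguments configuration like the sketch below. This is a minimal reconstruction, not the card author's actual training script: output_dir and evaluation_strategy are assumptions, and the Adam betas/epsilon listed above are the library defaults.

```python
from transformers import TrainingArguments

# A minimal sketch, assuming the standard Trainer API was used.
training_args = TrainingArguments(
    output_dir="calculator_model_test_second_version",  # hypothetical output path
    learning_rate=0.001,
    per_device_train_batch_size=512,
    per_device_eval_batch_size=512,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    evaluation_strategy="epoch",  # assumption: the results table logs one eval per epoch
)
```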

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4193 | 1.0 | 6 | 2.7303 |
| 2.3495 | 2.0 | 12 | 1.9121 |
| 1.7677 | 3.0 | 18 | 1.6227 |
| 1.6093 | 4.0 | 24 | 1.5830 |
| 1.5528 | 5.0 | 30 | 1.5309 |
| 1.5114 | 6.0 | 36 | 1.4526 |
| 1.4513 | 7.0 | 42 | 1.3988 |
| 1.4022 | 8.0 | 48 | 1.3442 |
| 1.3473 | 9.0 | 54 | 1.2809 |
| 1.2985 | 10.0 | 60 | 1.2159 |
| 1.2173 | 11.0 | 66 | 1.1870 |
| 1.1373 | 12.0 | 72 | 1.0899 |
| 1.0855 | 13.0 | 78 | 1.0270 |
| 1.04 | 14.0 | 84 | 0.9607 |
| 1.0274 | 15.0 | 90 | 0.9749 |
| 0.9975 | 16.0 | 96 | 0.9045 |
| 0.9274 | 17.0 | 102 | 0.9247 |
| 0.8963 | 18.0 | 108 | 0.8161 |
| 0.8767 | 19.0 | 114 | 0.8131 |
| 0.8764 | 20.0 | 120 | 0.9056 |
| 0.8763 | 21.0 | 126 | 0.7668 |
| 0.8097 | 22.0 | 132 | 0.8305 |
| 0.8 | 23.0 | 138 | 0.7579 |
| 0.7483 | 24.0 | 144 | 0.7418 |
| 0.8242 | 25.0 | 150 | 0.7103 |
| 0.7375 | 26.0 | 156 | 0.6743 |
| 0.7078 | 27.0 | 162 | 0.6516 |
| 0.7112 | 28.0 | 168 | 0.7178 |
| 0.7518 | 29.0 | 174 | 0.7132 |
| 0.6874 | 30.0 | 180 | 0.6438 |
| 0.6671 | 31.0 | 186 | 0.6512 |
| 0.6595 | 32.0 | 192 | 0.6338 |
| 0.6375 | 33.0 | 198 | 0.5772 |
| 0.5933 | 34.0 | 204 | 0.5397 |
| 0.5938 | 35.0 | 210 | 0.5182 |
| 0.5818 | 36.0 | 216 | 0.5315 |
| 0.6946 | 37.0 | 222 | 0.9134 |
| 0.7946 | 38.0 | 228 | 0.7031 |
| 0.7079 | 39.0 | 234 | 0.6212 |
| 0.6055 | 40.0 | 240 | 0.5024 |
| 0.5524 | 41.0 | 246 | 0.5142 |
| 0.543 | 42.0 | 252 | 0.4946 |
| 0.5265 | 43.0 | 258 | 0.4820 |
| 0.5339 | 44.0 | 264 | 0.6029 |
| 0.5624 | 45.0 | 270 | 0.5800 |
| 0.5097 | 46.0 | 276 | 0.4858 |
| 0.5059 | 47.0 | 282 | 0.4554 |
| 0.4807 | 48.0 | 288 | 0.4538 |
| 0.4824 | 49.0 | 294 | 0.4248 |
| 0.4691 | 50.0 | 300 | 0.3919 |
| 0.5413 | 51.0 | 306 | 0.5179 |
| 0.5131 | 52.0 | 312 | 0.3809 |
| 0.4312 | 53.0 | 318 | 0.3955 |
| 0.4226 | 54.0 | 324 | 0.3597 |
| 0.4059 | 55.0 | 330 | 0.3501 |
| 0.3887 | 56.0 | 336 | 0.3281 |
| 0.3784 | 57.0 | 342 | 0.3294 |
| 0.3696 | 58.0 | 348 | 0.2937 |
| 0.3694 | 59.0 | 354 | 0.3153 |
| 0.3815 | 60.0 | 360 | 0.2878 |
| 0.3575 | 61.0 | 366 | 0.3236 |
| 0.3527 | 62.0 | 372 | 0.2940 |
| 0.3481 | 63.0 | 378 | 0.2703 |
| 0.3466 | 64.0 | 384 | 0.3331 |
| 0.4037 | 65.0 | 390 | 0.3615 |
| 0.363 | 66.0 | 396 | 0.3057 |
| 0.3374 | 67.0 | 402 | 0.2810 |
| 0.3256 | 68.0 | 408 | 0.2785 |
| 0.3206 | 69.0 | 414 | 0.2553 |
| 0.306 | 70.0 | 420 | 0.2336 |
| 0.2884 | 71.0 | 426 | 0.2361 |
| 0.2892 | 72.0 | 432 | 0.2257 |
| 0.275 | 73.0 | 438 | 0.2237 |
| 0.2968 | 74.0 | 444 | 0.2405 |
| 0.2879 | 75.0 | 450 | 0.2139 |
| 0.2832 | 76.0 | 456 | 0.2139 |
| 0.2726 | 77.0 | 462 | 0.2174 |
| 0.2687 | 78.0 | 468 | 0.2037 |
| 0.2609 | 79.0 | 474 | 0.1833 |
| 0.2518 | 80.0 | 480 | 0.1836 |
| 0.253 | 81.0 | 486 | 0.1861 |
| 0.2417 | 82.0 | 492 | 0.1650 |
| 0.2279 | 83.0 | 498 | 0.1706 |
| 0.2323 | 84.0 | 504 | 0.1785 |
| 0.225 | 85.0 | 510 | 0.1694 |
| 0.2194 | 86.0 | 516 | 0.1586 |
| 0.2217 | 87.0 | 522 | 0.1575 |
| 0.2093 | 88.0 | 528 | 0.1497 |
| 0.2109 | 89.0 | 534 | 0.1562 |
| 0.2081 | 90.0 | 540 | 0.1549 |
| 0.2027 | 91.0 | 546 | 0.1419 |
| 0.1982 | 92.0 | 552 | 0.1347 |
| 0.1951 | 93.0 | 558 | 0.1355 |
| 0.1893 | 94.0 | 564 | 0.1338 |
| 0.1881 | 95.0 | 570 | 0.1336 |
| 0.1911 | 96.0 | 576 | 0.1303 |
| 0.1862 | 97.0 | 582 | 0.1289 |
| 0.1882 | 98.0 | 588 | 0.1301 |
| 0.1792 | 99.0 | 594 | 0.1250 |
| 0.176 | 100.0 | 600 | 0.1239 |
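
To see the trend at a glance, here is a minimal plotting sketch using five representative rows copied from the table above. Matplotlib is an assumption here; it is not among the framework versions listed below.

```python
import matplotlib.pyplot as plt

# Five representative (epoch, training loss, validation loss) rows
# from the table above; extend with all 100 rows if desired.
epochs = [1, 25, 50, 75, 100]
train_loss = [3.4193, 0.8242, 0.4691, 0.2879, 0.176]
val_loss = [2.7303, 0.7103, 0.3919, 0.2139, 0.1239]

plt.plot(epochs, train_loss, label="training loss")
plt.plot(epochs, val_loss, label="validation loss")
plt.xlabel("epoch")
plt.ylabel("loss")
plt.legend()
plt.title("calculator_model_test_second_version loss curves")
plt.show()
```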

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
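
To reproduce the results under the same stack, a quick check of the local install against the versions above (a convenience sketch, not part of the original card):

```python
import datasets
import tokenizers
import torch
import transformers

# Versions reported by this card, compared against the local install.
expected = {
    "Transformers": (transformers.__version__, "4.38.2"),
    "Pytorch": (torch.__version__, "2.1.0+cu121"),
    "Datasets": (datasets.__version__, "2.18.0"),
    "Tokenizers": (tokenizers.__version__, "0.15.2"),
}
for name, (found, wanted) in expected.items():
    status = "OK" if found == wanted else f"differs (card used {wanted})"
    print(f"{name} {found}: {status}")
```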

Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32