
calculator_model_test_third_version

This model is a fine-tuned version of an unspecified base model on the None dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1341

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
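With a linear scheduler and no warmup steps listed, the learning rate would decay from 0.001 at step 0 toward 0 at the final step. A minimal sketch of that decay, assuming the 600 total steps shown in the results table and no warmup (an assumption, since warmup is not stated above):

```python
def linear_lr(step, total_steps=600, base_lr=0.001):
    """Linear decay: base_lr at step 0, reaching 0 at total_steps.

    total_steps=600 is taken from the step column of the results table;
    no warmup phase is modeled, since none is listed in the hyperparameters.
    """
    return base_lr * max(0.0, 1.0 - step / total_steps)

print(linear_lr(0))    # 0.001 at the start of training
print(linear_lr(300))  # 0.0005 halfway through
print(linear_lr(600))  # 0.0 at the end
```

This matches what `lr_scheduler_type: linear` does in the Transformers Trainer when `warmup_steps` is 0.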

Training results

Training Loss Epoch Step Validation Loss
3.4031 1.0 6 2.7477
2.3423 2.0 12 2.0656
1.7839 3.0 18 1.6402
1.6119 4.0 24 1.5371
1.544 5.0 30 1.4939
1.4631 6.0 36 1.4346
1.4052 7.0 42 1.3510
1.3137 8.0 48 1.2456
1.2297 9.0 54 1.2067
1.2454 10.0 60 1.2082
1.121 11.0 66 1.0614
1.0353 12.0 72 0.9978
1.0028 13.0 78 1.0926
0.993 14.0 84 0.9551
0.9267 15.0 90 0.9013
0.8728 16.0 96 0.9764
0.9072 17.0 102 0.8508
0.8457 18.0 108 0.8541
0.8278 19.0 114 0.7950
0.7903 20.0 120 0.7892
0.7726 21.0 126 0.7708
0.7789 22.0 132 0.7830
0.7515 23.0 138 0.8062
0.7691 24.0 144 0.7276
0.7203 25.0 150 0.7205
0.7119 26.0 156 0.7131
0.6776 27.0 162 0.6892
0.6926 28.0 168 0.7582
0.7128 29.0 174 0.9174
0.8055 30.0 180 0.7222
0.7423 31.0 186 0.6740
0.6712 32.0 192 0.7917
0.6965 33.0 198 0.6726
0.652 34.0 204 0.7449
0.6963 35.0 210 0.6932
0.6652 36.0 216 0.6286
0.6164 37.0 222 0.5777
0.5848 38.0 228 0.5556
0.5657 39.0 234 0.5788
0.5631 40.0 240 0.5216
0.5315 41.0 246 0.5156
0.5277 42.0 252 0.5486
0.5498 43.0 258 0.4877
0.4836 44.0 264 0.5947
0.555 45.0 270 0.4725
0.4804 46.0 276 0.4367
0.4537 47.0 282 0.4729
0.4668 48.0 288 0.3988
0.4507 49.0 294 0.4808
0.5128 50.0 300 0.4311
0.4444 51.0 306 0.4709
0.4538 52.0 312 0.3786
0.4213 53.0 318 0.3962
0.4067 54.0 324 0.3765
0.3931 55.0 330 0.4016
0.3946 56.0 336 0.3674
0.4095 57.0 342 0.3445
0.3817 58.0 348 0.3252
0.3528 59.0 354 0.3171
0.3527 60.0 360 0.3465
0.3562 61.0 366 0.3992
0.4265 62.0 372 0.3743
0.3734 63.0 378 0.3598
0.3585 64.0 384 0.3008
0.3438 65.0 390 0.2719
0.3289 66.0 396 0.2876
0.3128 67.0 402 0.2764
0.3106 68.0 408 0.2986
0.3058 69.0 414 0.2567
0.286 70.0 420 0.2762
0.2857 71.0 426 0.2732
0.2921 72.0 432 0.2728
0.3118 73.0 438 0.2352
0.2701 74.0 444 0.2204
0.2622 75.0 450 0.2114
0.2449 76.0 456 0.2262
0.2542 77.0 462 0.2446
0.259 78.0 468 0.2187
0.2852 79.0 474 0.2329
0.2587 80.0 480 0.2101
0.2491 81.0 486 0.2165
0.2291 82.0 492 0.1921
0.2286 83.0 498 0.1815
0.2095 84.0 504 0.1700
0.2256 85.0 510 0.1640
0.2088 86.0 516 0.1848
0.2087 87.0 522 0.1745
0.2025 88.0 528 0.1655
0.2003 89.0 534 0.1717
0.2007 90.0 540 0.1682
0.1862 91.0 546 0.1629
0.2005 92.0 552 0.1482
0.2003 93.0 558 0.1600
0.1876 94.0 564 0.1498
0.1929 95.0 570 0.1405
0.1772 96.0 576 0.1404
0.1797 97.0 582 0.1366
0.1734 98.0 588 0.1352
0.1686 99.0 594 0.1345
0.177 100.0 600 0.1341
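The step column above advances by 6 per epoch, so with a train batch size of 512 the training set is roughly 6 × 512 ≈ 3,072 examples (an estimate only, since the final batch of each epoch may be partial). A quick check of that arithmetic:

```python
total_steps = 600        # final step count from the table above
num_epochs = 100         # from the hyperparameters
train_batch_size = 512   # from the hyperparameters

steps_per_epoch = total_steps // num_epochs
approx_train_examples = steps_per_epoch * train_batch_size

print(steps_per_epoch)        # 6
print(approx_train_examples)  # 3072 (upper bound if the last batch is partial)
```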

Framework versions

  • Transformers 4.38.2
  • Pytorch 2.1.0+cu121
  • Datasets 2.18.0
  • Tokenizers 0.15.2
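To reproduce this environment, the versions above can be pinned directly. A sketch (the `+cu121` PyTorch build comes from the PyTorch CUDA wheel index rather than PyPI, so plain `torch==2.1.0` is used here as an approximation):

```shell
pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2 torch==2.1.0
```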
Model size: 7.8M params (F32, Safetensors)