---
tags:
- generated_from_trainer
model-index:
- name: calculator_model_test_third_version
  results: []
---

# calculator_model_test_third_version

This model is a fine-tuned version of an unspecified base model on an undocumented dataset. It achieves the following results on the evaluation set:

- Loss: 0.2789

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 0.001
- train_batch_size: 512
- eval_batch_size: 512
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
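The `linear` scheduler decays the learning rate from its initial value to zero over the course of training. A minimal sketch of that decay, assuming no warmup and using the 600 total optimizer steps visible in the results table (6 steps per epoch × 100 epochs):

```python
# Sketch of the linear LR schedule implied by the hyperparameters above.
# Assumption: no warmup; linear decay from the base rate to zero.
BASE_LR = 0.001        # learning_rate
STEPS_PER_EPOCH = 6    # step 6 completes epoch 1 in the results table
NUM_EPOCHS = 100
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 600

def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer steps under linear decay."""
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / TOTAL_STEPS

print(linear_lr(0), linear_lr(300), linear_lr(600))
```

Under these settings the rate starts at 0.001, passes 0.0005 at the halfway point (step 300), and reaches zero at step 600.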

### Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4163        | 1.0   | 6    | 2.7578          |
| 2.3764        | 2.0   | 12   | 1.9673          |
| 1.794         | 3.0   | 18   | 1.6871          |
| 1.682         | 4.0   | 24   | 1.6571          |
| 1.6127        | 5.0   | 30   | 1.5569          |
| 1.5398        | 6.0   | 36   | 1.5012          |
| 1.5125        | 7.0   | 42   | 1.5110          |
| 1.521         | 8.0   | 48   | 1.4432          |
| 1.4421        | 9.0   | 54   | 1.3939          |
| 1.3958        | 10.0  | 60   | 1.3717          |
| 1.3615        | 11.0  | 66   | 1.2885          |
| 1.2466        | 12.0  | 72   | 1.2391          |
| 1.2277        | 13.0  | 78   | 1.1908          |
| 1.1821        | 14.0  | 84   | 1.2211          |
| 1.1512        | 15.0  | 90   | 1.1313          |
| 1.1268        | 16.0  | 96   | 1.0571          |
| 1.0752        | 17.0  | 102  | 1.0546          |
| 1.0988        | 18.0  | 108  | 1.0526          |
| 1.0499        | 19.0  | 114  | 1.0063          |
| 0.9957        | 20.0  | 120  | 0.9662          |
| 1.0018        | 21.0  | 126  | 0.9062          |
| 0.9747        | 22.0  | 132  | 0.9273          |
| 0.9545        | 23.0  | 138  | 1.0973          |
| 1.0842        | 24.0  | 144  | 1.1854          |
| 1.0838        | 25.0  | 150  | 1.0031          |
| 0.9812        | 26.0  | 156  | 1.0129          |
| 1.0154        | 27.0  | 162  | 0.9693          |
| 0.9441        | 28.0  | 168  | 0.8257          |
| 0.863         | 29.0  | 174  | 0.8138          |
| 0.8591        | 30.0  | 180  | 0.8746          |
| 0.8913        | 31.0  | 186  | 0.8592          |
| 0.8719        | 32.0  | 192  | 0.7773          |
| 0.8187        | 33.0  | 198  | 0.7999          |
| 0.8013        | 34.0  | 204  | 0.7491          |
| 0.7976        | 35.0  | 210  | 0.7171          |
| 0.8033        | 36.0  | 216  | 0.7610          |
| 0.7785        | 37.0  | 222  | 0.8047          |
| 0.8141        | 38.0  | 228  | 0.7245          |
| 0.7726        | 39.0  | 234  | 0.6725          |
| 0.7832        | 40.0  | 240  | 0.8410          |
| 0.8625        | 41.0  | 246  | 0.7093          |
| 0.7307        | 42.0  | 252  | 0.6675          |
| 0.6832        | 43.0  | 258  | 0.6891          |
| 0.7126        | 44.0  | 264  | 0.7927          |
| 0.7808        | 45.0  | 270  | 0.7757          |
| 0.8031        | 46.0  | 276  | 0.6749          |
| 0.728         | 47.0  | 282  | 0.7299          |
| 0.741         | 48.0  | 288  | 0.6062          |
| 0.6772        | 49.0  | 294  | 0.6449          |
| 0.6505        | 50.0  | 300  | 0.6041          |
| 0.6315        | 51.0  | 306  | 0.5769          |
| 0.6339        | 52.0  | 312  | 0.6003          |
| 0.6388        | 53.0  | 318  | 0.6115          |
| 0.6436        | 54.0  | 324  | 0.5778          |
| 0.5993        | 55.0  | 330  | 0.5777          |
| 0.5936        | 56.0  | 336  | 0.5614          |
| 0.5873        | 57.0  | 342  | 0.5445          |
| 0.5819        | 58.0  | 348  | 0.5437          |
| 0.5523        | 59.0  | 354  | 0.4961          |
| 0.5527        | 60.0  | 360  | 0.4939          |
| 0.5622        | 61.0  | 366  | 0.4922          |
| 0.5603        | 62.0  | 372  | 0.5252          |
| 0.612         | 63.0  | 378  | 0.5024          |
| 0.6284        | 64.0  | 384  | 0.5152          |
| 0.573         | 65.0  | 390  | 0.5300          |
| 0.5407        | 66.0  | 396  | 0.4879          |
| 0.5266        | 67.0  | 402  | 0.4813          |
| 0.526         | 68.0  | 408  | 0.4341          |
| 0.5306        | 69.0  | 414  | 0.4817          |
| 0.5108        | 70.0  | 420  | 0.4127          |
| 0.5079        | 71.0  | 426  | 0.5083          |
| 0.5237        | 72.0  | 432  | 0.4423          |
| 0.5049        | 73.0  | 438  | 0.4948          |
| 0.491         | 74.0  | 444  | 0.4121          |
| 0.484         | 75.0  | 450  | 0.4047          |
| 0.4668        | 76.0  | 456  | 0.4041          |
| 0.4669        | 77.0  | 462  | 0.3987          |
| 0.4524        | 78.0  | 468  | 0.4115          |
| 0.4604        | 79.0  | 474  | 0.3926          |
| 0.4536        | 80.0  | 480  | 0.3970          |
| 0.4747        | 81.0  | 486  | 0.3674          |
| 0.4417        | 82.0  | 492  | 0.3905          |
| 0.458         | 83.0  | 498  | 0.4045          |
| 0.4393        | 84.0  | 504  | 0.3889          |
| 0.431         | 85.0  | 510  | 0.3427          |
| 0.4076        | 86.0  | 516  | 0.3621          |
| 0.4239        | 87.0  | 522  | 0.3368          |
| 0.4089        | 88.0  | 528  | 0.3353          |
| 0.3936        | 89.0  | 534  | 0.3253          |
| 0.3899        | 90.0  | 540  | 0.3173          |
| 0.3792        | 91.0  | 546  | 0.3065          |
| 0.3774        | 92.0  | 552  | 0.3060          |
| 0.3677        | 93.0  | 558  | 0.3063          |
| 0.3655        | 94.0  | 564  | 0.2909          |
| 0.3608        | 95.0  | 570  | 0.2944          |
| 0.3561        | 96.0  | 576  | 0.2842          |
| 0.3593        | 97.0  | 582  | 0.2906          |
| 0.357         | 98.0  | 588  | 0.2800          |
| 0.3558        | 99.0  | 594  | 0.2751          |
| 0.3456        | 100.0 | 600  | 0.2789          |
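Although the training data is undocumented, the step counts above bound the training-set size: with a batch size of 512 and 6 optimizer steps per epoch, the set holds at most 512 × 6 = 3072 examples. A quick sanity check, assuming standard batching where the last partial batch is kept:

```python
# Bound on the (undocumented) training-set size, inferred from the
# hyperparameters and results table above. Assumes each epoch runs
# ceil(num_examples / batch_size) optimizer steps.
TRAIN_BATCH_SIZE = 512  # train_batch_size
STEPS_PER_EPOCH = 6     # step 6 completes epoch 1 in the table

min_examples = TRAIN_BATCH_SIZE * (STEPS_PER_EPOCH - 1) + 1  # smallest set needing 6 steps
max_examples = TRAIN_BATCH_SIZE * STEPS_PER_EPOCH            # largest set fitting in 6 steps
print(min_examples, max_examples)
```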

## Framework versions

- Transformers 4.38.2
- Pytorch 2.1.0+cu121
- Datasets 2.18.0
- Tokenizers 0.15.2
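To reproduce this environment, the versions above can be pinned directly. This sketch assumes the standard PyPI package names; the `+cu121` PyTorch build comes from the PyTorch CUDA 12.1 wheel index rather than PyPI:

```shell
pip install transformers==4.38.2 datasets==2.18.0 tokenizers==0.15.2
pip install torch==2.1.0 --index-url https://download.pytorch.org/whl/cu121
```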