qa-indo-math-k-v2

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

Loss: 1.9328

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 2e-05
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 100
mixed_precision_training: Native AMP
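
As a rough illustration of how the settings above fit together, the sketch below writes them as a plain dictionary and computes the learning rate a linear schedule would give at a few steps. The dictionary keys follow `transformers.TrainingArguments` field names, which is an assumption about how the run was configured. The card does not report warmup steps, so zero warmup is assumed, and the total of 8000 steps is taken from the last row of the training results table.

```python
# Hyperparameters from the card, keyed by the TrainingArguments field
# names they most likely correspond to (an assumption, not confirmed).
HPARAMS = {
    "learning_rate": 2e-5,
    "per_device_train_batch_size": 16,
    "per_device_eval_batch_size": 16,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 100,
    "fp16": True,  # "Native AMP" mixed precision
}

TOTAL_STEPS = 8000  # final step in the training results table
WARMUP_STEPS = 0    # assumed; the card does not report warmup


def linear_lr(step, base_lr=HPARAMS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 2000, 4000, 8000):
        print(step, linear_lr(step))
```

Under these assumptions the learning rate starts at the configured 2e-05, is halved by the midpoint of training, and reaches zero at step 8000.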

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	80	0.7969
No log	2.0	160	0.7612
No log	3.0	240	0.7624
No log	4.0	320	0.7424
No log	5.0	400	0.7634
No log	6.0	480	0.7415
0.9241	7.0	560	0.7219
0.9241	8.0	640	0.7792
0.9241	9.0	720	0.7803
0.9241	10.0	800	0.7666
0.9241	11.0	880	0.7614
0.9241	12.0	960	0.7616
0.6373	13.0	1040	0.7673
0.6373	14.0	1120	0.7818
0.6373	15.0	1200	0.8030
0.6373	16.0	1280	0.8021
0.6373	17.0	1360	0.8025
0.6373	18.0	1440	0.8628
0.5614	19.0	1520	0.8616
0.5614	20.0	1600	0.8739
0.5614	21.0	1680	0.8647
0.5614	22.0	1760	0.9006
0.5614	23.0	1840	0.9560
0.5614	24.0	1920	0.9395
0.486	25.0	2000	0.9453
0.486	26.0	2080	0.9569
0.486	27.0	2160	1.0208
0.486	28.0	2240	0.9860
0.486	29.0	2320	0.9806
0.486	30.0	2400	1.0681
0.486	31.0	2480	1.1085
0.4126	32.0	2560	1.1028
0.4126	33.0	2640	1.1110
0.4126	34.0	2720	1.1573
0.4126	35.0	2800	1.1387
0.4126	36.0	2880	1.2067
0.4126	37.0	2960	1.2079
0.3559	38.0	3040	1.2152
0.3559	39.0	3120	1.2418
0.3559	40.0	3200	1.2023
0.3559	41.0	3280	1.2679
0.3559	42.0	3360	1.3178
0.3559	43.0	3440	1.3419
0.3084	44.0	3520	1.4702
0.3084	45.0	3600	1.3824
0.3084	46.0	3680	1.4227
0.3084	47.0	3760	1.3925
0.3084	48.0	3840	1.4940
0.3084	49.0	3920	1.4110
0.2686	50.0	4000	1.4534
0.2686	51.0	4080	1.4749
0.2686	52.0	4160	1.5351
0.2686	53.0	4240	1.5479
0.2686	54.0	4320	1.4755
0.2686	55.0	4400	1.5207
0.2686	56.0	4480	1.5075
0.2388	57.0	4560	1.5470
0.2388	58.0	4640	1.5361
0.2388	59.0	4720	1.5914
0.2388	60.0	4800	1.6430
0.2388	61.0	4880	1.6249
0.2388	62.0	4960	1.5503
0.2046	63.0	5040	1.6441
0.2046	64.0	5120	1.6789
0.2046	65.0	5200	1.6174
0.2046	66.0	5280	1.6175
0.2046	67.0	5360	1.6947
0.2046	68.0	5440	1.6299
0.1891	69.0	5520	1.7419
0.1891	70.0	5600	1.8442
0.1891	71.0	5680	1.8802
0.1891	72.0	5760	1.8233
0.1891	73.0	5840	1.8172
0.1891	74.0	5920	1.8181
0.1664	75.0	6000	1.8399
0.1664	76.0	6080	1.8128
0.1664	77.0	6160	1.8423
0.1664	78.0	6240	1.8380
0.1664	79.0	6320	1.8941
0.1664	80.0	6400	1.8636
0.1664	81.0	6480	1.7949
0.1614	82.0	6560	1.8342
0.1614	83.0	6640	1.8123
0.1614	84.0	6720	1.8639
0.1614	85.0	6800	1.8580
0.1614	86.0	6880	1.8816
0.1614	87.0	6960	1.8579
0.1487	88.0	7040	1.8783
0.1487	89.0	7120	1.9175
0.1487	90.0	7200	1.9025
0.1487	91.0	7280	1.9207
0.1487	92.0	7360	1.9195
0.1487	93.0	7440	1.9142
0.1355	94.0	7520	1.9333
0.1355	95.0	7600	1.9238
0.1355	96.0	7680	1.9256
0.1355	97.0	7760	1.9305
0.1355	98.0	7840	1.9294
0.1355	99.0	7920	1.9301
0.1297	100.0	8000	1.9328
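
The table shows validation loss reaching its minimum early and then rising while training loss keeps falling, which is the usual signature of overfitting. A minimal sketch of picking the best epoch by validation loss, using a handful of rows copied from the table above (the `rows` list and `best_row` helper are illustrative names, not part of the original training code):

```python
# (epoch, step, validation_loss) rows copied from the table above.
rows = [
    (1, 80, 0.7969),
    (6, 480, 0.7415),
    (7, 560, 0.7219),
    (8, 640, 0.7792),
    (25, 2000, 0.9453),
    (50, 4000, 1.4534),
    (75, 6000, 1.8399),
    (100, 8000, 1.9328),
]


def best_row(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r[2])


epoch, step, loss = best_row(rows)
final_loss = rows[-1][2]
print(f"lowest validation loss {loss} at epoch {epoch} (step {step})")
print(f"final validation loss is {final_loss / loss:.2f}x the minimum")
```

In the full table the minimum is also at epoch 7 (step 560, validation loss 0.7219), while the reported evaluation loss of 1.9328 matches the final epoch. With the Transformers `Trainer`, options such as `load_best_model_at_end` or an early-stopping callback address this; the card does not say whether either was used.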

Framework versions

Transformers 4.6.1
PyTorch 1.7.0
Datasets 1.11.0
Tokenizers 0.10.3

fadhilarkan/qa-indo-math-k-v2