distilbert-finetuned-lr1e-07-epochs50

This model is a fine-tuned version of distilbert-base-cased-distilled-squad on an unspecified dataset (the auto-generated card records the dataset as None). It achieves the following results on the evaluation set:

Loss: 5.0791

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

learning_rate: 1e-07
train_batch_size: 16
eval_batch_size: 16
seed: 42
optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
lr_scheduler_type: linear
num_epochs: 50
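
To make the linear scheduler concrete, the sketch below computes the learning rate at a given optimizer step. It assumes zero warmup steps (the card lists none) and 500 total steps (taken from the last row of the training results table). The function name `linear_lr` and its parameters are hypothetical, chosen for illustration; the sketch mirrors the general shape of a linear decay schedule and is not the exact code used for this run.

```python
def linear_lr(step, base_lr=1e-07, total_steps=500, warmup_steps=0):
    """Learning rate under linear warmup followed by linear decay to zero.

    Assumptions (not stated in the model card): warmup_steps=0 and
    total_steps=500, the final step reported in the results table.
    """
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * (step / max(1, warmup_steps))
    # Linear decay from base_lr at the end of warmup down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    for step in (0, 100, 250, 500):
        print(step, linear_lr(step))
```

Under these assumptions the rate starts at 1e-07, is halved by step 250, and reaches zero at step 500, so later epochs apply progressively smaller updates.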

Training results

Training Loss	Epoch	Step	Validation Loss
No log	1.0	10	6.3771
No log	2.0	20	6.2726
No log	3.0	30	6.1763
No log	4.0	40	6.0874
No log	5.0	50	6.0031
No log	6.0	60	5.9324
No log	7.0	70	5.8631
No log	8.0	80	5.7979
No log	9.0	90	5.7419
No log	10.0	100	5.6885
No log	11.0	110	5.6398
No log	12.0	120	5.5935
No log	13.0	130	5.5529
No log	14.0	140	5.5161
No log	15.0	150	5.4811
No log	16.0	160	5.4510
No log	17.0	170	5.4228
No log	18.0	180	5.3964
No log	19.0	190	5.3720
No log	20.0	200	5.3489
No log	21.0	210	5.3276
No log	22.0	220	5.3074
No log	23.0	230	5.2877
No log	24.0	240	5.2698
No log	25.0	250	5.2526
No log	26.0	260	5.2368
No log	27.0	270	5.2228
No log	28.0	280	5.2092
No log	29.0	290	5.1971
No log	30.0	300	5.1854
No log	31.0	310	5.1746
No log	32.0	320	5.1642
No log	33.0	330	5.1541
No log	34.0	340	5.1452
No log	35.0	350	5.1367
No log	36.0	360	5.1286
No log	37.0	370	5.1218
No log	38.0	380	5.1156
No log	39.0	390	5.1093
No log	40.0	400	5.1039
No log	41.0	410	5.0988
No log	42.0	420	5.0947
No log	43.0	430	5.0909
No log	44.0	440	5.0877
No log	45.0	450	5.0850
No log	46.0	460	5.0829
No log	47.0	470	5.0811
No log	48.0	480	5.0799
No log	49.0	490	5.0793
5.1249	50.0	500	5.0791
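
The table also allows a rough estimate of the training set size, which the card does not state. The sketch below assumes a single device, no gradient accumulation, and that the final partial batch of each epoch is kept; none of these are confirmed by the card. The helper name `dataset_size_bounds` is hypothetical.

```python
def dataset_size_bounds(steps_per_epoch, batch_size):
    """Range of dataset sizes N for which ceil(N / batch_size) == steps_per_epoch.

    Assumes one device, no gradient accumulation, and that the last
    partial batch is kept (all assumptions, not stated in the card).
    """
    smallest = (steps_per_epoch - 1) * batch_size + 1
    largest = steps_per_epoch * batch_size
    return smallest, largest


if __name__ == "__main__":
    # The table reports 10 optimizer steps per epoch; train_batch_size is 16.
    low, high = dataset_size_bounds(steps_per_epoch=10, batch_size=16)
    print(f"Estimated training examples: {low} to {high}")

    # Total change in validation loss between epoch 1 and epoch 50.
    print(f"Validation loss decrease: {6.3771 - 5.0791:.4f}")
```

Under those assumptions the training set holds between 145 and 160 examples. Validation loss falls by about 1.298 over 50 epochs and is still decreasing, by 0.0002, in the final epoch.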

Framework versions

Transformers 4.28.1
Pytorch 2.0.0+cu118
Datasets 2.12.0
Tokenizers 0.13.3

Model repository: gallyamovi/distilbert-finetuned-lr1e-07-epochs50
