
my_awesome_eli5_mlm_model

This model is a fine-tuned version of bert-base-uncased on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 2.0690
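
Since this is a BERT-style masked language model, it can be queried with the Transformers fill-mask pipeline. A minimal usage sketch, assuming the checkpoint is published under a placeholder repo id (replace it with the actual model location):

```python
from transformers import pipeline

# The repo id below is a placeholder for wherever this checkpoint is stored.
mask_filler = pipeline("fill-mask", model="username/my_awesome_eli5_mlm_model")

# BERT-style models expect the [MASK] token in the input.
predictions = mask_filler("The Milky Way is a [MASK] galaxy.")
for p in predictions:
    print(f"{p['token_str']}: {p['score']:.3f}")
```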

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
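
A minimal sketch of how these settings map onto Transformers `TrainingArguments`; the output directory is a placeholder and the per-epoch evaluation strategy is inferred from the results table below, not stated explicitly in the card:

```python
from transformers import TrainingArguments

# Configuration matching the hyperparameters listed above.
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer default optimizer.
training_args = TrainingArguments(
    output_dir="my_awesome_eli5_mlm_model",  # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=50,
    evaluation_strategy="epoch",  # assumed from the per-epoch validation losses
)
```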

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 1.657         | 1.0   | 1357  | 2.2376          |
| 2.0353        | 2.0   | 2714  | 2.1395          |
| 2.1131        | 3.0   | 4071  | 2.1349          |
| 2.0894        | 4.0   | 5428  | 2.1160          |
| 2.0451        | 5.0   | 6785  | 2.1023          |
| 2.0114        | 6.0   | 8142  | 2.0706          |
| 1.9932        | 7.0   | 9499  | 2.0818          |
| 1.9551        | 8.0   | 10856 | 2.0797          |
| 1.9218        | 9.0   | 12213 | 2.0679          |
| 1.9186        | 10.0  | 13570 | 2.0555          |
| 1.8722        | 11.0  | 14927 | 2.0491          |
| 1.8438        | 12.0  | 16284 | 2.0430          |
| 1.8256        | 13.0  | 17641 | 2.0785          |
| 1.816         | 14.0  | 18998 | 2.0475          |
| 1.766         | 15.0  | 20355 | 2.0607          |
| 1.7689        | 16.0  | 21712 | 2.0758          |
| 1.7354        | 17.0  | 23069 | 2.0443          |
| 1.7548        | 18.0  | 24426 | 2.0540          |
| 1.7188        | 19.0  | 25783 | 2.0538          |
| 1.6965        | 20.0  | 27140 | 2.0513          |
| 1.7066        | 21.0  | 28497 | 2.0490          |
| 1.6711        | 22.0  | 29854 | 2.0513          |
| 1.6549        | 23.0  | 31211 | 2.0515          |
| 1.6577        | 24.0  | 32568 | 2.0498          |
| 1.6214        | 25.0  | 33925 | 2.0438          |
| 1.6057        | 26.0  | 35282 | 2.0488          |
| 1.6001        | 27.0  | 36639 | 2.0541          |
| 1.6148        | 28.0  | 37996 | 2.0475          |
| 1.6062        | 29.0  | 39353 | 2.0325          |
| 1.5588        | 30.0  | 40710 | 2.0191          |
| 1.5607        | 31.0  | 42067 | 2.0388          |
| 1.5558        | 32.0  | 43424 | 2.0510          |
| 1.5453        | 33.0  | 44781 | 2.0344          |
| 1.5322        | 34.0  | 46138 | 2.0475          |
| 1.5437        | 35.0  | 47495 | 2.0348          |
| 1.5306        | 36.0  | 48852 | 2.0493          |
| 1.5184        | 37.0  | 50209 | 2.0489          |
| 1.5131        | 38.0  | 51566 | 2.0512          |
| 1.4835        | 39.0  | 52923 | 2.0650          |
| 1.4758        | 40.0  | 54280 | 2.0277          |
| 1.4841        | 41.0  | 55637 | 2.0662          |
| 1.4877        | 42.0  | 56994 | 2.0451          |
| 1.4691        | 43.0  | 58351 | 2.0401          |
| 1.4565        | 44.0  | 59708 | 2.0735          |
| 1.4654        | 45.0  | 61065 | 2.0493          |
| 1.4432        | 46.0  | 62422 | 2.0325          |
| 1.4763        | 47.0  | 63779 | 2.0493          |
| 1.4511        | 48.0  | 65136 | 2.0284          |
| 1.4633        | 49.0  | 66493 | 2.0282          |
| 1.4457        | 50.0  | 67850 | 2.0690          |
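
For a masked-language-modeling objective, the cross-entropy loss can be translated into perplexity by exponentiating it; a small sketch of that arithmetic for the final validation loss reported above:

```python
import math

# Perplexity is exp(cross-entropy loss); 2.0690 is the final validation loss.
eval_loss = 2.0690
print(f"Perplexity: {math.exp(eval_loss):.2f}")  # approximately 7.92
```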

Framework versions

  • Transformers 4.37.0
  • Pytorch 2.1.2
  • Datasets 2.1.0
  • Tokenizers 0.15.1
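
One way to reproduce this environment is to pin the versions listed above; a sketch (package names assume the standard PyPI distributions):

```python
import subprocess
import sys

# Install the framework versions listed in this card.
subprocess.check_call([
    sys.executable, "-m", "pip", "install",
    "transformers==4.37.0",
    "torch==2.1.2",
    "datasets==2.1.0",
    "tokenizers==0.15.1",
])
```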