QA_BERT_50_epoch

This model was trained from scratch on an unspecified dataset. It achieves the following result on the evaluation set:

  • Loss: 4.9109

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 160
  • eval_batch_size: 80
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
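The hyperparameters above can be gathered into a single mapping for reference; a minimal sketch in plain Python (key names loosely follow the conventions of transformers' `TrainingArguments`, but this dict is illustrative only, not the original training script):

```python
# Hyperparameters from the list above, collected into one mapping.
# Key names loosely follow transformers' TrainingArguments; the dict
# itself is only an illustration, not the original training script.
hyperparameters = {
    "learning_rate": 2e-5,
    "train_batch_size": 160,
    "eval_batch_size": 80,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_epochs": 50,
}

# Note: the results table below shows exactly one optimizer step per
# epoch, so the entire training set fits in a single batch of at most
# train_batch_size (160) examples.
```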

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| No log        | 1.0   | 1    | 6.2090          |
| No log        | 2.0   | 2    | 6.0772          |
| No log        | 3.0   | 3    | 5.9414          |
| No log        | 4.0   | 4    | 5.7888          |
| No log        | 5.0   | 5    | 5.6358          |
| No log        | 6.0   | 6    | 5.4781          |
| No log        | 7.0   | 7    | 5.3259          |
| No log        | 8.0   | 8    | 5.1904          |
| No log        | 9.0   | 9    | 5.0758          |
| No log        | 10.0  | 10   | 4.9715          |
| No log        | 11.0  | 11   | 4.8743          |
| No log        | 12.0  | 12   | 4.7854          |
| No log        | 13.0  | 13   | 4.6994          |
| No log        | 14.0  | 14   | 4.6228          |
| No log        | 15.0  | 15   | 4.5529          |
| No log        | 16.0  | 16   | 4.4915          |
| No log        | 17.0  | 17   | 4.4442          |
| No log        | 18.0  | 18   | 4.4073          |
| No log        | 19.0  | 19   | 4.3770          |
| No log        | 20.0  | 20   | 4.3537          |
| No log        | 21.0  | 21   | 4.3292          |
| No log        | 22.0  | 22   | 4.3063          |
| No log        | 23.0  | 23   | 4.2858          |
| No log        | 24.0  | 24   | 4.2697          |
| No log        | 25.0  | 25   | 4.2673          |
| No log        | 26.0  | 26   | 4.2773          |
| No log        | 27.0  | 27   | 4.2956          |
| No log        | 28.0  | 28   | 4.3234          |
| No log        | 29.0  | 29   | 4.3491          |
| No log        | 30.0  | 30   | 4.3719          |
| No log        | 31.0  | 31   | 4.3902          |
| No log        | 32.0  | 32   | 4.4187          |
| No log        | 33.0  | 33   | 4.4387          |
| No log        | 34.0  | 34   | 4.4576          |
| No log        | 35.0  | 35   | 4.4799          |
| No log        | 36.0  | 36   | 4.5204          |
| No log        | 37.0  | 37   | 4.5620          |
| No log        | 38.0  | 38   | 4.6024          |
| No log        | 39.0  | 39   | 4.6346          |
| No log        | 40.0  | 40   | 4.6566          |
| No log        | 41.0  | 41   | 4.6756          |
| No log        | 42.0  | 42   | 4.6900          |
| No log        | 43.0  | 43   | 4.6994          |
| No log        | 44.0  | 44   | 4.7294          |
| No log        | 45.0  | 45   | 4.7631          |
| No log        | 46.0  | 46   | 4.7912          |
| No log        | 47.0  | 47   | 4.8192          |
| No log        | 48.0  | 48   | 4.8414          |
| No log        | 49.0  | 49   | 4.8611          |
| No log        | 50.0  | 50   | 4.9109          |
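Validation loss reaches its minimum at epoch 25 (4.2673) and climbs steadily afterwards, a typical overfitting signature; stopping near that epoch would likely yield a better checkpoint than the final one reported above. The minimum can be located directly from the table:

```python
# Validation losses per epoch, copied from the table above (epochs 1-50).
val_losses = [
    6.2090, 6.0772, 5.9414, 5.7888, 5.6358, 5.4781, 5.3259, 5.1904,
    5.0758, 4.9715, 4.8743, 4.7854, 4.6994, 4.6228, 4.5529, 4.4915,
    4.4442, 4.4073, 4.3770, 4.3537, 4.3292, 4.3063, 4.2858, 4.2697,
    4.2673, 4.2773, 4.2956, 4.3234, 4.3491, 4.3719, 4.3902, 4.4187,
    4.4387, 4.4576, 4.4799, 4.5204, 4.5620, 4.6024, 4.6346, 4.6566,
    4.6756, 4.6900, 4.6994, 4.7294, 4.7631, 4.7912, 4.8192, 4.8414,
    4.8611, 4.9109,
]

# Epochs are 1-indexed in the table.
best_epoch = min(range(len(val_losses)), key=val_losses.__getitem__) + 1
best_loss = val_losses[best_epoch - 1]
print(best_epoch, best_loss)  # -> 25 4.2673
```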

Framework versions

  • Transformers 4.36.2
  • Pytorch 2.1.2+cu121
  • Datasets 2.14.7
  • Tokenizers 0.15.0
Model files

  • Format: Safetensors
  • Model size: 108M params
  • Tensor type: F32