---
license: mit
tags:
  - generated_from_trainer
datasets:
  - squad
model-index:
  - name: bert-tiny-finetuned-squad
    results: []
---

# bert-tiny-finetuned-squad

This model is a fine-tuned version of [prajjwal1/bert-tiny](https://huggingface.co/prajjwal1/bert-tiny) on the SQuAD dataset. It achieves the following results on the evaluation set:

- Loss: 2.3478

## Model description

More information needed

## Intended uses & limitations

More information needed
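Until fuller documentation is added, the checkpoint can be tried with the standard `transformers` question-answering pipeline. This is a minimal sketch; the Hub repo id below is an assumption based on the model name and should be replaced with the actual id of this checkpoint.

```python
from transformers import pipeline

# Assumed Hub repo id -- substitute the actual id of this checkpoint.
qa = pipeline(
    "question-answering",
    model="anas-awadalla/bert-tiny-finetuned-squad",
)

result = qa(
    question="What dataset was the model fine-tuned on?",
    context="This model is a fine-tuned version of bert-tiny on the SQuAD dataset.",
)
print(result["answer"], result["score"])
```

Note that bert-tiny is a very small model (2 layers, hidden size 128), so answer quality will be well below full-size BERT baselines on SQuAD.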

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 2e-05
- train_batch_size: 32
- eval_batch_size: 32
- seed: 42
- optimizer: Adam with betas=(0.9, 0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 20

### Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.7198        | 1.0   | 2767  | 2.5916          |
| 2.6502        | 2.0   | 5534  | 2.5309          |
| 2.5589        | 3.0   | 8301  | 2.4849          |
| 2.5128        | 4.0   | 11068 | 2.4474          |
| 2.4631        | 5.0   | 13835 | 2.4226          |
| 2.4165        | 6.0   | 16602 | 2.3987          |
| 2.3634        | 7.0   | 19369 | 2.4002          |
| 2.3243        | 8.0   | 22136 | 2.3855          |
| 2.2799        | 9.0   | 24903 | 2.3827          |
| 2.2596        | 10.0  | 27670 | 2.3641          |
| 2.2133        | 11.0  | 30437 | 2.3464          |
| 2.1911        | 12.0  | 33204 | 2.3551          |
| 2.1621        | 13.0  | 35971 | 2.3553          |
| 2.1421        | 14.0  | 38738 | 2.3584          |
| 2.1202        | 15.0  | 41505 | 2.3501          |
| 2.1154        | 16.0  | 44272 | 2.3548          |
| 2.1237        | 17.0  | 47039 | 2.3456          |
| 2.0952        | 18.0  | 49806 | 2.3507          |
| 2.0886        | 19.0  | 52573 | 2.3500          |
| 2.1062        | 20.0  | 55340 | 2.3478          |

### Framework versions

- Transformers 4.15.0
- Pytorch 1.10.0+cu111
- Datasets 1.18.0
- Tokenizers 0.10.3