metadata
license: apache-2.0
tags:
  - generated_from_trainer
model-index:
  - name: bert-large-cased-sigir-LR100-0-prepend-40
    results: []

bert-large-cased-sigir-LR100-0-prepend-40

This model is a fine-tuned version of bert-large-cased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1764
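For intuition, the reported loss can be converted to a perplexity. This is only a sketch under an assumption the card does not confirm: that the loss is a mean token-level cross-entropy in nats, as is typical for language-modeling objectives. If the model was trained on a different task, the number below is not meaningful.

```python
import math

# Evaluation loss as reported in this model card.
eval_loss = 1.1764

# Assumption (not stated in the card): the loss is a mean cross-entropy
# measured in nats, so perplexity is simply exp(loss).
perplexity = math.exp(eval_loss)

print(f"perplexity ~ {perplexity:.2f}")
```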

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 4e-05
  • train_batch_size: 30
  • eval_batch_size: 30
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
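As a rough sketch, the values above could be written as keyword arguments using the parameter names of `transformers.TrainingArguments`, together with the learning rate that a linear schedule implies at a given step. Several points are assumptions rather than facts from this card: that `train_batch_size` corresponds to a single device, that no warmup was used, and that the total of 120 optimizer steps (taken from the training results table) is the schedule length.

```python
# Hyperparameters from this card, keyed by TrainingArguments parameter names.
TRAINING_KWARGS = {
    "learning_rate": 4e-5,
    "per_device_train_batch_size": 30,  # assumes a single device
    "per_device_eval_batch_size": 30,   # assumes a single device
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "num_train_epochs": 40.0,
    "fp16": True,  # "Native AMP" mixed precision
}

TOTAL_STEPS = 120  # 40 epochs x 3 steps per epoch, per the results table
WARMUP_STEPS = 0   # assumption: the card lists no warmup setting


def linear_lr(step, base_lr=TRAINING_KWARGS["learning_rate"],
              total_steps=TOTAL_STEPS, warmup_steps=WARMUP_STEPS):
    """Learning rate at `step` for linear warmup followed by linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for step in (0, 30, 60, 90, 120):
        print(step, linear_lr(step))
```

With `transformers` installed, the dictionary could be passed as `TrainingArguments(output_dir="out", **TRAINING_KWARGS)`, where `output_dir="out"` is a hypothetical path; note that `fp16=True` requires a CUDA device.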

Training results

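| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 2.6453        | 1.0   | 3    | 2.0522          |
| 2.0488        | 2.0   | 6    | 1.7600          |
| 1.9917        | 3.0   | 9    | 2.3036          |
| 1.6084        | 4.0   | 12   | 1.4050          |
| 1.856         | 5.0   | 15   | 1.3598          |
| 1.6471        | 6.0   | 18   | 1.5274          |
| 1.2358        | 7.0   | 21   | 1.6642          |
| 1.4355        | 8.0   | 24   | 1.6109          |
| 1.5753        | 9.0   | 27   | 1.8690          |
| 1.5374        | 10.0  | 30   | 1.7986          |
| 1.5063        | 11.0  | 33   | 1.4979          |
| 1.2185        | 12.0  | 36   | 0.7390          |
| 1.6042        | 13.0  | 39   | 1.1280          |
| 1.1938        | 14.0  | 42   | 1.1252          |
| 1.3215        | 15.0  | 45   | 1.6827          |
| 1.0789        | 16.0  | 48   | 1.6349          |
| 1.095         | 17.0  | 51   | 2.6303          |
| 1.0088        | 18.0  | 54   | 0.9429          |
| 1.015         | 19.0  | 57   | 1.4165          |
| 1.2432        | 20.0  | 60   | 2.1061          |
| 1.3365        | 21.0  | 63   | 1.5785          |
| 1.2704        | 22.0  | 66   | 2.1850          |
| 0.972         | 23.0  | 69   | 1.7769          |
| 0.9052        | 24.0  | 72   | 1.5376          |
| 0.976         | 25.0  | 75   | 2.1072          |
| 1.1134        | 26.0  | 78   | 2.4425          |
| 0.8328        | 27.0  | 81   | 1.5937          |
| 1.1662        | 28.0  | 84   | 1.3542          |
| 0.8575        | 29.0  | 87   | 1.2236          |
| 0.728         | 30.0  | 90   | 1.2229          |
| 1.1601        | 31.0  | 93   | 2.3723          |
| 0.9426        | 32.0  | 96   | 1.6974          |
| 0.8246        | 33.0  | 99   | 1.6610          |
| 0.9777        | 34.0  | 102  | 1.1179          |
| 0.7588        | 35.0  | 105  | 1.8809          |
| 0.6929        | 36.0  | 108  | 1.9128          |
| 0.6794        | 37.0  | 111  | 1.2689          |
| 0.811         | 38.0  | 114  | 1.6715          |
| 0.6805        | 39.0  | 117  | 2.0424          |
| 0.9157        | 40.0  | 120  | 1.4210          |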
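The validation loss in the table above fluctuates considerably from epoch to epoch, so it can help to locate the best epoch programmatically. The sketch below does that and also estimates the training set size implied by 3 steps per epoch at batch size 30. The size estimate rests on assumptions not confirmed by this card: a single device, no gradient accumulation, and a final partial batch that is kept rather than dropped.

```python
# Validation loss for epochs 1..40, copied from the training results table.
VAL_LOSS = [
    2.0522, 1.7600, 2.3036, 1.4050, 1.3598, 1.5274, 1.6642, 1.6109,
    1.8690, 1.7986, 1.4979, 0.7390, 1.1280, 1.1252, 1.6827, 1.6349,
    2.6303, 0.9429, 1.4165, 2.1061, 1.5785, 2.1850, 1.7769, 1.5376,
    2.1072, 2.4425, 1.5937, 1.3542, 1.2236, 1.2229, 2.3723, 1.6974,
    1.6610, 1.1179, 1.8809, 1.9128, 1.2689, 1.6715, 2.0424, 1.4210,
]
STEPS_PER_EPOCH = 3    # the Step column advances by 3 each epoch
TRAIN_BATCH_SIZE = 30  # train_batch_size from the hyperparameter list

# Epoch (1-based) with the lowest validation loss, and the matching step.
best_epoch = min(range(1, len(VAL_LOSS) + 1), key=lambda e: VAL_LOSS[e - 1])
best_loss = VAL_LOSS[best_epoch - 1]
best_step = best_epoch * STEPS_PER_EPOCH

# ceil(N / batch_size) == steps_per_epoch bounds the number of examples N.
# Assumes one device, no gradient accumulation, and drop_last=False.
min_examples = (STEPS_PER_EPOCH - 1) * TRAIN_BATCH_SIZE + 1
max_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

print(f"best epoch: {best_epoch} (step {best_step}), val loss {best_loss}")
print(f"final epoch val loss: {VAL_LOSS[-1]}")
print(f"implied training set size: {min_examples} to {max_examples} examples")
```

With so few optimizer steps per epoch, a gap between the best and the final validation loss is not surprising; selecting a checkpoint by validation loss rather than using the last epoch may be worth considering.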

Framework versions

  • Transformers 4.26.0
  • PyTorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2
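To check a local environment against the versions listed above, something like the following could be used. It is a minimal sketch built on the standard library's `importlib.metadata`; the distribution names (`transformers`, `torch`, `datasets`, `tokenizers`) are the usual PyPI names and are assumed here, and the helper name `compare_versions` is purely illustrative.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions listed in this card, keyed by (assumed) PyPI distribution name.
PINNED = {
    "transformers": "4.26.0",
    "torch": "1.13.1+cu116",
    "datasets": "2.9.0",
    "tokenizers": "0.13.2",
}


def compare_versions(pinned=PINNED):
    """Return {package: (expected, installed_or_None, matches)}."""
    report = {}
    for package, expected in pinned.items():
        try:
            installed = version(package)
        except PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (expected, installed, installed == expected)
    return report


if __name__ == "__main__":
    for package, (expected, installed, ok) in compare_versions().items():
        status = "ok" if ok else "differs"
        print(f"{package}: expected {expected}, found {installed} ({status})")
```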