bert-large-cased-sigir-support-refute-no-label-40

This model is a fine-tuned version of bert-large-cased on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.8371
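
The card does not state the task or dataset, only that evaluation loss was tracked. Below is a minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub under the repository name in the title and that it carries a masked-language-modeling head (consistent with a loss-only evaluation); if it was instead fine-tuned with a classification head (the name hints at support/refute/no-label classes), load it with `AutoModelForSequenceClassification` instead.

```python
# Minimal loading sketch; the repository id is assumed from the card title.
from transformers import AutoModelForMaskedLM, AutoTokenizer, pipeline

model_id = "bert-large-cased-sigir-support-refute-no-label-40"  # assumed Hub id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)  # assumed MLM head

# Quick smoke test with the fill-mask pipeline.
fill = pipeline("fill-mask", model=model, tokenizer=tokenizer)
print(fill("The study results [MASK] the original claim."))
```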

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a `TrainingArguments` sketch follows the list):

  • learning_rate: 4e-05
  • train_batch_size: 64
  • eval_batch_size: 64
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40.0
  • mixed_precision_training: Native AMP
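
These settings map directly onto `transformers.TrainingArguments`. The sketch below is a hedged reconstruction, not the authors' script: it assumes single-GPU training (so `train_batch_size` becomes `per_device_train_batch_size`), the Adam betas and epsilon in the list are the library defaults, and the output directory name is illustrative.

```python
# Hedged reconstruction of the reported hyperparameters as
# TrainingArguments (transformers 4.26.1); not the authors' script.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-large-cased-sigir-support-refute-no-label-40",  # illustrative
    learning_rate=4e-5,
    per_device_train_batch_size=64,   # assumes a single GPU
    per_device_eval_batch_size=64,
    seed=42,
    adam_beta1=0.9,                   # library default, matches the card
    adam_beta2=0.999,                 # library default, matches the card
    adam_epsilon=1e-8,                # library default, matches the card
    lr_scheduler_type="linear",
    num_train_epochs=40.0,
    fp16=True,                        # "Native AMP" mixed precision
)
```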

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|:-------------:|:-----:|:-----:|:---------------:|
| 2.4511        | 1.0   | 252   | 2.0790          |
| 2.0373        | 2.0   | 504   | 1.8538          |
| 1.8052        | 3.0   | 756   | 1.6633          |
| 1.6663        | 4.0   | 1008  | 1.5591          |
| 1.5556        | 5.0   | 1260  | 1.4441          |
| 1.4505        | 6.0   | 1512  | 1.3836          |
| 1.3619        | 7.0   | 1764  | 1.3255          |
| 1.2968        | 8.0   | 2016  | 1.2505          |
| 1.2332        | 9.0   | 2268  | 1.2165          |
| 1.1788        | 10.0  | 2520  | 1.1517          |
| 1.1408        | 11.0  | 2772  | 1.1446          |
| 1.0992        | 12.0  | 3024  | 1.1512          |
| 1.0578        | 13.0  | 3276  | 1.1058          |
| 1.0277        | 14.0  | 3528  | 1.0662          |
| 1.0036        | 15.0  | 3780  | 1.0270          |
| 0.9655        | 16.0  | 4032  | 1.0207          |
| 0.9364        | 17.0  | 4284  | 1.0220          |
| 0.9085        | 18.0  | 4536  | 0.9874          |
| 0.8897        | 19.0  | 4788  | 0.9658          |
| 0.8661        | 20.0  | 5040  | 0.9603          |
| 0.8434        | 21.0  | 5292  | 0.9754          |
| 0.8248        | 22.0  | 5544  | 0.9406          |
| 0.8052        | 23.0  | 5796  | 0.9154          |
| 0.7975        | 24.0  | 6048  | 0.8760          |
| 0.7854        | 25.0  | 6300  | 0.8688          |
| 0.7673        | 26.0  | 6552  | 0.8536          |
| 0.7463        | 27.0  | 6804  | 0.8544          |
| 0.7412        | 28.0  | 7056  | 0.8514          |
| 0.7319        | 29.0  | 7308  | 0.8356          |
| 0.7143        | 30.0  | 7560  | 0.8832          |
| 0.7081        | 31.0  | 7812  | 0.8421          |
| 0.7026        | 32.0  | 8064  | 0.8295          |
| 0.687         | 33.0  | 8316  | 0.8401          |
| 0.6882        | 34.0  | 8568  | 0.8053          |
| 0.679         | 35.0  | 8820  | 0.8438          |
| 0.6672        | 36.0  | 9072  | 0.8450          |
| 0.6669        | 37.0  | 9324  | 0.8231          |
| 0.6665        | 38.0  | 9576  | 0.8410          |
| 0.6596        | 39.0  | 9828  | 0.7909          |
| 0.6556        | 40.0  | 10080 | 0.8019          |

Framework versions

  • Transformers 4.26.1
  • Pytorch 1.13.1+cu116
  • Datasets 2.9.0
  • Tokenizers 0.13.2
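
To compare a local environment against these pins, a small sketch (the imports are the standard module names of the libraries listed above):

```python
# Print installed versions to compare against the pins in this card.
import datasets
import tokenizers
import torch
import transformers

print("Transformers:", transformers.__version__)  # card reports 4.26.1
print("PyTorch:", torch.__version__)              # card reports 1.13.1+cu116
print("Datasets:", datasets.__version__)          # card reports 2.9.0
print("Tokenizers:", tokenizers.__version__)      # card reports 0.13.2
```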