20230823015034

This model is a fine-tuned version of bert-large-cased on the super_glue dataset. It achieves the following results on the evaluation set (a minimal usage sketch follows the results):

  • Loss: 0.0704
  • Accuracy: 0.4729
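
The snippet below is a minimal usage sketch, not part of the original card. It assumes the checkpoint is published under the repository id dkqjrm/20230823015034 (the id shown on the Hub page) and that the head is a binary sequence classifier; the card does not state which SuperGLUE subtask was used, so the sentence-pair input is purely illustrative.

```python
# Minimal usage sketch (assumption: checkpoint id "dkqjrm/20230823015034",
# binary classification head; the SuperGLUE subtask is not stated on this card).
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "dkqjrm/20230823015034"  # assumed Hub repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)

# Sentence-pair input assumed for illustration only.
inputs = tokenizer("A premise sentence.", "A hypothesis sentence.",
                   return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
print(logits.argmax(dim=-1).item())  # predicted class index; label meaning depends on the subtask
```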

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a sketch of the corresponding Trainer setup follows the list):

  • learning_rate: 0.01
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 11
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 60.0
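
Since the card lists Trainer-style hyperparameters but not the training script itself, the following is a hedged reconstruction of how they map onto a TrainingArguments configuration (Transformers 4.26.1 API); the output directory name, dataset preprocessing, and model head are assumptions.

```python
# Reconstruction sketch of a Trainer setup matching the hyperparameters above.
# The actual training script is not published; task preprocessing is omitted.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

model = AutoModelForSequenceClassification.from_pretrained("bert-large-cased")

args = TrainingArguments(
    output_dir="20230823015034",      # assumed output directory name
    learning_rate=0.01,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=11,
    num_train_epochs=60.0,
    lr_scheduler_type="linear",
    evaluation_strategy="epoch",      # matches the per-epoch log in the results table
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the Trainer defaults.
)

# trainer = Trainer(model=model, args=args,
#                   train_dataset=...,  # super_glue subtask preprocessing not shown
#                   eval_dataset=...)
# trainer.train()
```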

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| No log | 1.0 | 312 | 0.3443 | 0.5271 |
| 0.5033 | 2.0 | 624 | 0.0706 | 0.4946 |
| 0.5033 | 3.0 | 936 | 0.0765 | 0.5199 |
| 0.0871 | 4.0 | 1248 | 0.0783 | 0.5235 |
| 0.089 | 5.0 | 1560 | 0.0730 | 0.4729 |
| 0.089 | 6.0 | 1872 | 0.1001 | 0.4729 |
| 0.089 | 7.0 | 2184 | 0.0714 | 0.4729 |
| 0.089 | 8.0 | 2496 | 0.0726 | 0.5054 |
| 0.0851 | 9.0 | 2808 | 0.0780 | 0.5271 |
| 0.1659 | 10.0 | 3120 | 0.0799 | 0.5271 |
| 0.1659 | 11.0 | 3432 | 0.0717 | 0.5379 |
| 0.0968 | 12.0 | 3744 | 0.0706 | 0.4729 |
| 0.085 | 13.0 | 4056 | 0.0829 | 0.5271 |
| 0.085 | 14.0 | 4368 | 0.0776 | 0.5271 |
| 0.088 | 15.0 | 4680 | 0.0760 | 0.4729 |
| 0.088 | 16.0 | 4992 | 0.2058 | 0.5271 |
| 0.084 | 17.0 | 5304 | 0.0726 | 0.4910 |
| 0.0906 | 18.0 | 5616 | 0.0708 | 0.4729 |
| 0.0906 | 19.0 | 5928 | 0.0714 | 0.4729 |
| 0.0865 | 20.0 | 6240 | 0.0707 | 0.4729 |
| 0.0852 | 21.0 | 6552 | 0.0733 | 0.4729 |
| 0.0852 | 22.0 | 6864 | 0.0843 | 0.5271 |
| 0.0848 | 23.0 | 7176 | 0.0849 | 0.4729 |
| 0.0848 | 24.0 | 7488 | 0.0877 | 0.4729 |
| 0.0837 | 25.0 | 7800 | 0.0704 | 0.4729 |
| 0.0828 | 26.0 | 8112 | 0.0740 | 0.5271 |
| 0.0828 | 27.0 | 8424 | 0.0710 | 0.4729 |
| 0.0856 | 28.0 | 8736 | 0.0717 | 0.4729 |
| 0.0836 | 29.0 | 9048 | 0.0715 | 0.4729 |
| 0.0836 | 30.0 | 9360 | 0.0709 | 0.4657 |
| 0.0813 | 31.0 | 9672 | 0.0891 | 0.5271 |
| 0.0813 | 32.0 | 9984 | 0.0711 | 0.4874 |
| 0.0824 | 33.0 | 10296 | 0.0753 | 0.4729 |
| 0.0825 | 34.0 | 10608 | 0.0797 | 0.5271 |
| 0.0825 | 35.0 | 10920 | 0.0710 | 0.4729 |
| 0.0819 | 36.0 | 11232 | 0.0739 | 0.4729 |
| 0.0811 | 37.0 | 11544 | 0.0743 | 0.4729 |
| 0.0811 | 38.0 | 11856 | 0.0731 | 0.4729 |
| 0.0816 | 39.0 | 12168 | 0.0707 | 0.4693 |
| 0.0816 | 40.0 | 12480 | 0.0706 | 0.4729 |
| 0.0804 | 41.0 | 12792 | 0.0716 | 0.5451 |
| 0.0805 | 42.0 | 13104 | 0.0703 | 0.4729 |
| 0.0805 | 43.0 | 13416 | 0.0720 | 0.5271 |
| 0.0801 | 44.0 | 13728 | 0.0711 | 0.4729 |
| 0.08 | 45.0 | 14040 | 0.0716 | 0.5307 |
| 0.08 | 46.0 | 14352 | 0.0706 | 0.4729 |
| 0.0795 | 47.0 | 14664 | 0.0727 | 0.4729 |
| 0.0795 | 48.0 | 14976 | 0.0703 | 0.4729 |
| 0.0792 | 49.0 | 15288 | 0.0716 | 0.4729 |
| 0.0791 | 50.0 | 15600 | 0.0705 | 0.4729 |
| 0.0791 | 51.0 | 15912 | 0.0706 | 0.4729 |
| 0.0793 | 52.0 | 16224 | 0.0715 | 0.4729 |
| 0.0785 | 53.0 | 16536 | 0.0703 | 0.4729 |
| 0.0785 | 54.0 | 16848 | 0.0704 | 0.4729 |
| 0.0778 | 55.0 | 17160 | 0.0724 | 0.4729 |
| 0.0778 | 56.0 | 17472 | 0.0706 | 0.4729 |
| 0.0779 | 57.0 | 17784 | 0.0706 | 0.4729 |
| 0.0777 | 58.0 | 18096 | 0.0708 | 0.4729 |
| 0.0777 | 59.0 | 18408 | 0.0704 | 0.4729 |
| 0.0777 | 60.0 | 18720 | 0.0704 | 0.4729 |
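
The Accuracy column above is computed once per epoch on the evaluation set. The card does not include the metric code, so the hook below is only a minimal sketch of how such a metric is typically wired into Trainer via compute_metrics; the function name and implementation are assumptions.

```python
# Sketch of a compute_metrics hook that would produce the Accuracy column
# above (the actual evaluation code is not included on this card).
import numpy as np

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # predicted class per example
    return {"accuracy": float((preds == labels).mean())}

# Passed to Trainer as: Trainer(..., compute_metrics=compute_metrics)
```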

Framework versions

  • Transformers 4.26.1
  • Pytorch 2.0.1+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3