mobilebert_sa_pre-training-complete

This model is a fine-tuned version of google/mobilebert-uncased on the wikitext dataset (wikitext-103-raw-v1 configuration). It achieves the following results on the evaluation set:

  • Loss: 1.3239
  • Accuracy: 0.7162
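
For a sense of scale, a cross-entropy loss can be converted to a perplexity by exponentiating it. The sketch below assumes the reported loss is a mean per-token cross-entropy measured in nats (the default for PyTorch's cross-entropy loss); the card does not state this explicitly, and the variable names are illustrative only.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 1.3239

# Assumption: the loss is a mean cross-entropy in nats, so perplexity = exp(loss).
perplexity = math.exp(eval_loss)
print(f"perplexity ~ {perplexity:.2f}")
```

Under that assumption the evaluation loss corresponds to a perplexity of about 3.76.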

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 32
  • eval_batch_size: 32
  • seed: 10
  • distributed_type: multi-GPU
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 100
  • training_steps: 300000
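
The learning-rate settings above can be sketched as a plain function of the step count. This assumes `lr_scheduler_type: linear` has its usual Transformers meaning (linear warmup from zero to the peak rate, then linear decay to zero at the final step); `linear_lr` is a hypothetical helper written for illustration, not a function from the training script.

```python
def linear_lr(step, base_lr=5e-05, warmup_steps=100, total_steps=300000):
    """Learning rate at a given optimizer step: linear warmup, then linear decay."""
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Sample the schedule at the start, mid-warmup, peak, mid-decay, and end.
for step in (0, 50, 100, 150050, 300000):
    print(step, linear_lr(step))
```

With only 100 warmup steps out of 300000, the rate reaches its 5e-05 peak almost immediately and spends nearly the whole run decaying.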

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|---|---|---|---|---|
| 1.6028 | 1.0 | 7145 | 1.4525 | 0.6935 |
| 1.5524 | 2.0 | 14290 | 1.4375 | 0.6993 |
| 1.5323 | 3.0 | 21435 | 1.4194 | 0.6993 |
| 1.5191 | 4.0 | 28580 | 1.4110 | 0.7027 |
| 1.5025 | 5.0 | 35725 | 1.4168 | 0.7014 |
| 1.4902 | 6.0 | 42870 | 1.3931 | 0.7012 |
| 1.4813 | 7.0 | 50015 | 1.3738 | 0.7057 |
| 1.4751 | 8.0 | 57160 | 1.4237 | 0.6996 |
| 1.4689 | 9.0 | 64305 | 1.3969 | 0.7047 |
| 1.4626 | 10.0 | 71450 | 1.3916 | 0.7068 |
| 1.4566 | 11.0 | 78595 | 1.3686 | 0.7072 |
| 1.451 | 12.0 | 85740 | 1.3811 | 0.7060 |
| 1.4478 | 13.0 | 92885 | 1.3598 | 0.7092 |
| 1.4441 | 14.0 | 100030 | 1.3790 | 0.7054 |
| 1.4379 | 15.0 | 107175 | 1.3794 | 0.7066 |
| 1.4353 | 16.0 | 114320 | 1.3609 | 0.7102 |
| 1.43 | 17.0 | 121465 | 1.3685 | 0.7083 |
| 1.4278 | 18.0 | 128610 | 1.3953 | 0.7036 |
| 1.4219 | 19.0 | 135755 | 1.3756 | 0.7085 |
| 1.4197 | 20.0 | 142900 | 1.3597 | 0.7090 |
| 1.4169 | 21.0 | 150045 | 1.3673 | 0.7061 |
| 1.4146 | 22.0 | 157190 | 1.3753 | 0.7073 |
| 1.4109 | 23.0 | 164335 | 1.3696 | 0.7082 |
| 1.4073 | 24.0 | 171480 | 1.3563 | 0.7092 |
| 1.4054 | 25.0 | 178625 | 1.3712 | 0.7103 |
| 1.402 | 26.0 | 185770 | 1.3528 | 0.7113 |
| 1.4001 | 27.0 | 192915 | 1.3367 | 0.7123 |
| 1.397 | 28.0 | 200060 | 1.3508 | 0.7118 |
| 1.3955 | 29.0 | 207205 | 1.3572 | 0.7117 |
| 1.3937 | 30.0 | 214350 | 1.3566 | 0.7095 |
| 1.3901 | 31.0 | 221495 | 1.3515 | 0.7117 |
| 1.3874 | 32.0 | 228640 | 1.3445 | 0.7118 |
| 1.386 | 33.0 | 235785 | 1.3611 | 0.7097 |
| 1.3833 | 34.0 | 242930 | 1.3502 | 0.7087 |
| 1.3822 | 35.0 | 250075 | 1.3657 | 0.7108 |
| 1.3797 | 36.0 | 257220 | 1.3576 | 0.7108 |
| 1.3793 | 37.0 | 264365 | 1.3472 | 0.7106 |
| 1.3763 | 38.0 | 271510 | 1.3323 | 0.7156 |
| 1.3762 | 39.0 | 278655 | 1.3325 | 0.7145 |
| 1.3748 | 40.0 | 285800 | 1.3243 | 0.7138 |
| 1.3733 | 41.0 | 292945 | 1.3218 | 0.7170 |
| 1.3722 | 41.99 | 300000 | 1.3074 | 0.7186 |
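
The fractional final epoch in the last row follows from the step budget: training stops at a fixed number of steps rather than at an epoch boundary. A short check of that arithmetic, using only numbers from this card (the variable names are illustrative):

```python
steps_per_epoch = 7145   # step count logged at epoch 1.0
total_steps = 300000     # training_steps from the hyperparameters

# Fraction of epochs completed when the step budget runs out.
epochs_completed = total_steps / steps_per_epoch

# Whole epochs finished, and steps taken into the final, partial epoch.
full_epochs, leftover_steps = divmod(total_steps, steps_per_epoch)

print(round(epochs_completed, 2))
print(full_epochs, leftover_steps)
```

The run finishes 41 full epochs and stops 7055 steps into the 42nd, which is why the last row is logged at epoch 41.99 rather than 42.0.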

Framework versions

  • Transformers 4.26.0
  • Pytorch 1.14.0a0+410ce96
  • Datasets 2.9.0
  • Tokenizers 0.13.2
