---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- wikitext
metrics:
- accuracy
model-index:
- name: mobilebert_add_pre-training-complete
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: wikitext wikitext-103-raw-v1
      type: wikitext
      config: wikitext-103-raw-v1
      split: validation
      args: wikitext-103-raw-v1
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.46066022044935584
---

# mobilebert_add_pre-training-complete

This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) on the wikitext wikitext-103-raw-v1 dataset.
It achieves the following results on the evaluation set:
- Loss: 2.9849
- Accuracy: 0.4607

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300000

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:------:|:------:|:---------------:|:--------:|
| 4.8119 | 1.0 | 1787 | 4.3700 | 0.3199 |
| 4.2649 | 2.0 | 3574 | 4.0930 | 0.3445 |
| 4.0457 | 3.0 | 5361 | 3.9375 | 0.3545 |
| 3.9099 | 4.0 | 7148 | 3.8534 | 0.3644 |
| 3.8193 | 5.0 | 8935 | 3.7993 | 0.3669 |
| 3.7517 | 6.0 | 10722 | 3.7414 | 0.3730 |
| 3.6983 | 7.0 | 12509 | 3.6737 | 0.3818 |
| 3.6565 | 8.0 | 14296 | 3.6657 | 0.3794 |
| 3.619 | 9.0 | 16083 | 3.6129 | 0.3869 |
| 3.5899 | 10.0 | 17870 | 3.5804 | 0.3910 |
| 3.5597 | 11.0 | 19657 | 3.5432 | 0.3964 |
| 3.5329 | 12.0 | 21444 | 3.5397 | 0.3958 |
| 3.5088 | 13.0 | 23231 | 3.4896 | 0.4011 |
| 3.4904 | 14.0 | 25018 | 3.4731 | 0.4000 |
| 3.4703 | 15.0 | 26805 | 3.4971 | 0.3994 |
| 3.4533 | 16.0 | 28592 | 3.4609 | 0.4049 |
| 3.4369 | 17.0 | 30379 | 3.4411 | 0.4067 |
| 3.423 | 18.0 | 32166 | 3.4219 | 0.4066 |
| 3.4084 | 19.0 | 33953 | 3.4477 | 0.4014 |
| 3.3949 | 20.0 | 35740 | 3.4013 | 0.4087 |
| 3.3811 | 21.0 | 37527 | 3.3642 | 0.4130 |
| 3.3688 | 22.0 | 39314 | 3.4173 | 0.4031 |
| 3.3598 | 23.0 | 41101 | 3.4018 | 0.4101 |
| 3.3484 | 24.0 | 42888 | 3.3499 | 0.4143 |
| 3.3363 | 25.0 | 44675 | 3.3675 | 0.4119 |
| 3.3274 | 26.0 | 46462 | 3.3562 | 0.4154 |
| 3.3161 | 27.0 | 48249 | 3.3487 | 0.4159 |
| 3.3073 | 28.0 | 50036 | 3.3293 | 0.4159 |
| 3.2991 | 29.0 | 51823 | 3.3317 | 0.4160 |
| 3.2899 | 30.0 | 53610 | 3.3058 | 0.4183 |
| 3.2814 | 31.0 | 55397 | 3.2795 | 0.4235 |
| 3.2734 | 32.0 | 57184 | 3.3185 | 0.4143 |
| 3.266 | 33.0 | 58971 | 3.2682 | 0.4268 |
| 3.2578 | 34.0 | 60758 | 3.3145 | 0.4181 |
| 3.2506 | 35.0 | 62545 | 3.2726 | 0.4230 |
| 3.2423 | 36.0 | 64332 | 3.2735 | 0.4218 |
| 3.2359 | 37.0 | 66119 | 3.2845 | 0.4175 |
| 3.2293 | 38.0 | 67906 | 3.3067 | 0.4193 |
| 3.2207 | 39.0 | 69693 | 3.2586 | 0.4257 |
| 3.2138 | 40.0 | 71480 | 3.2543 | 0.4250 |
| 3.2077 | 41.0 | 73267 | 3.2395 | 0.4226 |
| 3.202 | 42.0 | 75054 | 3.2224 | 0.4270 |
| 3.1964 | 43.0 | 76841 | 3.2562 | 0.4234 |
| 3.1925 | 44.0 | 78628 | 3.2544 | 0.4251 |
| 3.1865 | 45.0 | 80415 | 3.2043 | 0.4353 |
| 3.1812 | 46.0 | 82202 | 3.2280 | 0.4286 |
| 3.1744 | 47.0 | 83989 | 3.2174 | 0.4276 |
| 3.1699 | 48.0 | 85776 | 3.1972 | 0.4317 |
| 3.1652 | 49.0 | 87563 | 3.2016 | 0.4302 |
| 3.1609 | 50.0 | 89350 | 3.2018 | 0.4338 |
| 3.1548 | 51.0 | 91137 | 3.1950 | 0.4327 |
| 3.1508 | 52.0 | 92924 | 3.2128 | 0.4279 |
| 3.1478 | 53.0 | 94711 | 3.2027 | 0.4303 |
| 3.1423 | 54.0 | 96498 | 3.1959 | 0.4312 |
| 3.1383 | 55.0 | 98285 | 3.1911 | 0.4340 |
| 3.1336 | 56.0 | 100072 | 3.1914 | 0.4320 |
| 3.129 | 57.0 | 101859 | 3.1855 | 0.4312 |
| 3.1233 | 58.0 | 103646 | 3.1570 | 0.4337 |
| 3.1198 | 59.0 | 105433 | 3.2042 | 0.4307 |
| 3.1153 | 60.0 | 107220 | 3.1370 | 0.4390 |
| 3.1122 | 61.0 | 109007 | 3.1612 | 0.4412 |
| 3.1093 | 62.0 | 110794 | 3.1642 | 0.4348 |
| 3.1048 | 63.0 | 112581 | 3.1807 | 0.4326 |
| 3.1013 | 64.0 | 114368 | 3.1449 | 0.4359 |
| 3.0977 | 65.0 | 116155 | 3.1408 | 0.4380 |
| 3.0926 | 66.0 | 117942 | 3.1723 | 0.4365 |
| 3.0901 | 67.0 | 119729 | 3.1473 | 0.4380 |
| 3.0882 | 68.0 | 121516 | 3.1401 | 0.4378 |
| 3.0839 | 69.0 | 123303 | 3.1281 | 0.4374 |
| 3.0794 | 70.0 | 125090 | 3.1356 | 0.4367 |
| 3.0766 | 71.0 | 126877 | 3.1019 | 0.4397 |
| 3.074 | 72.0 | 128664 | 3.1626 | 0.4355 |
| 3.0702 | 73.0 | 130451 | 3.1287 | 0.4387 |
| 3.0676 | 74.0 | 132238 | 3.1366 | 0.4379 |
| 3.0648 | 75.0 | 134025 | 3.1782 | 0.4346 |
| 3.0624 | 76.0 | 135812 | 3.1229 | 0.4427 |
| 3.0575 | 77.0 | 137599 | 3.1139 | 0.4430 |
| 3.0549 | 78.0 | 139386 | 3.0948 | 0.4431 |
| 3.052 | 79.0 | 141173 | 3.1030 | 0.4452 |
| 3.0527 | 80.0 | 142960 | 3.0929 | 0.4448 |
| 3.0466 | 81.0 | 144747 | 3.0888 | 0.4428 |
| 3.0439 | 82.0 | 146534 | 3.1035 | 0.4414 |
| 3.0409 | 83.0 | 148321 | 3.1112 | 0.4411 |
| 3.041 | 84.0 | 150108 | 3.1296 | 0.4399 |
| 3.0379 | 85.0 | 151895 | 3.1224 | 0.4428 |
| 3.0332 | 86.0 | 153682 | 3.1101 | 0.4398 |
| 3.0315 | 87.0 | 155469 | 3.1045 | 0.4423 |
| 3.0302 | 88.0 | 157256 | 3.0913 | 0.4446 |
| 3.0265 | 89.0 | 159043 | 3.0745 | 0.4447 |
| 3.0243 | 90.0 | 160830 | 3.0942 | 0.4443 |
| 3.0222 | 91.0 | 162617 | 3.0821 | 0.4432 |
| 3.021 | 92.0 | 164404 | 3.0616 | 0.4473 |
| 3.0183 | 93.0 | 166191 | 3.1021 | 0.4450 |
| 3.0155 | 94.0 | 167978 | 3.1163 | 0.4422 |
| 3.0132 | 95.0 | 169765 | 3.0645 | 0.4493 |
| 3.0118 | 96.0 | 171552 | 3.0922 | 0.4420 |
| 3.0105 | 97.0 | 173339 | 3.1187 | 0.4423 |
| 3.0063 | 98.0 | 175126 | 3.1061 | 0.4462 |
| 3.0035 | 99.0 | 176913 | 3.1098 | 0.4424 |
| 3.0025 | 100.0 | 178700 | 3.0856 | 0.4454 |
| 3.0001 | 101.0 | 180487 | 3.0584 | 0.4504 |
| 2.9979 | 102.0 | 182274 | 3.0897 | 0.4435 |
| 2.9963 | 103.0 | 184061 | 3.0712 | 0.4437 |
| 2.9944 | 104.0 | 185848 | 3.0853 | 0.4458 |
| 2.9931 | 105.0 | 187635 | 3.0809 | 0.4475 |
| 2.992 | 106.0 | 189422 | 3.0910 | 0.4426 |
| 2.9886 | 107.0 | 191209 | 3.0693 | 0.4490 |
| 2.986 | 108.0 | 192996 | 3.0906 | 0.4445 |
| 2.9834 | 109.0 | 194783 | 3.0320 | 0.4538 |
| 2.9829 | 110.0 | 196570 | 3.0760 | 0.4456 |
| 2.9814 | 111.0 | 198357 | 3.0423 | 0.4504 |
| 2.9795 | 112.0 | 200144 | 3.0411 | 0.4529 |
| 2.979 | 113.0 | 201931 | 3.0784 | 0.4463 |
| 2.9781 | 114.0 | 203718 | 3.0526 | 0.4537 |
| 2.9751 | 115.0 | 205505 | 3.0479 | 0.4512 |
| 2.9749 | 116.0 | 207292 | 3.0545 | 0.4493 |
| 2.9735 | 117.0 | 209079 | 3.0529 | 0.4485 |
| 2.9705 | 118.0 | 210866 | 3.0080 | 0.4581 |
| 2.9698 | 119.0 | 212653 | 3.0271 | 0.4537 |
| 2.9674 | 120.0 | 214440 | 3.0477 | 0.4482 |
| 2.9666 | 121.0 | 216227 | 3.0328 | 0.4558 |
| 2.9664 | 122.0 | 218014 | 3.0689 | 0.4463 |
| 2.9639 | 123.0 | 219801 | 3.0749 | 0.4459 |
| 2.9633 | 124.0 | 221588 | 3.0505 | 0.4489 |
| 2.9618 | 125.0 | 223375 | 3.0256 | 0.4535 |
| 2.9589 | 126.0 | 225162 | 3.0522 | 0.4496 |
| 2.9584 | 127.0 | 226949 | 3.0451 | 0.4530 |
| 2.9589 | 128.0 | 228736 | 3.0654 | 0.4502 |
| 2.9581 | 129.0 | 230523 | 2.9989 | 0.4580 |
| 2.9554 | 130.0 | 232310 | 3.0347 | 0.4508 |
| 2.9565 | 131.0 | 234097 | 3.0586 | 0.4498 |
| 2.9548 | 132.0 | 235884 | 3.0170 | 0.4536 |
| 2.9515 | 133.0 | 237671 | 3.0470 | 0.4492 |
| 2.9499 | 134.0 | 239458 | 3.0339 | 0.4515 |
| 2.9514 | 135.0 | 241245 | 3.0474 | 0.4473 |
| 2.9486 | 136.0 | 243032 | 3.0427 | 0.4493 |
| 2.9483 | 137.0 | 244819 | 3.0336 | 0.4534 |
| 2.9491 | 138.0 | 246606 | 3.0274 | 0.4516 |
| 2.9465 | 139.0 | 248393 | 3.0354 | 0.4539 |
| 2.9447 | 140.0 | 250180 | 3.0139 | 0.4526 |
| 2.9449 | 141.0 | 251967 | 3.0163 | 0.4548 |
| 2.9439 | 142.0 | 253754 | 3.0308 | 0.4534 |
| 2.9435 | 143.0 | 255541 | 3.0242 | 0.4579 |
| 2.943 | 144.0 | 257328 | 3.0437 | 0.4513 |
| 2.943 | 145.0 | 259115 | 3.0227 | 0.4544 |
| 2.9403 | 146.0 | 260902 | 3.0464 | 0.4478 |
| 2.9407 | 147.0 | 262689 | 3.0718 | 0.4465 |
| 2.9397 | 148.0 | 264476 | 3.0519 | 0.4487 |
| 2.9392 | 149.0 | 266263 | 3.0163 | 0.4558 |
| 2.9377 | 150.0 | 268050 | 3.0159 | 0.4518 |
| 2.9386 | 151.0 | 269837 | 3.0010 | 0.4545 |
| 2.9391 | 152.0 | 271624 | 3.0346 | 0.4530 |
| 2.9364 | 153.0 | 273411 | 3.0039 | 0.4541 |
| 2.9359 | 154.0 | 275198 | 3.0417 | 0.4519 |
| 2.9359 | 155.0 | 276985 | 3.0161 | 0.4544 |
| 2.936 | 156.0 | 278772 | 3.0169 | 0.4534 |
| 2.9329 | 157.0 | 280559 | 3.0594 | 0.4478 |
| 2.9336 | 158.0 | 282346 | 3.0265 | 0.4555 |
| 2.9341 | 159.0 | 284133 | 3.0276 | 0.4542 |
| 2.933 | 160.0 | 285920 | 3.0324 | 0.4524 |
| 2.9325 | 161.0 | 287707 | 3.0249 | 0.4489 |
| 2.932 | 162.0 | 289494 | 3.0444 | 0.4519 |
| 2.9334 | 163.0 | 291281 | 3.0420 | 0.4494 |
| 2.9318 | 164.0 | 293068 | 2.9972 | 0.4541 |
| 2.9316 | 165.0 | 294855 | 2.9973 | 0.4526 |
| 2.9318 | 166.0 | 296642 | 3.0389 | 0.4529 |
| 2.9301 | 167.0 | 298429 | 3.0131 | 0.4557 |
| 2.9291 | 167.88 | 300000 | 3.0067 | 0.4548 |

### Framework versions

- Transformers 4.26.0
- Pytorch 1.14.0a0+410ce96
- Datasets 2.8.0
- Tokenizers 0.13.2
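
The hyperparameter list above maps fairly directly onto `TrainingArguments` from the Transformers `Trainer` API (the `generated_from_trainer` tag indicates the `Trainer` was used). The sketch below only illustrates that mapping and is not the exact training script: `output_dir` is a placeholder, `evaluation_strategy="epoch"` is an assumption based on the per-epoch rows in the results table, and data preparation, the MLM data collator, the model, and the two-GPU launch are omitted.

```python
from transformers import TrainingArguments

# Illustrative mapping of the hyperparameters listed above onto TrainingArguments.
# output_dir is a placeholder and evaluation_strategy is an assumption.
training_args = TrainingArguments(
    output_dir="mobilebert_add_pre-training-complete",  # placeholder
    learning_rate=5e-5,
    per_device_train_batch_size=64,  # x 2 GPUs -> total train batch size 128
    per_device_eval_batch_size=64,   # x 2 GPUs -> total eval batch size 128
    seed=10,
    max_steps=300_000,
    warmup_steps=100,
    lr_scheduler_type="linear",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="epoch",     # assumption: results table reports per-epoch eval
)
```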
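
Since the card carries no usage section, here is a minimal inference sketch for the fill-mask task this model is evaluated on. The model identifier is a placeholder; replace it with the actual Hub repository path under which this checkpoint is published.

```python
from transformers import pipeline

# Minimal fill-mask sketch; the model id below is a placeholder for the
# actual Hub repo path of this checkpoint.
fill_mask = pipeline("fill-mask", model="mobilebert_add_pre-training-complete")

# MobileBERT is uncased and uses the BERT-style [MASK] token.
print(fill_mask("Paris is the [MASK] of France."))
```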