BERT_WordPiece_phonetic_wikitext
This model was trained on an unknown dataset; the base checkpoint it was fine-tuned from is not specified. It achieves the following results on the evaluation set:
- Loss: 0.6469
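To make the reported loss easier to interpret, the sketch below converts it to a perplexity-style figure. It assumes, which this card does not state, that the loss is a mean cross-entropy measured in nats, as is typical for a BERT-style masked language modelling objective. The variable names are illustrative only.

```python
import math

# Final evaluation loss as reported in this card.
eval_loss = 0.6469

# Assumption: the loss is a mean cross-entropy in nats, so exp(loss)
# gives a (pseudo-)perplexity over the predicted tokens.
perplexity = math.exp(eval_loss)

print(f"eval loss {eval_loss} -> perplexity of about {perplexity:.2f}")
```

Under that assumption the value comes out at roughly 1.91. If the loss was computed differently, this conversion does not apply.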
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.0001
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 10000
- training_steps: 400000
- mixed_precision_training: Native AMP
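The learning rate, warmup, and step count above can be combined into a per-step learning rate. The following is a minimal sketch, assuming the linear scheduler ramps from zero to the peak rate over the warmup steps and then decays linearly to zero at the final training step. The function name `linear_schedule_lr` is hypothetical and not part of the training code.

```python
def linear_schedule_lr(step, base_lr=1e-4, warmup_steps=10_000, total_steps=400_000):
    """Learning rate at `step` for linear warmup followed by linear decay to zero.

    Defaults are the hyperparameters listed in this card; the shape of the
    schedule is an assumption about how `lr_scheduler_type: linear` behaves.
    """
    if step < warmup_steps:
        # Warmup phase: ramp linearly from 0 up to base_lr.
        return base_lr * step / warmup_steps
    # Decay phase: fall linearly from base_lr to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / (total_steps - warmup_steps)


# Print the rate at a few representative steps.
for step in (0, 5_000, 10_000, 205_000, 400_000):
    print(step, linear_schedule_lr(step))
```

Under these assumptions the rate peaks at 0.0001 at step 10000 and has fallen back to half of that by step 205000, the midpoint of the decay phase.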
Training results
Training Loss | Epoch | Step | Validation Loss |
---|---|---|---|
4.8449 | 0.3019 | 2000 | 4.7637 |
4.6361 | 0.6039 | 4000 | 4.5490 |
2.4728 | 0.9058 | 6000 | 2.3238 |
2.0267 | 1.2077 | 8000 | 1.8748 |
1.815 | 1.5097 | 10000 | 1.6849 |
1.6524 | 1.8116 | 12000 | 1.5375 |
1.5396 | 2.1135 | 14000 | 1.4241 |
1.4813 | 2.4155 | 16000 | 1.3695 |
1.4192 | 2.7174 | 18000 | 1.3086 |
1.3677 | 3.0193 | 20000 | 1.2793 |
1.326 | 3.3213 | 22000 | 1.2260 |
1.2983 | 3.6232 | 24000 | 1.2209 |
1.2697 | 3.9251 | 26000 | 1.1787 |
1.244 | 4.2271 | 28000 | 1.1685 |
1.2127 | 4.5290 | 30000 | 1.1367 |
1.1989 | 4.8309 | 32000 | 1.1507 |
1.1649 | 5.1329 | 34000 | 1.1021 |
1.1671 | 5.4348 | 36000 | 1.0730 |
1.1501 | 5.7367 | 38000 | 1.0527 |
1.1351 | 6.0386 | 40000 | 1.0594 |
1.1198 | 6.3406 | 42000 | 1.0574 |
1.1023 | 6.6425 | 44000 | 1.0181 |
1.0947 | 6.9444 | 46000 | 1.0454 |
1.0677 | 7.2464 | 48000 | 1.0026 |
1.0714 | 7.5483 | 50000 | 1.0084 |
1.0671 | 7.8502 | 52000 | 0.9860 |
1.0606 | 8.1522 | 54000 | 0.9996 |
1.0427 | 8.4541 | 56000 | 0.9781 |
1.0336 | 8.7560 | 58000 | 0.9616 |
1.0123 | 9.0580 | 60000 | 0.9639 |
1.019 | 9.3599 | 62000 | 0.9565 |
1.0089 | 9.6618 | 64000 | 0.9411 |
1.0068 | 9.9638 | 66000 | 0.9316 |
0.9888 | 10.2657 | 68000 | 0.9545 |
0.9934 | 10.5676 | 70000 | 0.9429 |
0.9796 | 10.8696 | 72000 | 0.9231 |
0.9718 | 11.1715 | 74000 | 0.9135 |
0.9711 | 11.4734 | 76000 | 0.9146 |
0.963 | 11.7754 | 78000 | 0.9224 |
0.9539 | 12.0773 | 80000 | 0.8856 |
0.9582 | 12.3792 | 82000 | 0.8898 |
0.9438 | 12.6812 | 84000 | 0.9079 |
0.9511 | 12.9831 | 86000 | 0.8911 |
0.9376 | 13.2850 | 88000 | 0.8880 |
0.9413 | 13.5870 | 90000 | 0.8965 |
0.9287 | 13.8889 | 92000 | 0.8797 |
0.9201 | 14.1908 | 94000 | 0.8681 |
0.9238 | 14.4928 | 96000 | 0.8691 |
0.9218 | 14.7947 | 98000 | 0.8558 |
0.9114 | 15.0966 | 100000 | 0.8628 |
0.91 | 15.3986 | 102000 | 0.8666 |
0.9095 | 15.7005 | 104000 | 0.8554 |
0.9059 | 16.0024 | 106000 | 0.8580 |
0.8975 | 16.3043 | 108000 | 0.8411 |
0.8896 | 16.6063 | 110000 | 0.8460 |
0.8914 | 16.9082 | 112000 | 0.8535 |
0.8885 | 17.2101 | 114000 | 0.8272 |
0.888 | 17.5121 | 116000 | 0.8320 |
0.8809 | 17.8140 | 118000 | 0.8241 |
0.8701 | 18.1159 | 120000 | 0.8547 |
0.8745 | 18.4179 | 122000 | 0.8292 |
0.8673 | 18.7198 | 124000 | 0.8275 |
0.8618 | 19.0217 | 126000 | 0.8190 |
0.8592 | 19.3237 | 128000 | 0.8211 |
0.8626 | 19.6256 | 130000 | 0.8161 |
0.8597 | 19.9275 | 132000 | 0.8414 |
0.8528 | 20.2295 | 134000 | 0.7901 |
0.8556 | 20.5314 | 136000 | 0.7929 |
0.851 | 20.8333 | 138000 | 0.8142 |
0.8414 | 21.1353 | 140000 | 0.7967 |
0.8372 | 21.4372 | 142000 | 0.8091 |
0.8502 | 21.7391 | 144000 | 0.8092 |
0.8311 | 22.0411 | 146000 | 0.8219 |
0.8308 | 22.3430 | 148000 | 0.7843 |
0.8342 | 22.6449 | 150000 | 0.7971 |
0.824 | 22.9469 | 152000 | 0.7847 |
0.8297 | 23.2488 | 154000 | 0.7925 |
0.8237 | 23.5507 | 156000 | 0.8024 |
0.8233 | 23.8527 | 158000 | 0.7917 |
0.8058 | 24.1546 | 160000 | 0.7776 |
0.8153 | 24.4565 | 162000 | 0.7700 |
0.8193 | 24.7585 | 164000 | 0.7678 |
0.8086 | 25.0604 | 166000 | 0.7836 |
0.8098 | 25.3623 | 168000 | 0.7850 |
0.8088 | 25.6643 | 170000 | 0.7515 |
0.804 | 25.9662 | 172000 | 0.7814 |
0.7978 | 26.2681 | 174000 | 0.7883 |
0.7997 | 26.5700 | 176000 | 0.7577 |
0.7996 | 26.8720 | 178000 | 0.7628 |
0.7958 | 27.1739 | 180000 | 0.7642 |
0.7928 | 27.4758 | 182000 | 0.7836 |
0.7889 | 27.7778 | 184000 | 0.7373 |
0.7833 | 28.0797 | 186000 | 0.7536 |
0.7823 | 28.3816 | 188000 | 0.7645 |
0.7822 | 28.6836 | 190000 | 0.7438 |
0.7841 | 28.9855 | 192000 | 0.7497 |
0.7768 | 29.2874 | 194000 | 0.7515 |
0.779 | 29.5894 | 196000 | 0.7566 |
0.7805 | 29.8913 | 198000 | 0.7699 |
0.7634 | 30.1932 | 200000 | 0.7340 |
0.773 | 30.4952 | 202000 | 0.7349 |
0.7667 | 30.7971 | 204000 | 0.7544 |
0.7644 | 31.0990 | 206000 | 0.7570 |
0.7661 | 31.4010 | 208000 | 0.7383 |
0.7625 | 31.7029 | 210000 | 0.7371 |
0.7591 | 32.0048 | 212000 | 0.7335 |
0.767 | 32.3068 | 214000 | 0.7306 |
0.768 | 32.6087 | 216000 | 0.7269 |
0.7587 | 32.9106 | 218000 | 0.7168 |
0.7517 | 33.2126 | 220000 | 0.7432 |
0.7508 | 33.5145 | 222000 | 0.7355 |
0.7534 | 33.8164 | 224000 | 0.7385 |
0.7453 | 34.1184 | 226000 | 0.7339 |
0.746 | 34.4203 | 228000 | 0.6993 |
0.7564 | 34.7222 | 230000 | 0.7269 |
0.7423 | 35.0242 | 232000 | 0.7326 |
0.7424 | 35.3261 | 234000 | 0.7287 |
0.7434 | 35.6280 | 236000 | 0.7118 |
0.7392 | 35.9300 | 238000 | 0.7102 |
0.7357 | 36.2319 | 240000 | 0.7108 |
0.7381 | 36.5338 | 242000 | 0.7382 |
0.7363 | 36.8357 | 244000 | 0.7135 |
0.7322 | 37.1377 | 246000 | 0.7133 |
0.732 | 37.4396 | 248000 | 0.7209 |
0.7293 | 37.7415 | 250000 | 0.7019 |
0.7257 | 38.0435 | 252000 | 0.7270 |
0.7271 | 38.3454 | 254000 | 0.6991 |
0.7245 | 38.6473 | 256000 | 0.7175 |
0.7278 | 38.9493 | 258000 | 0.7165 |
0.7162 | 39.2512 | 260000 | 0.7210 |
0.7193 | 39.5531 | 262000 | 0.7217 |
0.718 | 39.8551 | 264000 | 0.7101 |
0.7155 | 40.1570 | 266000 | 0.6953 |
0.718 | 40.4589 | 268000 | 0.7105 |
0.7151 | 40.7609 | 270000 | 0.6951 |
0.7126 | 41.0628 | 272000 | 0.7000 |
0.7126 | 41.3647 | 274000 | 0.6992 |
0.7256 | 41.6667 | 276000 | 0.6926 |
0.7167 | 41.9686 | 278000 | 0.6942 |
0.7126 | 42.2705 | 280000 | 0.6999 |
0.706 | 42.5725 | 282000 | 0.7003 |
0.7102 | 42.8744 | 284000 | 0.6942 |
0.6976 | 43.1763 | 286000 | 0.6940 |
0.6947 | 43.4783 | 288000 | 0.6691 |
0.6988 | 43.7802 | 290000 | 0.6939 |
0.7021 | 44.0821 | 292000 | 0.7011 |
0.6957 | 44.3841 | 294000 | 0.6926 |
0.7028 | 44.6860 | 296000 | 0.6983 |
0.7001 | 44.9879 | 298000 | 0.6988 |
0.6874 | 45.2899 | 300000 | 0.6777 |
0.7012 | 45.5918 | 302000 | 0.6692 |
0.6942 | 45.8937 | 304000 | 0.6814 |
0.6882 | 46.1957 | 306000 | 0.6787 |
0.6845 | 46.4976 | 308000 | 0.7082 |
0.6845 | 46.7995 | 310000 | 0.6953 |
0.6896 | 47.1014 | 312000 | 0.6762 |
0.6897 | 47.4034 | 314000 | 0.6792 |
0.6864 | 47.7053 | 316000 | 0.6813 |
0.6769 | 48.0072 | 318000 | 0.6913 |
0.6847 | 48.3092 | 320000 | 0.6646 |
0.6854 | 48.6111 | 322000 | 0.6798 |
0.6795 | 48.9130 | 324000 | 0.6799 |
0.6763 | 49.2150 | 326000 | 0.6840 |
0.6732 | 49.5169 | 328000 | 0.6873 |
0.6703 | 49.8188 | 330000 | 0.6834 |
0.6693 | 50.1208 | 332000 | 0.6622 |
0.6787 | 50.4227 | 334000 | 0.6655 |
0.6742 | 50.7246 | 336000 | 0.6799 |
0.6723 | 51.0266 | 338000 | 0.6652 |
0.6712 | 51.3285 | 340000 | 0.6633 |
0.6764 | 51.6304 | 342000 | 0.6853 |
0.67 | 51.9324 | 344000 | 0.6687 |
0.6616 | 52.2343 | 346000 | 0.6556 |
0.6653 | 52.5362 | 348000 | 0.6619 |
0.6616 | 52.8382 | 350000 | 0.6809 |
0.6659 | 53.1401 | 352000 | 0.6610 |
0.6594 | 53.4420 | 354000 | 0.6635 |
0.665 | 53.7440 | 356000 | 0.6652 |
0.6593 | 54.0459 | 358000 | 0.6542 |
0.6519 | 54.3478 | 360000 | 0.6522 |
0.6561 | 54.6498 | 362000 | 0.6641 |
0.6558 | 54.9517 | 364000 | 0.6645 |
0.6579 | 55.2536 | 366000 | 0.6593 |
0.6544 | 55.5556 | 368000 | 0.6643 |
0.6557 | 55.8575 | 370000 | 0.6646 |
0.6626 | 56.1594 | 372000 | 0.6601 |
0.6534 | 56.4614 | 374000 | 0.6461 |
0.6579 | 56.7633 | 376000 | 0.6671 |
0.6536 | 57.0652 | 378000 | 0.6582 |
0.658 | 57.3671 | 380000 | 0.6430 |
0.6522 | 57.6691 | 382000 | 0.6492 |
0.6486 | 57.9710 | 384000 | 0.6628 |
0.6532 | 58.2729 | 386000 | 0.6382 |
0.6472 | 58.5749 | 388000 | 0.6563 |
0.6477 | 58.8768 | 390000 | 0.6560 |
0.6429 | 59.1787 | 392000 | 0.6653 |
0.6507 | 59.4807 | 394000 | 0.6525 |
0.6438 | 59.7826 | 396000 | 0.6609 |
0.6424 | 60.0845 | 398000 | 0.6566 |
0.6422 | 60.3865 | 400000 | 0.6469 |
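The Epoch and Step columns together imply the number of optimizer steps per epoch, which gives a rough sense of the training set size. The sketch below derives it from the first and last rows of the table. The estimate of examples per epoch additionally assumes a single device and no gradient accumulation, neither of which is stated in this card.

```python
# (epoch, step) pairs copied from the first and last rows of the results table.
first_epoch, first_step = 0.3019, 2_000
last_epoch, last_step = 60.3865, 400_000

# Per-device batch size from the hyperparameter list.
train_batch_size = 256

# Steps per epoch, taken as the slope between the two table rows.
steps_per_epoch = (last_step - first_step) / (last_epoch - first_epoch)

# Assumption: one device, no gradient accumulation, so each step sees
# exactly train_batch_size examples.
examples_per_epoch = steps_per_epoch * train_batch_size

print(f"about {steps_per_epoch:.0f} steps per epoch")
print(f"about {examples_per_epoch:,.0f} training examples per epoch")
```

This works out to about 6624 steps per epoch, or on the order of 1.7 million training examples per epoch if the single-device assumption holds.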
Framework versions
- Transformers 4.45.1
- Pytorch 2.4.1+cu121
- Datasets 3.0.1
- Tokenizers 0.20.0