BERT_WordPiece_phonetic_wikitext

This model is a fine-tuned version of an unspecified base model on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.6469
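
If the reported loss is the mean masked-language-modeling cross-entropy (the Trainer default for BERT-style models, assumed here), it corresponds to a pseudo-perplexity of exp(0.6469) ≈ 1.91 on the evaluation set.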

Model description

More information needed

Intended uses & limitations

More information needed
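
Pending details from the authors, below is a minimal fill-mask sketch. The repository id, the masked-LM objective, and the standard [MASK] token are assumptions; inputs presumably need to match the phonetic WordPiece transcription the tokenizer was trained on.

```python
from transformers import pipeline

# Hypothetical repo id; replace with the actual Hub path of this model.
fill_mask = pipeline("fill-mask", model="<user>/BERT_WordPiece_phonetic_wikitext")

# Assumes the standard BERT [MASK] token; the input should use the same
# (presumably phonetic) transcription as the training data.
for pred in fill_mask("the capital of france is [MASK] ."):
    print(pred["token_str"], round(pred["score"], 4))
```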

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-06
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 10000
  • training_steps: 400000
  • mixed_precision_training: Native AMP
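
As a sketch only, the list above maps onto transformers.TrainingArguments roughly as follows; the argument names are the standard Trainer ones inferred from the listed values, not the authors' actual training script, and output_dir is a placeholder.

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="bert-wordpiece-phonetic-wikitext",  # placeholder
    learning_rate=1e-4,
    per_device_train_batch_size=256,
    per_device_eval_batch_size=256,
    seed=42,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-6,
    lr_scheduler_type="linear",
    warmup_steps=10_000,
    max_steps=400_000,        # step-based training; overrides num_train_epochs
    fp16=True,                # "Native AMP" mixed precision
    eval_strategy="steps",
    eval_steps=2_000,         # matches the 2000-step eval cadence in the results table
)
```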

Training results

Training Loss | Epoch   | Step   | Validation Loss
------------- | ------- | ------ | ---------------
4.8449 | 0.3019 | 2000 | 4.7637
4.6361 | 0.6039 | 4000 | 4.5490
2.4728 | 0.9058 | 6000 | 2.3238
2.0267 | 1.2077 | 8000 | 1.8748
1.815 | 1.5097 | 10000 | 1.6849
1.6524 | 1.8116 | 12000 | 1.5375
1.5396 | 2.1135 | 14000 | 1.4241
1.4813 | 2.4155 | 16000 | 1.3695
1.4192 | 2.7174 | 18000 | 1.3086
1.3677 | 3.0193 | 20000 | 1.2793
1.326 | 3.3213 | 22000 | 1.2260
1.2983 | 3.6232 | 24000 | 1.2209
1.2697 | 3.9251 | 26000 | 1.1787
1.244 | 4.2271 | 28000 | 1.1685
1.2127 | 4.5290 | 30000 | 1.1367
1.1989 | 4.8309 | 32000 | 1.1507
1.1649 | 5.1329 | 34000 | 1.1021
1.1671 | 5.4348 | 36000 | 1.0730
1.1501 | 5.7367 | 38000 | 1.0527
1.1351 | 6.0386 | 40000 | 1.0594
1.1198 | 6.3406 | 42000 | 1.0574
1.1023 | 6.6425 | 44000 | 1.0181
1.0947 | 6.9444 | 46000 | 1.0454
1.0677 | 7.2464 | 48000 | 1.0026
1.0714 | 7.5483 | 50000 | 1.0084
1.0671 | 7.8502 | 52000 | 0.9860
1.0606 | 8.1522 | 54000 | 0.9996
1.0427 | 8.4541 | 56000 | 0.9781
1.0336 | 8.7560 | 58000 | 0.9616
1.0123 | 9.0580 | 60000 | 0.9639
1.019 | 9.3599 | 62000 | 0.9565
1.0089 | 9.6618 | 64000 | 0.9411
1.0068 | 9.9638 | 66000 | 0.9316
0.9888 | 10.2657 | 68000 | 0.9545
0.9934 | 10.5676 | 70000 | 0.9429
0.9796 | 10.8696 | 72000 | 0.9231
0.9718 | 11.1715 | 74000 | 0.9135
0.9711 | 11.4734 | 76000 | 0.9146
0.963 | 11.7754 | 78000 | 0.9224
0.9539 | 12.0773 | 80000 | 0.8856
0.9582 | 12.3792 | 82000 | 0.8898
0.9438 | 12.6812 | 84000 | 0.9079
0.9511 | 12.9831 | 86000 | 0.8911
0.9376 | 13.2850 | 88000 | 0.8880
0.9413 | 13.5870 | 90000 | 0.8965
0.9287 | 13.8889 | 92000 | 0.8797
0.9201 | 14.1908 | 94000 | 0.8681
0.9238 | 14.4928 | 96000 | 0.8691
0.9218 | 14.7947 | 98000 | 0.8558
0.9114 | 15.0966 | 100000 | 0.8628
0.91 | 15.3986 | 102000 | 0.8666
0.9095 | 15.7005 | 104000 | 0.8554
0.9059 | 16.0024 | 106000 | 0.8580
0.8975 | 16.3043 | 108000 | 0.8411
0.8896 | 16.6063 | 110000 | 0.8460
0.8914 | 16.9082 | 112000 | 0.8535
0.8885 | 17.2101 | 114000 | 0.8272
0.888 | 17.5121 | 116000 | 0.8320
0.8809 | 17.8140 | 118000 | 0.8241
0.8701 | 18.1159 | 120000 | 0.8547
0.8745 | 18.4179 | 122000 | 0.8292
0.8673 | 18.7198 | 124000 | 0.8275
0.8618 | 19.0217 | 126000 | 0.8190
0.8592 | 19.3237 | 128000 | 0.8211
0.8626 | 19.6256 | 130000 | 0.8161
0.8597 | 19.9275 | 132000 | 0.8414
0.8528 | 20.2295 | 134000 | 0.7901
0.8556 | 20.5314 | 136000 | 0.7929
0.851 | 20.8333 | 138000 | 0.8142
0.8414 | 21.1353 | 140000 | 0.7967
0.8372 | 21.4372 | 142000 | 0.8091
0.8502 | 21.7391 | 144000 | 0.8092
0.8311 | 22.0411 | 146000 | 0.8219
0.8308 | 22.3430 | 148000 | 0.7843
0.8342 | 22.6449 | 150000 | 0.7971
0.824 | 22.9469 | 152000 | 0.7847
0.8297 | 23.2488 | 154000 | 0.7925
0.8237 | 23.5507 | 156000 | 0.8024
0.8233 | 23.8527 | 158000 | 0.7917
0.8058 | 24.1546 | 160000 | 0.7776
0.8153 | 24.4565 | 162000 | 0.7700
0.8193 | 24.7585 | 164000 | 0.7678
0.8086 | 25.0604 | 166000 | 0.7836
0.8098 | 25.3623 | 168000 | 0.7850
0.8088 | 25.6643 | 170000 | 0.7515
0.804 | 25.9662 | 172000 | 0.7814
0.7978 | 26.2681 | 174000 | 0.7883
0.7997 | 26.5700 | 176000 | 0.7577
0.7996 | 26.8720 | 178000 | 0.7628
0.7958 | 27.1739 | 180000 | 0.7642
0.7928 | 27.4758 | 182000 | 0.7836
0.7889 | 27.7778 | 184000 | 0.7373
0.7833 | 28.0797 | 186000 | 0.7536
0.7823 | 28.3816 | 188000 | 0.7645
0.7822 | 28.6836 | 190000 | 0.7438
0.7841 | 28.9855 | 192000 | 0.7497
0.7768 | 29.2874 | 194000 | 0.7515
0.779 | 29.5894 | 196000 | 0.7566
0.7805 | 29.8913 | 198000 | 0.7699
0.7634 | 30.1932 | 200000 | 0.7340
0.773 | 30.4952 | 202000 | 0.7349
0.7667 | 30.7971 | 204000 | 0.7544
0.7644 | 31.0990 | 206000 | 0.7570
0.7661 | 31.4010 | 208000 | 0.7383
0.7625 | 31.7029 | 210000 | 0.7371
0.7591 | 32.0048 | 212000 | 0.7335
0.767 | 32.3068 | 214000 | 0.7306
0.768 | 32.6087 | 216000 | 0.7269
0.7587 | 32.9106 | 218000 | 0.7168
0.7517 | 33.2126 | 220000 | 0.7432
0.7508 | 33.5145 | 222000 | 0.7355
0.7534 | 33.8164 | 224000 | 0.7385
0.7453 | 34.1184 | 226000 | 0.7339
0.746 | 34.4203 | 228000 | 0.6993
0.7564 | 34.7222 | 230000 | 0.7269
0.7423 | 35.0242 | 232000 | 0.7326
0.7424 | 35.3261 | 234000 | 0.7287
0.7434 | 35.6280 | 236000 | 0.7118
0.7392 | 35.9300 | 238000 | 0.7102
0.7357 | 36.2319 | 240000 | 0.7108
0.7381 | 36.5338 | 242000 | 0.7382
0.7363 | 36.8357 | 244000 | 0.7135
0.7322 | 37.1377 | 246000 | 0.7133
0.732 | 37.4396 | 248000 | 0.7209
0.7293 | 37.7415 | 250000 | 0.7019
0.7257 | 38.0435 | 252000 | 0.7270
0.7271 | 38.3454 | 254000 | 0.6991
0.7245 | 38.6473 | 256000 | 0.7175
0.7278 | 38.9493 | 258000 | 0.7165
0.7162 | 39.2512 | 260000 | 0.7210
0.7193 | 39.5531 | 262000 | 0.7217
0.718 | 39.8551 | 264000 | 0.7101
0.7155 | 40.1570 | 266000 | 0.6953
0.718 | 40.4589 | 268000 | 0.7105
0.7151 | 40.7609 | 270000 | 0.6951
0.7126 | 41.0628 | 272000 | 0.7000
0.7126 | 41.3647 | 274000 | 0.6992
0.7256 | 41.6667 | 276000 | 0.6926
0.7167 | 41.9686 | 278000 | 0.6942
0.7126 | 42.2705 | 280000 | 0.6999
0.706 | 42.5725 | 282000 | 0.7003
0.7102 | 42.8744 | 284000 | 0.6942
0.6976 | 43.1763 | 286000 | 0.6940
0.6947 | 43.4783 | 288000 | 0.6691
0.6988 | 43.7802 | 290000 | 0.6939
0.7021 | 44.0821 | 292000 | 0.7011
0.6957 | 44.3841 | 294000 | 0.6926
0.7028 | 44.6860 | 296000 | 0.6983
0.7001 | 44.9879 | 298000 | 0.6988
0.6874 | 45.2899 | 300000 | 0.6777
0.7012 | 45.5918 | 302000 | 0.6692
0.6942 | 45.8937 | 304000 | 0.6814
0.6882 | 46.1957 | 306000 | 0.6787
0.6845 | 46.4976 | 308000 | 0.7082
0.6845 | 46.7995 | 310000 | 0.6953
0.6896 | 47.1014 | 312000 | 0.6762
0.6897 | 47.4034 | 314000 | 0.6792
0.6864 | 47.7053 | 316000 | 0.6813
0.6769 | 48.0072 | 318000 | 0.6913
0.6847 | 48.3092 | 320000 | 0.6646
0.6854 | 48.6111 | 322000 | 0.6798
0.6795 | 48.9130 | 324000 | 0.6799
0.6763 | 49.2150 | 326000 | 0.6840
0.6732 | 49.5169 | 328000 | 0.6873
0.6703 | 49.8188 | 330000 | 0.6834
0.6693 | 50.1208 | 332000 | 0.6622
0.6787 | 50.4227 | 334000 | 0.6655
0.6742 | 50.7246 | 336000 | 0.6799
0.6723 | 51.0266 | 338000 | 0.6652
0.6712 | 51.3285 | 340000 | 0.6633
0.6764 | 51.6304 | 342000 | 0.6853
0.67 | 51.9324 | 344000 | 0.6687
0.6616 | 52.2343 | 346000 | 0.6556
0.6653 | 52.5362 | 348000 | 0.6619
0.6616 | 52.8382 | 350000 | 0.6809
0.6659 | 53.1401 | 352000 | 0.6610
0.6594 | 53.4420 | 354000 | 0.6635
0.665 | 53.7440 | 356000 | 0.6652
0.6593 | 54.0459 | 358000 | 0.6542
0.6519 | 54.3478 | 360000 | 0.6522
0.6561 | 54.6498 | 362000 | 0.6641
0.6558 | 54.9517 | 364000 | 0.6645
0.6579 | 55.2536 | 366000 | 0.6593
0.6544 | 55.5556 | 368000 | 0.6643
0.6557 | 55.8575 | 370000 | 0.6646
0.6626 | 56.1594 | 372000 | 0.6601
0.6534 | 56.4614 | 374000 | 0.6461
0.6579 | 56.7633 | 376000 | 0.6671
0.6536 | 57.0652 | 378000 | 0.6582
0.658 | 57.3671 | 380000 | 0.6430
0.6522 | 57.6691 | 382000 | 0.6492
0.6486 | 57.9710 | 384000 | 0.6628
0.6532 | 58.2729 | 386000 | 0.6382
0.6472 | 58.5749 | 388000 | 0.6563
0.6477 | 58.8768 | 390000 | 0.6560
0.6429 | 59.1787 | 392000 | 0.6653
0.6507 | 59.4807 | 394000 | 0.6525
0.6438 | 59.7826 | 396000 | 0.6609
0.6424 | 60.0845 | 398000 | 0.6566
0.6422 | 60.3865 | 400000 | 0.6469
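
Validation loss falls steeply over the first ~20,000 steps and then improves gradually, from 4.7637 at step 2,000 to 0.6469 at step 400,000 (roughly 60 epochs), with no clear sign of overfitting by the end of training.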

Framework versions

  • Transformers 4.45.1
  • Pytorch 2.4.1+cu121
  • Datasets 3.0.1
  • Tokenizers 0.20.0
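
To approximate this environment, pinned installs along the lines of `pip install transformers==4.45.1 datasets==3.0.1 tokenizers==0.20.0` should suffice; the `+cu121` tag on PyTorch indicates a CUDA 12.1 build, which is typically obtained by installing `torch==2.4.1` with `--index-url https://download.pytorch.org/whl/cu121` rather than from the default index.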

Model size

  • 110M parameters (F32 tensors, Safetensors format)