---
license: apache-2.0
tags:
- generated_from_trainer
datasets:
- wikitext
metrics:
- accuracy
model-index:
- name: mobilebert_sa_pre-training-complete
  results:
  - task:
      name: Masked Language Modeling
      type: fill-mask
    dataset:
      name: wikitext wikitext-103-raw-v1
      type: wikitext
      args: wikitext-103-raw-v1
    metrics:
    - name: Accuracy
      type: accuracy
      value: 0.6427141769796747
---

# mobilebert_sa_pre-training-complete

This model is a fine-tuned version of [google/mobilebert-uncased](https://huggingface.co/google/mobilebert-uncased) on the wikitext wikitext-103-raw-v1 dataset. It achieves the following results on the evaluation set:

- Loss: nan
- Accuracy: 0.6427
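The card's metadata declares a `fill-mask` task, so the checkpoint can be tried with the Transformers pipeline. A minimal sketch, assuming the repo id `gokuls/mobilebert_sa_pre-training-complete` on the Hub (inferred from the card name and uploader; adjust if the actual path differs):

```python
# Minimal fill-mask sketch. The Hub repo id below is an assumption
# inferred from the card name and uploader.
MODEL_ID = "gokuls/mobilebert_sa_pre-training-complete"

if __name__ == "__main__":
    from transformers import pipeline  # heavyweight import kept inside the guard

    unmasker = pipeline("fill-mask", model=MODEL_ID)
    for prediction in unmasker("Paris is the capital of [MASK]."):
        print(prediction["token_str"], round(prediction["score"], 4))
```

Given the `nan` validation losses reported below, predictions should be sanity-checked before relying on this checkpoint.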

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

- learning_rate: 5e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 10
- distributed_type: multi-GPU
- num_devices: 2
- total_train_batch_size: 128
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 100
- training_steps: 300000
- mixed_precision_training: Native AMP
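The total batch sizes listed above follow from the per-device sizes and the two GPUs used for distributed training. A small sketch of that arithmetic (the dict layout is illustrative, not the actual training script):

```python
# The hyperparameters above, expressed as a plain dict for reference
# (illustrative only; not the actual training configuration object).
hparams = {
    "learning_rate": 5e-05,
    "train_batch_size": 64,   # per device
    "eval_batch_size": 64,    # per device
    "seed": 10,
    "num_devices": 2,
    "lr_scheduler_warmup_steps": 100,
    "training_steps": 300_000,
}

# With multi-GPU data parallelism, the effective batch size is the
# per-device batch size multiplied by the number of devices.
total_train_batch_size = hparams["train_batch_size"] * hparams["num_devices"]
print(total_train_batch_size)  # 128, matching total_train_batch_size above
```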

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 0.0 | 1.0 | 1787 | nan | 0.6390 |
| 0.0 | 2.0 | 3574 | nan | 0.6426 |
| 0.0 | 3.0 | 5361 | nan | 0.6415 |
| 0.0 | 4.0 | 7148 | nan | 0.6340 |
| 0.0 | 5.0 | 8935 | nan | 0.6360 |
| 0.0 | 6.0 | 10722 | nan | 0.6341 |
| 0.0 | 7.0 | 12509 | nan | 0.6378 |
| 0.0 | 8.0 | 14296 | nan | 0.6335 |
| 0.0 | 9.0 | 16083 | nan | 0.6363 |
| 0.0 | 10.0 | 17870 | nan | 0.6383 |
| 0.0 | 11.0 | 19657 | nan | 0.6379 |
| 0.0 | 12.0 | 21444 | nan | 0.6346 |
| 0.0006 | 13.0 | 23231 | nan | 0.6409 |
| 0.0 | 14.0 | 25018 | nan | 0.6406 |
| 0.0 | 15.0 | 26805 | nan | 0.6323 |
| 0.0 | 16.0 | 28592 | nan | 0.6402 |
| 0.0 | 17.0 | 30379 | nan | 0.6400 |
| 0.0 | 18.0 | 32166 | nan | 0.6328 |
| 0.0 | 19.0 | 33953 | nan | 0.6352 |
| 0.0 | 20.0 | 35740 | nan | 0.6380 |
| 0.0 | 21.0 | 37527 | nan | 0.6463 |
| 0.0 | 22.0 | 39314 | nan | 0.6313 |
| 0.0 | 23.0 | 41101 | nan | 0.6386 |
| 0.0 | 24.0 | 42888 | nan | 0.6413 |
| 0.0 | 25.0 | 44675 | nan | 0.6323 |
| 0.0008 | 26.0 | 46462 | nan | 0.6359 |
| 0.0 | 27.0 | 48249 | nan | 0.6397 |
| 0.0 | 28.0 | 50036 | nan | 0.6377 |
| 0.0 | 29.0 | 51823 | nan | 0.6383 |
| 0.0 | 30.0 | 53610 | nan | 0.6374 |
| 0.0 | 31.0 | 55397 | nan | 0.6476 |
| 0.0 | 32.0 | 57184 | nan | 0.6305 |
| 0.0011 | 33.0 | 58971 | nan | 0.6451 |
| 0.0 | 34.0 | 60758 | nan | 0.6372 |
| 0.0 | 35.0 | 62545 | nan | 0.6368 |
| 0.0006 | 36.0 | 64332 | nan | 0.6385 |
| 0.0 | 37.0 | 66119 | nan | 0.6349 |
| 0.0 | 38.0 | 67906 | nan | 0.6334 |
| 0.0 | 39.0 | 69693 | nan | 0.6391 |
| 0.0 | 40.0 | 71480 | nan | 0.6345 |
| 0.0 | 41.0 | 73267 | nan | 0.6423 |
| 0.0 | 42.0 | 75054 | nan | 0.6375 |
| 0.0 | 43.0 | 76841 | nan | 0.6292 |
| 0.0 | 44.0 | 78628 | nan | 0.6337 |
| 0.0 | 45.0 | 80415 | nan | 0.6451 |
| 0.0 | 46.0 | 82202 | nan | 0.6376 |
| 0.0 | 47.0 | 83989 | nan | 0.6355 |
| 0.0 | 48.0 | 85776 | nan | 0.6411 |
| 0.0 | 49.0 | 87563 | nan | 0.6358 |
| 0.0 | 50.0 | 89350 | nan | 0.6428 |
| 0.0 | 51.0 | 91137 | nan | 0.6421 |
| 0.004 | 52.0 | 92924 | nan | 0.6352 |
| 0.0 | 53.0 | 94711 | nan | 0.6411 |
| 0.0 | 54.0 | 96498 | nan | 0.6377 |
| 0.0 | 55.0 | 98285 | nan | 0.6375 |
| 0.0 | 56.0 | 100072 | nan | 0.6368 |
| 0.0 | 57.0 | 101859 | nan | 0.6365 |
| 0.0 | 58.0 | 103646 | nan | 0.6413 |
| 0.0 | 59.0 | 105433 | nan | 0.6347 |
| 0.0 | 60.0 | 107220 | nan | 0.6407 |
| 0.0 | 61.0 | 109007 | nan | 0.6395 |
| 0.0 | 62.0 | 110794 | nan | 0.6373 |
| 0.0 | 63.0 | 112581 | nan | 0.6356 |
| 0.0 | 64.0 | 114368 | nan | 0.6367 |
| 0.0 | 65.0 | 116155 | nan | 0.6441 |
| 0.0017 | 66.0 | 117942 | nan | 0.6380 |
| 0.0 | 67.0 | 119729 | nan | 0.6348 |
| 0.0 | 68.0 | 121516 | nan | 0.6356 |
| 0.0 | 69.0 | 123303 | nan | 0.6391 |
| 0.0006 | 70.0 | 125090 | nan | 0.6362 |
| 0.0 | 71.0 | 126877 | nan | 0.6388 |
| 0.0 | 72.0 | 128664 | nan | 0.6354 |
| 0.0 | 73.0 | 130451 | nan | 0.6362 |
| 0.0013 | 74.0 | 132238 | nan | 0.6347 |
| 0.0 | 75.0 | 134025 | nan | 0.6327 |
| 0.0 | 76.0 | 135812 | nan | 0.6382 |
| 0.0 | 77.0 | 137599 | nan | 0.6411 |
| 0.0 | 78.0 | 139386 | nan | 0.6404 |
| 0.0 | 79.0 | 141173 | nan | 0.6392 |
| 0.0 | 80.0 | 142960 | nan | 0.6404 |
| 0.0 | 81.0 | 144747 | nan | 0.6421 |
| 0.0 | 82.0 | 146534 | nan | 0.6364 |
| 0.0 | 83.0 | 148321 | nan | 0.6364 |
| 0.0 | 84.0 | 150108 | nan | 0.6370 |
| 0.0 | 85.0 | 151895 | nan | 0.6357 |
| 0.0 | 86.0 | 153682 | nan | 0.6353 |
| 0.0 | 87.0 | 155469 | nan | 0.6393 |
| 0.0 | 88.0 | 157256 | nan | 0.6397 |
| 0.0006 | 89.0 | 159043 | nan | 0.6396 |
| 0.0013 | 90.0 | 160830 | nan | 0.6378 |
| 0.0 | 91.0 | 162617 | nan | 0.6386 |
| 0.0 | 92.0 | 164404 | nan | 0.6415 |
| 0.0 | 93.0 | 166191 | nan | 0.6342 |
| 0.0 | 94.0 | 167978 | nan | 0.6356 |
| 0.0 | 95.0 | 169765 | nan | 0.6410 |
| 0.0 | 96.0 | 171552 | nan | 0.6366 |
| 0.0 | 97.0 | 173339 | nan | 0.6329 |
| 0.0013 | 98.0 | 175126 | nan | 0.6352 |
| 0.0 | 99.0 | 176913 | nan | 0.6340 |
| 0.0 | 100.0 | 178700 | nan | 0.6358 |
| 0.0 | 101.0 | 180487 | nan | 0.6367 |
| 0.0006 | 102.0 | 182274 | nan | 0.6368 |
| 0.0 | 103.0 | 184061 | nan | 0.6353 |
| 0.0 | 104.0 | 185848 | nan | 0.6370 |
| 0.0 | 105.0 | 187635 | nan | 0.6333 |
| 0.0 | 106.0 | 189422 | nan | 0.6316 |
| 0.0006 | 107.0 | 191209 | nan | 0.6394 |
| 0.0 | 108.0 | 192996 | nan | 0.6323 |
| 0.0 | 109.0 | 194783 | nan | 0.6406 |
| 0.0012 | 110.0 | 196570 | nan | 0.6331 |
| 0.0 | 111.0 | 198357 | nan | 0.6398 |
| 0.0 | 112.0 | 200144 | nan | 0.6402 |
| 0.0 | 113.0 | 201931 | nan | 0.6345 |
| 0.0 | 114.0 | 203718 | nan | 0.6416 |
| 0.0 | 115.0 | 205505 | nan | 0.6352 |
| 0.0 | 116.0 | 207292 | nan | 0.6357 |
| 0.0032 | 117.0 | 209079 | nan | 0.6358 |
| 0.0013 | 118.0 | 210866 | nan | 0.6406 |
| 0.0 | 119.0 | 212653 | nan | 0.6354 |
| 0.0 | 120.0 | 214440 | nan | 0.6345 |
| 0.0 | 121.0 | 216227 | nan | 0.6433 |
| 0.0 | 122.0 | 218014 | nan | 0.6326 |
| 0.0 | 123.0 | 219801 | nan | 0.6358 |
| 0.0 | 124.0 | 221588 | nan | 0.6409 |
| 0.0 | 125.0 | 223375 | nan | 0.6405 |
| 0.0 | 126.0 | 225162 | nan | 0.6376 |
| 0.0 | 127.0 | 226949 | nan | 0.6396 |
| 0.0 | 128.0 | 228736 | nan | 0.6356 |
| 0.0 | 129.0 | 230523 | nan | 0.6432 |
| 0.0 | 130.0 | 232310 | nan | 0.6385 |
| 0.0 | 131.0 | 234097 | nan | 0.6337 |
| 0.0 | 132.0 | 235884 | nan | 0.6390 |
| 0.0 | 133.0 | 237671 | nan | 0.6362 |
| 0.0 | 134.0 | 239458 | nan | 0.6332 |
| 0.0 | 135.0 | 241245 | nan | 0.6367 |
| 0.0016 | 136.0 | 243032 | nan | 0.6334 |
| 0.0 | 137.0 | 244819 | nan | 0.6412 |
| 0.0 | 138.0 | 246606 | nan | 0.6367 |
| 0.0 | 139.0 | 248393 | nan | 0.6378 |
| 0.0 | 140.0 | 250180 | nan | 0.6390 |
| 0.0 | 141.0 | 251967 | nan | 0.6376 |
| 0.0 | 142.0 | 253754 | nan | 0.6363 |
| 0.0033 | 143.0 | 255541 | nan | 0.6425 |
| 0.0 | 144.0 | 257328 | nan | 0.6360 |
| 0.0 | 145.0 | 259115 | nan | 0.6377 |
| 0.0 | 146.0 | 260902 | nan | 0.6302 |
| 0.0 | 147.0 | 262689 | nan | 0.6320 |
| 0.0 | 148.0 | 264476 | nan | 0.6358 |
| 0.0 | 149.0 | 266263 | nan | 0.6381 |
| 0.0 | 150.0 | 268050 | nan | 0.6414 |
| 0.0 | 151.0 | 269837 | nan | 0.6401 |
| 0.0012 | 152.0 | 271624 | nan | 0.6415 |
| 0.0 | 153.0 | 273411 | nan | 0.6425 |
| 0.0 | 154.0 | 275198 | nan | 0.6367 |
| 0.0 | 155.0 | 276985 | nan | 0.6356 |
| 0.0 | 156.0 | 278772 | nan | 0.6411 |
| 0.0 | 157.0 | 280559 | nan | 0.6343 |
| 0.0007 | 158.0 | 282346 | nan | 0.6369 |
| 0.0 | 159.0 | 284133 | nan | 0.6361 |
| 0.0013 | 160.0 | 285920 | nan | 0.6396 |
| 0.0008 | 161.0 | 287707 | nan | 0.6381 |
| 0.0 | 162.0 | 289494 | nan | 0.6352 |
| 0.0 | 163.0 | 291281 | nan | 0.6370 |
| 0.0 | 164.0 | 293068 | nan | 0.6399 |
| 0.0031 | 165.0 | 294855 | nan | 0.6401 |
| 0.0 | 166.0 | 296642 | nan | 0.6358 |
| 0.0 | 167.0 | 298429 | nan | 0.6390 |
| 0.0 | 167.88 | 300000 | nan | 0.6354 |

### Framework versions

- Transformers 4.25.1
- Pytorch 1.14.0a0+410ce96
- Datasets 2.8.0
- Tokenizers 0.13.2