roberta-tiny-10M

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 2.7391
  • Accuracy: 0.5148
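The evaluation loss can be read as a perplexity. This is a minimal sketch that assumes the reported loss is a mean cross-entropy in nats over the predicted tokens. The card does not state this, so treat the result as an interpretation rather than a reported metric. The name `loss_to_perplexity` is illustrative only.

```python
import math

# Value copied from the evaluation results above.
EVAL_LOSS = 2.7391


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


perplexity = loss_to_perplexity(EVAL_LOSS)
print(f"perplexity ~ {perplexity:.2f}")  # roughly 15.47 under the assumption above
```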

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0004
  • train_batch_size: 16
  • eval_batch_size: 32
  • seed: 42
  • gradient_accumulation_steps: 32
  • total_train_batch_size: 512
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 100.0
  • mixed_precision_training: Native AMP
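The arithmetic behind two of these settings can be sketched in plain Python. The first function shows how a total train batch size of 512 follows from a per-device batch of 16 and 32 accumulation steps, assuming a single device (the card does not list a device count). The second is a sketch of linear warmup followed by half-cosine decay, which is what a `cosine` scheduler with warmup typically computes. `TOTAL_STEPS` is a hypothetical value: the card does not report it, and 4800 is only inferred from roughly 48 optimizer steps per epoch in the training log times 100 epochs.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 4e-4
TRAIN_BATCH_SIZE = 16
GRAD_ACCUM_STEPS = 32
WARMUP_STEPS = 50

# Assumptions, not stated in the card.
NUM_DEVICES = 1      # single-device training
TOTAL_STEPS = 4800   # about 48 steps per epoch x 100 epochs (inferred)


def effective_batch_size(per_device: int, accum_steps: int, devices: int) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * accum_steps * devices


def cosine_lr(step: int, base_lr: float, warmup: int, total: int) -> float:
    """Linear warmup to base_lr, then half-cosine decay towards zero."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


total_batch = effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS, NUM_DEVICES)
print("total_train_batch_size:", total_batch)

# Learning rate at the start, at the end of warmup, midway, and at the final step.
for step in (0, WARMUP_STEPS, TOTAL_STEPS // 2, TOTAL_STEPS):
    lr = cosine_lr(step, LEARNING_RATE, WARMUP_STEPS, TOTAL_STEPS)
    print(step, f"{lr:.6f}")
```

With 50 warmup steps the learning rate reaches its peak of 0.0004 after about one epoch, so nearly the whole run is spent on the decaying part of the schedule.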

Training results

Training Loss    Epoch    Step    Validation Loss    Accuracy
7.8031 1.04 50 7.3560 0.0606
7.1948 2.08 100 6.7374 0.1182
6.8927 3.12 150 6.5022 0.1415
6.7339 4.16 200 6.4005 0.1483
6.6609 5.21 250 6.3535 0.1510
6.1972 6.25 300 6.3324 0.1519
6.1685 7.29 350 6.3029 0.1528
6.1302 8.33 400 6.2828 0.1521
6.093 9.37 450 6.2568 0.1536
6.0543 10.41 500 6.2430 0.1544
6.0479 11.45 550 6.2346 0.1541
6.0372 12.49 600 6.2232 0.1546
6.0127 13.53 650 6.2139 0.1541
5.968 14.58 700 6.2053 0.1547
5.9635 15.62 750 6.1996 0.1549
5.9479 16.66 800 6.1953 0.1548
5.9371 17.7 850 6.1887 0.1545
5.9046 18.74 900 6.1613 0.1545
5.8368 19.78 950 6.0952 0.1557
5.7914 20.82 1000 6.0330 0.1569
5.7026 21.86 1050 5.9430 0.1612
5.491 22.9 1100 5.6100 0.1974
4.9289 23.95 1150 4.9607 0.2702
4.5214 24.99 1200 4.5795 0.3051
4.5663 26.04 1250 4.3454 0.3265
4.3717 27.08 1300 4.1738 0.3412
4.1483 28.12 1350 4.0336 0.3555
3.9988 29.16 1400 3.9180 0.3677
3.8695 30.21 1450 3.8108 0.3782
3.5017 31.25 1500 3.7240 0.3879
3.4311 32.29 1550 3.6426 0.3974
3.3517 33.33 1600 3.5615 0.4068
3.2856 34.37 1650 3.4915 0.4156
3.227 35.41 1700 3.4179 0.4255
3.1675 36.45 1750 3.3636 0.4325
3.0908 37.49 1800 3.3083 0.4394
3.0561 38.53 1850 3.2572 0.4473
3.0139 39.58 1900 3.2159 0.4525
2.9837 40.62 1950 3.1789 0.4575
2.9387 41.66 2000 3.1431 0.4618
2.9034 42.7 2050 3.1163 0.4654
2.8822 43.74 2100 3.0842 0.4694
2.836 44.78 2150 3.0583 0.4727
2.8129 45.82 2200 3.0359 0.4760
2.7733 46.86 2250 3.0173 0.4776
2.7589 47.9 2300 2.9978 0.4812
2.7378 48.95 2350 2.9788 0.4831
2.7138 49.99 2400 2.9674 0.4844
2.8692 51.04 2450 2.9476 0.4874
2.8462 52.08 2500 2.9342 0.4893
2.8312 53.12 2550 2.9269 0.4900
2.7834 54.16 2600 2.9111 0.4917
2.7822 55.21 2650 2.8987 0.4934
2.584 56.25 2700 2.8844 0.4949
2.5668 57.29 2750 2.8808 0.4965
2.5536 58.33 2800 2.8640 0.4982
2.5403 59.37 2850 2.8606 0.4982
2.5294 60.41 2900 2.8441 0.5008
2.513 61.45 2950 2.8402 0.5013
2.5105 62.49 3000 2.8316 0.5022
2.4897 63.53 3050 2.8237 0.5027
2.4974 64.58 3100 2.8187 0.5040
2.4799 65.62 3150 2.8129 0.5044
2.4741 66.66 3200 2.8056 0.5057
2.4582 67.7 3250 2.8025 0.5061
2.4389 68.74 3300 2.7913 0.5076
2.4539 69.78 3350 2.7881 0.5072
2.4252 70.82 3400 2.7884 0.5082
2.4287 71.86 3450 2.7784 0.5093
2.4131 72.9 3500 2.7782 0.5099
2.4016 73.95 3550 2.7724 0.5098
2.3998 74.99 3600 2.7659 0.5111
2.5475 76.04 3650 2.7650 0.5108
2.5443 77.08 3700 2.7620 0.5117
2.5381 78.12 3750 2.7631 0.5115
2.5269 79.16 3800 2.7578 0.5122
2.5288 80.21 3850 2.7540 0.5124
2.3669 81.25 3900 2.7529 0.5125
2.3631 82.29 3950 2.7498 0.5132
2.3499 83.33 4000 2.7454 0.5136
2.3726 84.37 4050 2.7446 0.5141
2.3411 85.41 4100 2.7403 0.5144
2.3321 86.45 4150 2.7372 0.5146
2.3456 87.49 4200 2.7389 0.5146
2.3372 88.53 4250 2.7384 0.5151
2.343 89.58 4300 2.7398 0.5144
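The rows above are whitespace-separated, so they are straightforward to parse when looking for the checkpoint with the lowest validation loss. The sketch below uses only the last five rows, copied verbatim from the table. The helper name `parse_rows` and the dictionary keys are illustrative and not part of any library.

```python
# Last five rows of the training log, copied verbatim.
# Columns: training loss, epoch, step, validation loss, accuracy.
LOG_ROWS = """\
2.3411 85.41 4100 2.7403 0.5144
2.3321 86.45 4150 2.7372 0.5146
2.3456 87.49 4200 2.7389 0.5146
2.3372 88.53 4250 2.7384 0.5151
2.343 89.58 4300 2.7398 0.5144
"""


def parse_rows(text: str) -> list[dict]:
    """Parse whitespace-separated log rows into dictionaries."""
    rows = []
    for line in text.strip().splitlines():
        train_loss, epoch, step, val_loss, accuracy = line.split()
        rows.append(
            {
                "train_loss": float(train_loss),
                "epoch": float(epoch),
                "step": int(step),
                "val_loss": float(val_loss),
                "accuracy": float(accuracy),
            }
        )
    return rows


rows = parse_rows(LOG_ROWS)
# Row with the lowest validation loss among the parsed rows.
best = min(rows, key=lambda r: r["val_loss"])
print(best["step"], best["val_loss"], best["accuracy"])
```

Among these five rows the lowest validation loss is at step 4150. Running the same parser over the full table is the way to confirm the minimum for the entire run.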

Framework versions

  • Transformers 4.24.0
  • Pytorch 1.11.0+cu113
  • Datasets 2.6.1
  • Tokenizers 0.12.1