---
tags:
- generated_from_trainer
metrics:
- accuracy
model-index:
- name: roberta-tiny-10M
  results: []
---

roberta-tiny-10M

This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set (a perplexity reading of the loss follows the list):

  • Loss: 6.4832
  • Accuracy: 0.1379
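
Since the card reports only the cross-entropy loss, the implied perplexity is a useful companion number. A minimal sketch, assuming the loss is the mean cross-entropy (in nats) over masked tokens:

```python
import math

# Perplexity implied by the reported eval loss, assuming it is mean
# cross-entropy in nats over the masked positions.
eval_loss = 6.4832
perplexity = math.exp(eval_loss)
print(round(perplexity, 1))  # ~654.1
```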

Model description

More information needed

Intended uses & limitations

More information needed
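
Pending proper documentation, here is a minimal usage sketch for masked-token prediction. The repository id `g8a9/roberta-tiny-10M` is an assumption based on this card's title and author, not something the card states:

```python
# Hypothetical usage sketch (not from the original card); the repo id below
# is assumed from the card's title and author.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="g8a9/roberta-tiny-10M")
print(fill_mask("The capital of France is <mask>."))
```

Given the eval accuracy above (~13.8%), expect noisy predictions from this checkpoint.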

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training (a TrainingArguments sketch reconstructing them follows the list):

  • learning_rate: 2e-05
  • train_batch_size: 32
  • eval_batch_size: 64
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 128
  • optimizer: Adam with betas=(0.9,0.98) and epsilon=1e-08
  • lr_scheduler_type: constant_with_warmup
  • lr_scheduler_warmup_steps: 50
  • num_epochs: 40.0
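
Expressed through the Hugging Face TrainingArguments API, the list above corresponds roughly to the following. This is a reconstruction sketch; only the values themselves come from the card, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Reconstruction sketch of the hyperparameters listed above; output_dir is a
# placeholder, everything else mirrors the card.
training_args = TrainingArguments(
    output_dir="roberta-tiny-10M",     # placeholder
    learning_rate=2e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=64,
    seed=42,
    gradient_accumulation_steps=4,     # 32 * 4 = 128 total train batch size
    adam_beta1=0.9,
    adam_beta2=0.98,
    adam_epsilon=1e-8,
    lr_scheduler_type="constant_with_warmup",
    warmup_steps=50,
    num_train_epochs=40.0,
)
```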

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:--------:|
| 10.5287 | 0.26 | 50 | 10.4478 | 0.0481 |
| 10.0073 | 0.51 | 100 | 9.9550 | 0.0488 |
| 9.6268 | 0.77 | 150 | 9.5865 | 0.0488 |
| 9.9837 | 1.03 | 200 | 9.2502 | 0.0471 |
| 8.9701 | 1.29 | 250 | 8.9370 | 0.0466 |
| 8.6689 | 1.54 | 300 | 8.6447 | 0.0473 |
| 8.3893 | 1.8 | 350 | 8.3794 | 0.0473 |
| 8.1697 | 2.06 | 400 | 8.1342 | 0.0506 |
| 7.926 | 2.32 | 450 | 7.9221 | 0.0617 |
| 7.7329 | 2.58 | 500 | 7.7398 | 0.0627 |
| 7.582 | 2.83 | 550 | 7.5844 | 0.0691 |
| 7.4419 | 3.09 | 600 | 7.4620 | 0.0729 |
| 7.3658 | 3.35 | 650 | 7.3735 | 0.0781 |
| 7.2857 | 3.61 | 700 | 7.3049 | 0.0801 |
| 7.224 | 3.86 | 750 | 7.2554 | 0.0831 |
| 7.1851 | 4.12 | 800 | 7.2082 | 0.0853 |
| 7.1327 | 4.38 | 850 | 7.1678 | 0.0878 |
| 7.0947 | 4.64 | 900 | 7.1326 | 0.0909 |
| 7.0761 | 4.89 | 950 | 7.1069 | 0.0919 |
| 7.0551 | 5.15 | 1000 | 7.0806 | 0.0943 |
| 7.0389 | 5.41 | 1050 | 7.0588 | 0.0952 |
| 7.0226 | 5.67 | 1100 | 7.0379 | 0.0964 |
| 6.9992 | 5.92 | 1150 | 7.0142 | 0.0975 |
| 6.9382 | 6.18 | 1200 | 6.9979 | 0.0986 |
| 6.956 | 6.44 | 1250 | 6.9828 | 0.0987 |
| 6.9425 | 6.7 | 1300 | 6.9619 | 0.1008 |
| 6.8872 | 6.96 | 1350 | 6.9468 | 0.1014 |
| 6.8848 | 7.22 | 1400 | 6.9320 | 0.1024 |
| 6.8578 | 7.47 | 1450 | 6.9190 | 0.1039 |
| 6.8699 | 7.73 | 1500 | 6.9022 | 0.1050 |
| 6.8402 | 7.99 | 1550 | 6.8910 | 0.1057 |
| 6.8172 | 8.25 | 1600 | 6.8730 | 0.1069 |
| 6.823 | 8.5 | 1650 | 6.8662 | 0.1073 |
| 6.8028 | 8.76 | 1700 | 6.8487 | 0.1082 |
| 7.3146 | 9.02 | 1750 | 6.8400 | 0.1083 |
| 6.8014 | 9.28 | 1800 | 6.8303 | 0.1092 |
| 6.8028 | 9.53 | 1850 | 6.8226 | 0.1088 |
| 6.7817 | 9.79 | 1900 | 6.8079 | 0.1107 |
| 7.28 | 10.05 | 1950 | 6.8021 | 0.1115 |
| 6.7624 | 10.31 | 2000 | 6.7930 | 0.1118 |
| 6.7416 | 10.56 | 2050 | 6.7868 | 0.1124 |
| 6.7288 | 10.82 | 2100 | 6.7805 | 0.1133 |
| 6.7468 | 11.08 | 2150 | 6.7720 | 0.1123 |
| 6.7387 | 11.34 | 2200 | 6.7636 | 0.1135 |
| 6.7242 | 11.6 | 2250 | 6.7557 | 0.1134 |
| 6.702 | 11.85 | 2300 | 6.7496 | 0.1141 |
| 6.6662 | 12.11 | 2350 | 6.7433 | 0.1150 |
| 6.6781 | 12.37 | 2400 | 6.7362 | 0.1148 |
| 6.6743 | 12.63 | 2450 | 6.7275 | 0.1161 |
| 6.6843 | 12.88 | 2500 | 6.7247 | 0.1165 |
| 6.6726 | 13.14 | 2550 | 6.7127 | 0.1173 |
| 6.6656 | 13.4 | 2600 | 6.7098 | 0.1170 |
| 6.6428 | 13.66 | 2650 | 6.7019 | 0.1185 |
| 6.6355 | 13.91 | 2700 | 6.6979 | 0.1175 |
| 6.6521 | 14.17 | 2750 | 6.6923 | 0.1188 |
| 6.6735 | 14.43 | 2800 | 6.6842 | 0.1186 |
| 6.6151 | 14.69 | 2850 | 6.6791 | 0.1195 |
| 6.6248 | 14.94 | 2900 | 6.6752 | 0.1198 |
| 6.6427 | 15.21 | 2950 | 6.6665 | 0.1207 |
| 6.5947 | 15.46 | 3000 | 6.6639 | 0.1207 |
| 6.6199 | 15.72 | 3050 | 6.6598 | 0.1217 |
| 6.6127 | 15.98 | 3100 | 6.6593 | 0.1219 |
| 6.6031 | 16.24 | 3150 | 6.6512 | 0.1226 |
| 6.5742 | 16.49 | 3200 | 6.6485 | 0.1227 |
| 6.621 | 16.75 | 3250 | 6.6472 | 0.1221 |
| 7.0655 | 17.01 | 3300 | 6.6369 | 0.1232 |
| 6.5866 | 17.27 | 3350 | 6.6376 | 0.1234 |
| 6.6098 | 17.52 | 3400 | 6.6313 | 0.1252 |
| 6.5676 | 17.78 | 3450 | 6.6254 | 0.1248 |
| 7.0636 | 18.04 | 3500 | 6.6226 | 0.1256 |
| 6.5444 | 18.3 | 3550 | 6.6164 | 0.1253 |
| 6.561 | 18.55 | 3600 | 6.6157 | 0.1254 |
| 6.5882 | 18.81 | 3650 | 6.6072 | 0.1257 |
| 6.5518 | 19.07 | 3700 | 6.6064 | 0.1267 |
| 6.5599 | 19.33 | 3750 | 6.6055 | 0.1271 |
| 6.5407 | 19.59 | 3800 | 6.5987 | 0.1274 |
| 6.5373 | 19.84 | 3850 | 6.5954 | 0.1280 |
| 6.5381 | 20.1 | 3900 | 6.5899 | 0.1282 |
| 6.5517 | 20.36 | 3950 | 6.5888 | 0.1283 |
| 6.5371 | 20.62 | 4000 | 6.5854 | 0.1295 |
| 6.5819 | 20.87 | 4050 | 6.5825 | 0.1282 |
| 6.5425 | 21.13 | 4100 | 6.5794 | 0.1289 |
| 6.5372 | 21.39 | 4150 | 6.5760 | 0.1300 |
| 6.544 | 21.65 | 4200 | 6.5718 | 0.1303 |
| 6.5129 | 21.9 | 4250 | 6.5660 | 0.1310 |
| 6.4798 | 22.16 | 4300 | 6.5682 | 0.1305 |
| 6.5556 | 22.42 | 4350 | 6.5619 | 0.1315 |
| 6.4946 | 22.68 | 4400 | 6.5589 | 0.1314 |
| 6.5212 | 22.93 | 4450 | 6.5593 | 0.1318 |
| 6.5055 | 23.2 | 4500 | 6.5552 | 0.1311 |
| 6.4693 | 23.45 | 4550 | 6.5481 | 0.1325 |
| 6.4706 | 23.71 | 4600 | 6.5469 | 0.1317 |
| 6.495 | 23.97 | 4650 | 6.5462 | 0.1324 |
| 6.4901 | 24.23 | 4700 | 6.5414 | 0.1328 |
| 6.4936 | 24.48 | 4750 | 6.5385 | 0.1334 |
| 6.481 | 24.74 | 4800 | 6.5362 | 0.1331 |
| 6.5186 | 25.0 | 4850 | 6.5357 | 0.1335 |
| 6.4711 | 25.26 | 4900 | 6.5309 | 0.1339 |
| 6.4513 | 25.51 | 4950 | 6.5284 | 0.1337 |
| 6.4652 | 25.77 | 5000 | 6.5242 | 0.1343 |
| 6.9335 | 26.03 | 5050 | 6.5217 | 0.1345 |
| 6.4747 | 26.29 | 5100 | 6.5206 | 0.1345 |
| 6.4702 | 26.54 | 5150 | 6.5201 | 0.1350 |
| 6.4524 | 26.8 | 5200 | 6.5156 | 0.1352 |
| 6.4225 | 27.06 | 5250 | 6.5150 | 0.1349 |
| 6.4599 | 27.32 | 5300 | 6.5116 | 0.1355 |
| 6.4591 | 27.58 | 5350 | 6.5098 | 0.1358 |
| 6.4184 | 27.83 | 5400 | 6.5096 | 0.1353 |
| 6.43 | 28.09 | 5450 | 6.5074 | 0.1361 |
| 6.4604 | 28.35 | 5500 | 6.4999 | 0.1367 |
| 6.4593 | 28.61 | 5550 | 6.4994 | 0.1359 |
| 6.4648 | 28.86 | 5600 | 6.4981 | 0.1356 |
| 6.4453 | 29.12 | 5650 | 6.4949 | 0.1374 |
| 6.4275 | 29.38 | 5700 | 6.4954 | 0.1362 |
| 6.4165 | 29.64 | 5750 | 6.4938 | 0.1369 |
| 6.4211 | 29.89 | 5800 | 6.4911 | 0.1376 |
| 6.4188 | 30.15 | 5850 | 6.4860 | 0.1374 |
| 6.4337 | 30.41 | 5900 | 6.4807 | 0.1380 |
| 6.4228 | 30.67 | 5950 | 6.4876 | 0.1375 |
| 6.3841 | 30.92 | 6000 | 6.4811 | 0.1376 |
| 6.4383 | 31.18 | 6050 | 6.4832 | 0.1379 |
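
A hypothetical sketch (not part of the original card) of how eval rows like these are produced: Trainer.evaluate() on the masked-LM checkpoint with an accuracy hook over masked positions. The repository id, the stand-in eval texts, and the 15% masking rate are all assumptions; the card documents none of them:

```python
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer)

model_id = "g8a9/roberta-tiny-10M"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForMaskedLM.from_pretrained(model_id)

# Tiny stand-in eval set; the card's real eval data is undocumented.
eval_ds = Dataset.from_dict(
    {"text": ["The quick brown fox jumps over the lazy dog."] * 32}
).map(
    lambda batch: tokenizer(batch["text"], truncation=True, max_length=128),
    batched=True,
    remove_columns=["text"],
)

def masked_token_accuracy(eval_pred):
    # Accuracy over positions that were actually masked (labels != -100).
    preds = eval_pred.predictions.argmax(-1)
    labels = eval_pred.label_ids
    mask = labels != -100
    return {"accuracy": (preds[mask] == labels[mask]).mean()}

trainer = Trainer(
    model=model,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
    eval_dataset=eval_ds,
    compute_metrics=masked_token_accuracy,
)
print(trainer.evaluate())  # yields eval_loss / eval_accuracy pairs like those tabulated
```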

Framework versions

  • Transformers 4.24.0
  • PyTorch 1.11.0+cu113
  • Datasets 2.6.1
  • Tokenizers 0.12.1