
125m-dalio-book-handwritten-io-constant-1e-6-v2

This model is a fine-tuned version of facebook/opt-125m on the AlekseyKorshuk/dalio-book-handwritten-io-sorted-v2 dataset. It achieves the following results on the evaluation set:

  • Loss: 3.0859
  • Accuracy: 0.2336
  • Perplexity: 21.8880
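
The reported perplexity is (up to rounding) just the exponential of the evaluation cross-entropy loss, which is a quick way to sanity-check the numbers above:

```python
import math

# Perplexity is the exponential of the cross-entropy loss:
# perplexity = exp(loss). Checking against the reported metrics:
eval_loss = 3.0859
perplexity = math.exp(eval_loss)
print(round(perplexity, 2))  # ≈ 21.89, matching the reported 21.8880
```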

Model description

More information needed

Intended uses & limitations

More information needed
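
A minimal usage sketch (an assumption, not an official example from the author): loading the checkpoint by its Hub repository id with the `transformers` Auto classes and generating a continuation.

```python
# Repository id as it appears on the Hugging Face Hub.
model_id = "AlekseyKorshuk/125m-dalio-book-handwritten-io-constant-1e-6-v2"


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a continuation with the fine-tuned OPT-125m checkpoint.

    Requires `pip install transformers torch`; the import is deferred so the
    module can be loaded without those dependencies installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    inputs = tokenizer(prompt, return_tensors="pt")
    output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```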

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 8
  • total_train_batch_size: 8
  • total_eval_batch_size: 8
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: constant
  • num_epochs: 1.0
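
A sketch (assumed, not the author's exact launch script) of how these values map onto Hugging Face `TrainingArguments` keywords. Note that `train_batch_size: 1` above is per device; with 8 GPUs and no gradient accumulation, the effective batch size is 1 × 8 = 8, matching `total_train_batch_size`.

```python
# Hypothetical mapping of the listed hyperparameters onto
# `transformers.TrainingArguments` keyword names (kept as a plain dict here).
training_args = dict(
    learning_rate=1e-6,
    per_device_train_batch_size=1,  # "train_batch_size" above is per device
    per_device_eval_batch_size=1,
    seed=42,
    lr_scheduler_type="constant",
    num_train_epochs=1.0,
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
)

# Effective (total) batch size = per-device batch size x number of devices.
num_devices = 8
total_train_batch_size = training_args["per_device_train_batch_size"] * num_devices
print(total_train_batch_size)  # 8, as listed above
```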

Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Perplexity |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:----------:|
| 3.3352 | 0.01 | 1 | 3.1738 | 0.2305 | 23.8988 |
| 3.3091 | 0.03 | 2 | 3.1738 | 0.2305 | 23.8988 |
| 3.3347 | 0.04 | 3 | 3.1738 | 0.2305 | 23.8988 |
| 3.1445 | 0.05 | 4 | 3.1738 | 0.2305 | 23.8988 |
| 2.8918 | 0.07 | 5 | 3.1738 | 0.2305 | 23.8988 |
| 3.2068 | 0.08 | 6 | 3.1738 | 0.2305 | 23.8988 |
| 3.6245 | 0.09 | 7 | 3.1719 | 0.2305 | 23.8522 |
| 3.2256 | 0.11 | 8 | 3.1719 | 0.2305 | 23.8522 |
| 2.9991 | 0.12 | 9 | 3.1699 | 0.2305 | 23.8056 |
| 3.3257 | 0.13 | 10 | 3.1680 | 0.2306 | 23.7592 |
| 3.1199 | 0.15 | 11 | 3.1660 | 0.2306 | 23.7128 |
| 3.3735 | 0.16 | 12 | 3.1660 | 0.2306 | 23.7128 |
| 3.0051 | 0.17 | 13 | 3.1641 | 0.2307 | 23.6665 |
| 3.2695 | 0.19 | 14 | 3.1621 | 0.2308 | 23.6204 |
| 3.2004 | 0.2 | 15 | 3.1602 | 0.2309 | 23.5743 |
| 3.2075 | 0.21 | 16 | 3.1582 | 0.2308 | 23.5283 |
| 3.321 | 0.23 | 17 | 3.1562 | 0.2308 | 23.4824 |
| 3.4026 | 0.24 | 18 | 3.1543 | 0.2309 | 23.4366 |
| 3.0383 | 0.25 | 19 | 3.1523 | 0.2309 | 23.3908 |
| 3.166 | 0.27 | 20 | 3.1504 | 0.2309 | 23.3452 |
| 3.144 | 0.28 | 21 | 3.1484 | 0.2310 | 23.2996 |
| 3.1624 | 0.29 | 22 | 3.1484 | 0.2310 | 23.2996 |
| 3.0332 | 0.31 | 23 | 3.1465 | 0.2310 | 23.2542 |
| 3.3745 | 0.32 | 24 | 3.1445 | 0.2311 | 23.2088 |
| 3.0823 | 0.33 | 25 | 3.1426 | 0.2312 | 23.1635 |
| 3.6021 | 0.35 | 26 | 3.1406 | 0.2312 | 23.1183 |
| 3.1125 | 0.36 | 27 | 3.1387 | 0.2313 | 23.0732 |
| 3.1406 | 0.37 | 28 | 3.1387 | 0.2314 | 23.0732 |
| 3.1736 | 0.39 | 29 | 3.1367 | 0.2314 | 23.0282 |
| 3.1104 | 0.4 | 30 | 3.1348 | 0.2315 | 22.9832 |
| 3.1301 | 0.41 | 31 | 3.1328 | 0.2316 | 22.9384 |
| 3.3376 | 0.43 | 32 | 3.1309 | 0.2315 | 22.8936 |
| 3.218 | 0.44 | 33 | 3.1309 | 0.2316 | 22.8936 |
| 3.0786 | 0.45 | 34 | 3.1289 | 0.2316 | 22.8490 |
| 3.0125 | 0.47 | 35 | 3.1270 | 0.2317 | 22.8044 |
| 3.2634 | 0.48 | 36 | 3.1270 | 0.2317 | 22.8044 |
| 2.9888 | 0.49 | 37 | 3.125 | 0.2318 | 22.7599 |
| 3.1624 | 0.51 | 38 | 3.1230 | 0.2318 | 22.7155 |
| 2.9807 | 0.52 | 39 | 3.1211 | 0.2319 | 22.6712 |
| 3.446 | 0.53 | 40 | 3.1211 | 0.2319 | 22.6712 |
| 3.1338 | 0.55 | 41 | 3.1191 | 0.2320 | 22.6269 |
| 3.1841 | 0.56 | 42 | 3.1191 | 0.2320 | 22.6269 |
| 3.1079 | 0.57 | 43 | 3.1172 | 0.2320 | 22.5828 |
| 3.0918 | 0.59 | 44 | 3.1152 | 0.2321 | 22.5387 |
| 3.0302 | 0.6 | 45 | 3.1152 | 0.2322 | 22.5387 |
| 3.1123 | 0.61 | 46 | 3.1133 | 0.2323 | 22.4947 |
| 2.9985 | 0.63 | 47 | 3.1113 | 0.2324 | 22.4508 |
| 3.3816 | 0.64 | 48 | 3.1113 | 0.2324 | 22.4508 |
| 3.0813 | 0.65 | 49 | 3.1094 | 0.2324 | 22.4070 |
| 3.2024 | 0.67 | 50 | 3.1094 | 0.2325 | 22.4070 |
| 3.0178 | 0.68 | 51 | 3.1074 | 0.2325 | 22.3633 |
| 3.1646 | 0.69 | 52 | 3.1074 | 0.2326 | 22.3633 |
| 3.0046 | 0.71 | 53 | 3.1055 | 0.2327 | 22.3197 |
| 3.0266 | 0.72 | 54 | 3.1055 | 0.2327 | 22.3197 |
| 3.3857 | 0.73 | 55 | 3.1035 | 0.2327 | 22.2761 |
| 3.064 | 0.75 | 56 | 3.1035 | 0.2328 | 22.2761 |
| 3.176 | 0.76 | 57 | 3.1016 | 0.2328 | 22.2327 |
| 3.1851 | 0.77 | 58 | 3.1016 | 0.2329 | 22.2327 |
| 3.0811 | 0.79 | 59 | 3.0996 | 0.2329 | 22.1893 |
| 3.0205 | 0.8 | 60 | 3.0996 | 0.2330 | 22.1893 |
| 3.26 | 0.81 | 61 | 3.0977 | 0.2330 | 22.1460 |
| 3.2922 | 0.83 | 62 | 3.0977 | 0.2331 | 22.1460 |
| 3.5349 | 0.84 | 63 | 3.0957 | 0.2331 | 22.1028 |
| 3.3525 | 0.85 | 64 | 3.0957 | 0.2331 | 22.1028 |
| 3.135 | 0.87 | 65 | 3.0938 | 0.2331 | 22.0596 |
| 3.1707 | 0.88 | 66 | 3.0938 | 0.2332 | 22.0596 |
| 3.0127 | 0.89 | 67 | 3.0918 | 0.2332 | 22.0166 |
| 3.0952 | 0.91 | 68 | 3.0918 | 0.2332 | 22.0166 |
| 3.1023 | 0.92 | 69 | 3.0898 | 0.2334 | 21.9736 |
| 3.3821 | 0.93 | 70 | 3.0898 | 0.2334 | 21.9736 |
| 3.1118 | 0.95 | 71 | 3.0879 | 0.2334 | 21.9308 |
| 3.1143 | 0.96 | 72 | 3.0879 | 0.2335 | 21.9308 |
| 3.1118 | 0.97 | 73 | 3.0879 | 0.2335 | 21.9308 |
| 3.0596 | 0.99 | 74 | 3.0859 | 0.2336 | 21.8880 |
| 3.1033 | 1.0 | 75 | 3.0859 | 0.2336 | 21.8880 |

Framework versions

  • Transformers 4.25.0.dev0
  • Pytorch 1.12.1+cu113
  • Datasets 2.3.2
  • Tokenizers 0.12.1