byt5-base-indocollex-informal-to-formal-wordformation

This model is a fine-tuned version of google/byt5-base on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1413
  • Cer (character error rate): 0.1978
  • Wer (word error rate): 0.4524
  • Word Acc (word accuracy): 0.5476
  • Gen Len (average generation length): 7.6457
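In every reported row, Word Acc equals 1 minus Wer (for example, 1 - 0.4524 = 0.5476). The card does not say how the metrics were computed, so the following is only a minimal sketch under assumptions: corpus-level error rates based on Levenshtein distance, with hypothetical example strings that are not taken from the evaluation set.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or token lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (or match, cost 0)
            ))
        prev = curr
    return prev[-1]


def error_rate(references, predictions, tokenize):
    """Total edit errors divided by total reference length (an assumed definition)."""
    errors = sum(
        edit_distance(tokenize(r), tokenize(p))
        for r, p in zip(references, predictions)
    )
    total = sum(len(tokenize(r)) for r in references)
    return errors / total


# Hypothetical reference/prediction pairs, for illustration only.
references = ["tidak", "sudah", "begini"]
predictions = ["tidak", "sudah", "begitu"]

cer = error_rate(references, predictions, list)        # character level
wer = error_rate(references, predictions, str.split)   # word level
word_acc = 1.0 - wer

print(f"CER={cer:.4f} WER={wer:.4f} WordAcc={word_acc:.4f}")
```

Because insertions count as errors, an error rate defined this way can exceed 1 when predictions are much longer than the references. That would be consistent with the early rows of the training log, where Wer is above 1 and Word Acc is negative.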

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
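To make the linear schedule concrete, the sketch below computes the learning rate at a given optimizer step. Two values are assumptions rather than facts from this card: STEPS_PER_EPOCH = 93 is inferred from the training log (step 2550 is reported at epoch 27.42), and WARMUP_STEPS = 0 is assumed because no warmup is listed. The training log ends at epoch 27.42 although num_epochs is 100; the card does not say why.

```python
BASE_LR = 2e-05        # learning_rate from the card
NUM_EPOCHS = 100       # num_epochs from the card
STEPS_PER_EPOCH = 93   # assumption: inferred from 2550 steps at epoch 27.42
WARMUP_STEPS = 0       # assumption: no warmup is listed on the card

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


for step in (0, 1000, 2550, TOTAL_STEPS):
    print(step, linear_lr(step))
```

Under these assumptions, the learning rate at the last logged step (2550) would still be roughly 73% of its initial value, since 6750 of the 9300 planned steps remain.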

Training results

| Training Loss | Epoch | Step | Validation Loss | Cer | Wer | Word Acc | Gen Len |
|---|---|---|---|---|---|---|---|
| No log | 0.54 | 50 | 16.1894 | 2.1868 | 2.2905 | -1.2905 | 19.0 |
| No log | 1.08 | 100 | 13.7479 | 2.1248 | 1.9333 | -0.9333 | 19.0 |
| No log | 1.61 | 150 | 11.6231 | 2.1095 | 1.4238 | -0.4238 | 18.7486 |
| No log | 2.15 | 200 | 8.9106 | 1.056 | 0.9857 | 0.0143 | 10.6171 |
| No log | 2.69 | 250 | 4.6844 | 0.8523 | 0.9762 | 0.0238 | 9.36 |
| No log | 3.23 | 300 | 4.1175 | 0.5756 | 0.9714 | 0.0286 | 7.4114 |
| No log | 3.76 | 350 | 3.3688 | 0.5951 | 0.9714 | 0.0286 | 7.8 |
| No log | 4.3 | 400 | 2.2287 | 0.6112 | 0.9857 | 0.0143 | 6.7543 |
| No log | 4.84 | 450 | 1.5164 | 0.6095 | 0.9571 | 0.0429 | 7.8857 |
| 8.4834 | 5.38 | 500 | 1.0363 | 0.5976 | 0.9476 | 0.0524 | 7.8229 |
| 8.4834 | 5.91 | 550 | 0.6893 | 0.5976 | 0.9476 | 0.0524 | 7.7943 |
| 8.4834 | 6.45 | 600 | 0.5438 | 0.5866 | 0.9381 | 0.0619 | 7.9943 |
| 8.4834 | 6.99 | 650 | 0.4720 | 0.5806 | 0.9333 | 0.0667 | 8.0057 |
| 8.4834 | 7.53 | 700 | 0.4305 | 0.5764 | 0.9333 | 0.0667 | 8.0057 |
| 8.4834 | 8.06 | 750 | 0.3931 | 0.5654 | 0.9333 | 0.0667 | 8.2971 |
| 8.4834 | 8.6 | 800 | 0.3450 | 0.4576 | 0.9952 | 0.0048 | 7.7086 |
| 8.4834 | 9.14 | 850 | 0.2773 | 0.3226 | 0.8238 | 0.1762 | 7.8743 |
| 8.4834 | 9.68 | 900 | 0.2184 | 0.2368 | 0.7286 | 0.2714 | 7.2171 |
| 8.4834 | 10.22 | 950 | 0.1992 | 0.2165 | 0.6333 | 0.3667 | 7.4343 |
| 0.7362 | 10.75 | 1000 | 0.1887 | 0.2097 | 0.5714 | 0.4286 | 7.5829 |
| 0.7362 | 11.29 | 1050 | 0.1815 | 0.2216 | 0.5905 | 0.4095 | 7.6171 |
| 0.7362 | 11.83 | 1100 | 0.1688 | 0.2046 | 0.5762 | 0.4238 | 7.4629 |
| 0.7362 | 12.37 | 1150 | 0.1679 | 0.2012 | 0.5286 | 0.4714 | 7.7143 |
| 0.7362 | 12.9 | 1200 | 0.1579 | 0.1952 | 0.5333 | 0.4667 | 7.5257 |
| 0.7362 | 13.44 | 1250 | 0.1531 | 0.1969 | 0.5095 | 0.4905 | 7.5714 |
| 0.7362 | 13.98 | 1300 | 0.1484 | 0.1935 | 0.4952 | 0.5048 | 7.5543 |
| 0.7362 | 14.52 | 1350 | 0.1481 | 0.1969 | 0.4952 | 0.5048 | 7.5886 |
| 0.7362 | 15.05 | 1400 | 0.1417 | 0.191 | 0.481 | 0.519 | 7.5829 |
| 0.7362 | 15.59 | 1450 | 0.1429 | 0.1876 | 0.4762 | 0.5238 | 7.5829 |
| 0.195 | 16.13 | 1500 | 0.1407 | 0.1834 | 0.481 | 0.519 | 7.48 |
| 0.195 | 16.67 | 1550 | 0.1409 | 0.1995 | 0.481 | 0.519 | 7.7086 |
| 0.195 | 17.2 | 1600 | 0.1432 | 0.1817 | 0.4762 | 0.5238 | 7.4857 |
| 0.195 | 17.74 | 1650 | 0.1439 | 0.1885 | 0.4762 | 0.5238 | 7.5429 |
| 0.195 | 18.28 | 1700 | 0.1385 | 0.1766 | 0.4476 | 0.5524 | 7.5143 |
| 0.195 | 18.82 | 1750 | 0.1357 | 0.1834 | 0.4762 | 0.5238 | 7.4971 |
| 0.195 | 19.35 | 1800 | 0.1349 | 0.1935 | 0.4714 | 0.5286 | 7.4686 |
| 0.195 | 19.89 | 1850 | 0.1355 | 0.1842 | 0.4286 | 0.5714 | 7.5371 |
| 0.195 | 20.43 | 1900 | 0.1343 | 0.1902 | 0.4619 | 0.5381 | 7.5714 |
| 0.195 | 20.97 | 1950 | 0.1348 | 0.1808 | 0.4619 | 0.5381 | 7.4229 |
| 0.1287 | 21.51 | 2000 | 0.1341 | 0.1817 | 0.4524 | 0.5476 | 7.4571 |
| 0.1287 | 22.04 | 2050 | 0.1324 | 0.1868 | 0.4476 | 0.5524 | 7.5371 |
| 0.1287 | 22.58 | 2100 | 0.1329 | 0.1859 | 0.4571 | 0.5429 | 7.4571 |
| 0.1287 | 23.12 | 2150 | 0.1367 | 0.1868 | 0.4476 | 0.5524 | 7.56 |
| 0.1287 | 23.66 | 2200 | 0.1389 | 0.1919 | 0.4667 | 0.5333 | 7.48 |
| 0.1287 | 24.19 | 2250 | 0.1385 | 0.18 | 0.4333 | 0.5667 | 7.5029 |
| 0.1287 | 24.73 | 2300 | 0.1429 | 0.1944 | 0.4905 | 0.5095 | 7.4171 |
| 0.1287 | 25.27 | 2350 | 0.1414 | 0.1961 | 0.4667 | 0.5333 | 7.6057 |
| 0.1287 | 25.81 | 2400 | 0.1419 | 0.1876 | 0.4333 | 0.5667 | 7.5371 |
| 0.1287 | 26.34 | 2450 | 0.1433 | 0.1927 | 0.4667 | 0.5333 | 7.5886 |
| 0.0977 | 26.88 | 2500 | 0.1433 | 0.1927 | 0.4571 | 0.5429 | 7.5486 |
| 0.0977 | 27.42 | 2550 | 0.1413 | 0.1978 | 0.4524 | 0.5476 | 7.6457 |
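
The headline results at the top of this card match the last logged row (step 2550), which is not the best row on any single metric: validation loss is lowest at step 2050, Wer at step 1850, and Cer at step 1700. As a rough sketch of how such a comparison could be scripted, the snippet below parses a handful of rows copied from the log. The helper names (parse_row, best_by) are hypothetical, and the card does not say which checkpoint was actually published.

```python
FIELDS = ("epoch", "step", "val_loss", "cer", "wer", "word_acc", "gen_len")

# A few rows copied verbatim from the training log above.
LOG = """\
No log 0.54 50 16.1894 2.1868 2.2905 -1.2905 19.0
0.195 18.28 1700 0.1385 0.1766 0.4476 0.5524 7.5143
0.195 19.89 1850 0.1355 0.1842 0.4286 0.5714 7.5371
0.1287 22.04 2050 0.1324 0.1868 0.4476 0.5524 7.5371
0.0977 27.42 2550 0.1413 0.1978 0.4524 0.5476 7.6457
"""


def parse_row(line):
    """Split from the right so the two-word 'No log' cell stays in one piece."""
    parts = line.rsplit(None, len(FIELDS))
    row = {"train_loss": None if parts[0] == "No log" else float(parts[0])}
    for name, value in zip(FIELDS, parts[1:]):
        row[name] = int(value) if name == "step" else float(value)
    return row


def best_by(rows, key):
    """Return the row with the smallest value for a lower-is-better metric."""
    return min(rows, key=lambda r: r[key])


rows = [parse_row(line) for line in LOG.splitlines()]
for metric in ("val_loss", "wer", "cer"):
    best = best_by(rows, metric)
    print(f"lowest {metric}: {best[metric]} at step {best['step']}")
```

Validation loss stops improving after step 2050 while training loss keeps falling (0.1287 to 0.0977), which may indicate the onset of overfitting, though the differences between late rows are small.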

Framework versions

  • Transformers 4.33.0
  • Pytorch 2.0.0
  • Datasets 2.1.0
  • Tokenizers 0.13.3