byt5-small-finetuned-yiddish-experiment-9

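This model is a fine-tuned version of google/byt5-small on an unspecified dataset (the auto-generated card records the dataset as None). It achieves the following results on the evaluation set; these figures match the step-5000 row of the training results, not the final step: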

  • Loss: 0.3473
  • Cer: 0.1505
  • Wer: 0.4678
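
Cer and Wer are presumably character error rate and word error rate: the edit distance between prediction and reference, divided by the reference length in characters or words. The card does not say which library computed them, so the following is only a minimal pure-Python sketch of the standard definitions; the function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (each edit costs 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r from the reference
                            curr[j - 1] + 1,      # insert h into the reference
                            prev[j - 1] + cost))  # substitute, or match at no cost
        prev = curr
    return prev[-1]


def cer(reference: str, prediction: str) -> float:
    """Character error rate: character-level edits per reference character."""
    if not reference:
        raise ValueError("reference must be non-empty")
    return edit_distance(reference, prediction) / len(reference)


def wer(reference: str, prediction: str) -> float:
    """Word error rate: word-level edits per whitespace-separated reference word."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, prediction.split()) / len(ref_words)
```

Under these definitions a single wrong character makes the whole word count as an error, which is why Wer (0.4678) can sit well above Cer (0.1505).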

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 8
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_steps: 600
  • num_epochs: 30
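
As a rough illustration of how these settings fit together, the sketch below writes them as keyword arguments using the parameter names of Transformers' Seq2SeqTrainingArguments, then reproduces the linear-warmup and cosine-decay shape of the learning rate in plain Python. Several things are assumptions rather than facts from the card: training on a single device, the use of Seq2SeqTrainingArguments at all, and the figure of 212 optimizer steps per epoch, which is inferred from the results table (step 5300 at epoch 25.0).

```python
import math

# Hyperparameters from the list above, under Transformers argument names.
# These could be passed as Seq2SeqTrainingArguments(output_dir=..., **training_kwargs);
# the output directory is not stated in the card.
training_kwargs = dict(
    learning_rate=1e-5,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=42,
    gradient_accumulation_steps=2,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="cosine",
    warmup_steps=600,
    num_train_epochs=30,
)

# Assumption: one device, so total batch = per-device batch * accumulation steps.
effective_batch = (training_kwargs["per_device_train_batch_size"]
                   * training_kwargs["gradient_accumulation_steps"])

# Assumption: 212 optimizer steps per epoch, inferred from the results table.
TOTAL_STEPS = 212 * training_kwargs["num_train_epochs"]


def lr_at(step, base_lr=1e-5, warmup=600, total=TOTAL_STEPS):
    """Learning rate at an optimizer step: linear warmup, then half-cosine decay to 0."""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    progress = (step - warmup) / max(1, total - warmup)
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))
```

With these numbers the learning rate peaks at 1e-05 at step 600, a little before the end of epoch 3, and decays toward zero over the remaining steps.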

Training results

Training Loss  Epoch    Step  Validation Loss  Cer     Wer
10.7996        0.4728   100   10.9325          0.2905  0.7232
7.586          0.9456   200   10.5771          0.2698  0.6850
8.641          1.4161   300   10.0041          0.2570  0.6571
8.2901         1.8889   400   9.1435           0.2478  0.6396
8.076          2.3593   500   8.1677           0.2394  0.6277
7.8061         2.8322   600   7.0784           0.2317  0.6142
5.6829         3.3026   700   6.0549           0.2234  0.6094
5.343          3.7754   800   5.0819           0.2187  0.6038
4.8853         4.2459   900   4.2224           0.2157  0.6038
3.8875         4.7187   1000  3.5281           0.2123  0.5990
3.4853         5.1891   1100  2.8204           0.2095  0.5935
2.7984         5.6619   1200  2.2737           0.2039  0.5895
2.2336         6.1324   1300  1.7448           0.2016  0.5823
1.8465         6.6052   1400  1.2905           0.1959  0.5736
1.6188         7.0757   1500  1.1662           0.1945  0.5688
1.3051         7.5485   1600  1.1433           0.1939  0.5704
1.176          8.0189   1700  1.0655           0.1910  0.5672
1.0653         8.4917   1800  0.8529           0.1863  0.5561
0.8965         8.9645   1900  0.7841           0.1686  0.4972
0.7726         9.4350   2000  0.7415           0.1649  0.4956
0.7771         9.9078   2100  0.6933           0.1629  0.4885
0.7366         10.3783  2200  0.6601           0.1616  0.4861
0.6566         10.8511  2300  0.6124           0.1593  0.4853
0.6469         11.3215  2400  0.5665           0.1604  0.4829
0.6077         11.7943  2500  0.5210           0.1576  0.4805
0.5543         12.2648  2600  0.4658           0.1576  0.4781
0.5217         12.7376  2700  0.4372           0.1559  0.4781
0.5023         13.2080  2800  0.4111           0.1570  0.4805
0.4754         13.6809  2900  0.3967           0.1554  0.4741
0.4551         14.1513  3000  0.3880           0.1545  0.4726
0.4416         14.6241  3100  0.3800           0.1538  0.4741
0.4255         15.0946  3200  0.3752           0.1542  0.4749
0.4306         15.5674  3300  0.3724           0.1544  0.4741
0.4072         16.0378  3400  0.3663           0.1538  0.4741
0.4196         16.5106  3500  0.3606           0.1528  0.4726
0.3983         16.9835  3600  0.3635           0.1530  0.4694
0.3915         17.4539  3700  0.3605           0.1524  0.4694
0.4036         17.9267  3800  0.3563           0.1517  0.4686
0.3893         18.3972  3900  0.3558           0.1524  0.4686
0.3846         18.8700  4000  0.3562           0.1525  0.4678
0.3854         19.3404  4100  0.3530           0.1516  0.4670
0.3859         19.8132  4200  0.3523           0.1521  0.4678
0.3777         20.2837  4300  0.3516           0.1519  0.4670
0.3729         20.7565  4400  0.3502           0.1516  0.4678
0.3753         21.2270  4500  0.3497           0.1517  0.4678
0.3712         21.6998  4600  0.3502           0.1514  0.4686
0.3757         22.1702  4700  0.3487           0.1508  0.4678
0.3716         22.6430  4800  0.3488           0.1510  0.4678
0.369          23.1135  4900  0.3479           0.1507  0.4678
0.3808         23.5863  5000  0.3473           0.1505  0.4678
0.3696         24.0567  5100  0.3472           0.1511  0.4686
0.3718         24.5296  5200  0.3468           0.1508  0.4678
0.3651         25.0     5300  0.3466           0.1511  0.4686
0.3747         25.4728  5400  0.3467           0.1508  0.4686
0.3661         25.9456  5500  0.3468           0.1508  0.4686
0.3558         26.4161  5600  0.3472           0.1513  0.4686
0.3782         26.8889  5700  0.3469           0.1511  0.4686
0.3636         27.3593  5800  0.3467           0.1511  0.4686
0.3679         27.8322  5900  0.3466           0.1510  0.4678
0.3615         28.3026  6000  0.3465           0.1511  0.4678
0.3688         28.7754  6100  0.3466           0.1511  0.4678
0.3599         29.2459  6200  0.3466           0.1511  0.4678
0.3696         29.7187  6300  0.3465           0.1511  0.4678
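
The lowest Cer in the table (0.1505) occurs at step 5000, and that row's loss, Cer and Wer are exactly the headline evaluation results, while the lowest validation loss (0.3465) is only reached later. This is consistent with the best checkpoint having been selected on Cer, for example through the Trainer options load_best_model_at_end and metric_for_best_model, but the card does not confirm it. The sketch below shows that kind of selection on a handful of rows copied from the table; the tuple layout and variable names are hypothetical.

```python
# (step, validation_loss, cer, wer), copied from selected rows of the table above.
rows = [
    (4900, 0.3479, 0.1507, 0.4678),
    (5000, 0.3473, 0.1505, 0.4678),
    (5100, 0.3472, 0.1511, 0.4686),
    (6000, 0.3465, 0.1511, 0.4678),
    (6300, 0.3465, 0.1511, 0.4678),
]

# Pick the checkpoint with the lowest Cer; min() keeps the earliest row on ties.
best_by_cer = min(rows, key=lambda row: row[2])

# For comparison, the checkpoint with the lowest validation loss.
best_by_loss = min(rows, key=lambda row: row[1])

print("best by Cer: ", best_by_cer)
print("best by loss:", best_by_loss)
```

Selecting on Cer and selecting on validation loss pick different steps here, although the gap between the two checkpoints is small on every metric.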

Framework versions

  • Transformers 4.47.0
  • Pytorch 2.5.1+cu121
  • Datasets 2.14.4
  • Tokenizers 0.21.0

Model tree for Addaci/byt5-small-finetuned-yiddish-experiment-9

Base model: google/byt5-small
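
A minimal loading sketch with the Transformers library follows, using the repository ID given above. The card does not document the task or the expected input format, so what text to pass in (and whether any prefix is needed) is an assumption left to the reader; run_model and its max_new_tokens default are hypothetical choices for illustration.

```python
MODEL_ID = "Addaci/byt5-small-finetuned-yiddish-experiment-9"


def run_model(text: str, max_new_tokens: int = 256) -> str:
    """Generate output for one input string (input format is undocumented in the card)."""
    # Imported lazily so this module loads even where Transformers is not installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # ByT5 works on UTF-8 bytes, so Hebrew-script characters cost two tokens each;
    # max_new_tokens therefore limits output bytes, not characters.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling run_model downloads the weights on first use. For repeated calls it would be better to load the tokenizer and model once and reuse them.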