mt5-lithuanian-simplifier-full

This model is a fine-tuned version of google/mt5-base. The training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 0.0771
  • Rouge1: 0.7828
  • Rouge2: 0.6494
  • RougeL: 0.7787
  • Gen Len: 48.0191
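
For readers unfamiliar with the metrics above, the following is a minimal sketch of what ROUGE-1, ROUGE-2, and ROUGE-L F1 measure. It assumes plain lowercase whitespace tokenization and toy English strings; the scores in this card were presumably produced by a standard ROUGE package with its own tokenization, so this illustrates the idea and does not reproduce the evaluation script.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def f1(overlap, cand_total, ref_total):
    """Harmonic mean of precision and recall; 0.0 when nothing overlaps."""
    if overlap == 0 or cand_total == 0 or ref_total == 0:
        return 0.0
    precision = overlap / cand_total
    recall = overlap / ref_total
    return 2 * precision * recall / (precision + recall)


def rouge_n_f1(reference, candidate, n=1):
    """ROUGE-N F1 from clipped n-gram overlap (simplified tokenization)."""
    ref = ngram_counts(reference.lower().split(), n)
    cand = ngram_counts(candidate.lower().split(), n)
    overlap = sum((ref & cand).values())  # Counter & keeps the minimum counts
    return f1(overlap, sum(cand.values()), sum(ref.values()))


def rouge_l_f1(reference, candidate):
    """ROUGE-L F1 from the longest common subsequence of tokens."""
    ref = reference.lower().split()
    cand = candidate.lower().split()
    # Standard dynamic-programming table for LCS length.
    table = [[0] * (len(cand) + 1) for _ in range(len(ref) + 1)]
    for i, r in enumerate(ref, start=1):
        for j, c in enumerate(cand, start=1):
            if r == c:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    return f1(table[-1][-1], len(cand), len(ref))


# Toy example (not taken from the evaluation set).
reference = "the cat sat on the mat"
candidate = "the cat lay on the mat"
r1 = rouge_n_f1(reference, candidate, n=1)
r2 = rouge_n_f1(reference, candidate, n=2)
rl = rouge_l_f1(reference, candidate)
```

A single substituted word leaves most unigrams intact but breaks two of the five bigrams, which is why Rouge2 is typically lower than Rouge1 and RougeL, as it is in the results above.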

Model description

More information needed

Intended uses & limitations

More information needed
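
Usage is not documented here. As a minimal sketch, assuming the checkpoint is published as eglkan1/mt5-lithuanian-simplifier-full, loads with the standard Transformers seq2seq classes, and takes raw Lithuanian text with no task prefix (none of which this card confirms), inference could look like the following. The helper name `simplify` and the `max_new_tokens=64` default are illustrative choices, the latter picked to sit above the reported Gen Len of about 48.

```python
MODEL_ID = "eglkan1/mt5-lithuanian-simplifier-full"  # assumed repository id


def simplify(text, model_id=MODEL_ID, max_new_tokens=64):
    """Generate a simplified version of `text` (hypothetical helper).

    Assumes the model takes raw text without a task prefix.
    """
    # Imported lazily so this file can be loaded without transformers installed.
    # The mT5 tokenizer also needs the sentencepiece package.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads the model weights on first use):
# print(simplify("Vilnius yra Lietuvos sostinė ir didžiausias šalies miestas."))
```

Loading the tokenizer and model inside the function keeps the sketch short; for repeated calls, load them once and reuse them.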

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • lr_scheduler_warmup_steps: 500
  • num_epochs: 8
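
The hyperparameters above can be written as keyword arguments for `Seq2SeqTrainingArguments`, and the linear schedule with warmup can be sketched in a few lines. This is an illustration under stated assumptions, not the original training script: the batch sizes are mapped to per-device values assuming a single device without gradient accumulation, and `TOTAL_STEPS` is an estimate (the results table reaches step 20000 at epoch 7.98, which suggests roughly 2506 steps per epoch and about 20050 steps over 8 epochs).

```python
# Keyword arguments matching the listed hyperparameters. The per-device
# mapping assumes one device and no gradient accumulation.
TRAINING_KWARGS = {
    "learning_rate": 1e-4,
    "per_device_train_batch_size": 4,
    "per_device_eval_batch_size": 4,
    "seed": 42,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "linear",
    "warmup_steps": 500,
    "num_train_epochs": 8,
}

BASE_LR = TRAINING_KWARGS["learning_rate"]
WARMUP_STEPS = TRAINING_KWARGS["warmup_steps"]
TOTAL_STEPS = 20_050  # assumption: estimated from the results table


def linear_lr(step, base_lr=BASE_LR, warmup=WARMUP_STEPS, total=TOTAL_STEPS):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup:
        # Ramp from 0 up to base_lr over the warmup steps.
        return base_lr * step / max(1, warmup)
    # Decay linearly from base_lr at the end of warmup to 0 at `total`.
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))


# Learning rate at a few points in training.
schedule_preview = {step: linear_lr(step) for step in (0, 250, 500, 10_000, TOTAL_STEPS)}
```

With these settings the learning rate peaks at 1e-4 at step 500, which lines up with the sharp drop in training loss between steps 400 and 600 in the results table.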

Training results

Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | RougeL | Gen Len
24.351 | 0.08 | 200 | 18.6244 | 0.0226 | 0.0018 | 0.0207 | 512.0
3.0331 | 0.16 | 400 | 0.6830 | 0.0549 | 0.0018 | 0.0497 | 49.0191
0.2076 | 0.24 | 600 | 0.1642 | 0.6417 | 0.4986 | 0.6328 | 48.0191
0.2019 | 0.32 | 800 | 0.1303 | 0.6713 | 0.5243 | 0.6633 | 48.0191
0.1573 | 0.4 | 1000 | 0.1242 | 0.7007 | 0.5589 | 0.6937 | 48.0191
0.1687 | 0.48 | 1200 | 0.1158 | 0.712 | 0.569 | 0.7055 | 48.0191
0.1315 | 0.56 | 1400 | 0.1225 | 0.6923 | 0.5361 | 0.6851 | 48.0191
0.1376 | 0.64 | 1600 | 0.1108 | 0.7171 | 0.5695 | 0.7105 | 48.0191
0.158 | 0.72 | 1800 | 0.1074 | 0.7229 | 0.574 | 0.7169 | 48.0191
0.1221 | 0.8 | 2000 | 0.1064 | 0.7227 | 0.5761 | 0.7166 | 48.0191
0.1371 | 0.88 | 2200 | 0.1049 | 0.7282 | 0.5827 | 0.7223 | 48.0191
0.1376 | 0.96 | 2400 | 0.1043 | 0.73 | 0.5861 | 0.7239 | 48.0191
0.1116 | 1.04 | 2600 | 0.1021 | 0.733 | 0.5888 | 0.727 | 48.0191
0.132 | 1.12 | 2800 | 0.1012 | 0.7338 | 0.5899 | 0.7277 | 48.0191
0.131 | 1.2 | 3000 | 0.0997 | 0.7365 | 0.5936 | 0.7307 | 48.0191
0.1001 | 1.28 | 3200 | 0.0950 | 0.7408 | 0.5977 | 0.7355 | 48.0191
0.1398 | 1.36 | 3400 | 0.0964 | 0.7418 | 0.599 | 0.7364 | 48.0191
0.1085 | 1.44 | 3600 | 0.0962 | 0.744 | 0.6015 | 0.7386 | 48.0191
0.097 | 1.52 | 3800 | 0.0967 | 0.743 | 0.6009 | 0.7377 | 48.0191
0.1178 | 1.6 | 4000 | 0.0955 | 0.7446 | 0.6035 | 0.7391 | 48.0191
0.1214 | 1.68 | 4200 | 0.0939 | 0.7452 | 0.6036 | 0.7403 | 48.0191
0.1539 | 1.76 | 4400 | 0.0909 | 0.7486 | 0.6068 | 0.7436 | 48.0191
0.1141 | 1.83 | 4600 | 0.0900 | 0.7518 | 0.6104 | 0.7467 | 48.0191
0.0795 | 1.91 | 4800 | 0.0891 | 0.7513 | 0.6097 | 0.7466 | 48.0191
0.0856 | 1.99 | 5000 | 0.0915 | 0.7513 | 0.6099 | 0.7463 | 48.0191
0.0954 | 2.07 | 5200 | 0.0898 | 0.753 | 0.6126 | 0.7482 | 48.0191
0.1271 | 2.15 | 5400 | 0.0901 | 0.7534 | 0.6125 | 0.7486 | 48.0191
0.0816 | 2.23 | 5600 | 0.0893 | 0.7553 | 0.6148 | 0.7506 | 48.0191
0.0922 | 2.31 | 5800 | 0.0881 | 0.7569 | 0.6163 | 0.7521 | 48.0191
0.1177 | 2.39 | 6000 | 0.0878 | 0.7575 | 0.6176 | 0.7532 | 48.0191
0.0916 | 2.47 | 6200 | 0.0874 | 0.7585 | 0.618 | 0.7541 | 48.0191
0.1349 | 2.55 | 6400 | 0.0861 | 0.76 | 0.62 | 0.7555 | 48.0191
0.1196 | 2.63 | 6600 | 0.0833 | 0.7617 | 0.6212 | 0.7572 | 48.0191
0.0841 | 2.71 | 6800 | 0.0848 | 0.7621 | 0.6219 | 0.7576 | 48.0191
0.0934 | 2.79 | 7000 | 0.0854 | 0.7622 | 0.6227 | 0.7577 | 48.0191
0.1246 | 2.87 | 7200 | 0.0835 | 0.7652 | 0.6256 | 0.7606 | 48.0191
0.0762 | 2.95 | 7400 | 0.0835 | 0.7649 | 0.6262 | 0.7606 | 48.0191
0.0924 | 3.03 | 7600 | 0.0828 | 0.7662 | 0.6276 | 0.7618 | 48.0191
0.0822 | 3.11 | 7800 | 0.0834 | 0.7664 | 0.6284 | 0.7621 | 48.0191
0.0856 | 3.19 | 8000 | 0.0836 | 0.7647 | 0.627 | 0.7603 | 48.0191
0.0798 | 3.27 | 8200 | 0.0829 | 0.7657 | 0.6284 | 0.7614 | 48.0191
0.0959 | 3.35 | 8400 | 0.0828 | 0.7671 | 0.6302 | 0.7629 | 48.0191
0.0871 | 3.43 | 8600 | 0.0820 | 0.7672 | 0.6297 | 0.763 | 48.0191
0.1068 | 3.51 | 8800 | 0.0827 | 0.7683 | 0.6307 | 0.7641 | 48.0191
0.072 | 3.59 | 9000 | 0.0820 | 0.7684 | 0.632 | 0.764 | 48.0191
0.0964 | 3.67 | 9200 | 0.0838 | 0.7692 | 0.6333 | 0.7645 | 48.0191
0.0946 | 3.75 | 9400 | 0.0809 | 0.7707 | 0.6348 | 0.7663 | 48.0191
0.0822 | 3.83 | 9600 | 0.0825 | 0.7708 | 0.6347 | 0.7666 | 48.0191
0.1019 | 3.91 | 9800 | 0.0788 | 0.7733 | 0.6373 | 0.7692 | 48.0191
0.08 | 3.99 | 10000 | 0.0797 | 0.7727 | 0.6369 | 0.7686 | 48.0191
0.0989 | 4.07 | 10200 | 0.0818 | 0.7724 | 0.6367 | 0.7681 | 48.0191
0.0693 | 4.15 | 10400 | 0.0804 | 0.7737 | 0.6378 | 0.7697 | 48.0191
0.0763 | 4.23 | 10600 | 0.0814 | 0.7741 | 0.6379 | 0.7699 | 48.0191
0.0956 | 4.31 | 10800 | 0.0815 | 0.7726 | 0.6369 | 0.7683 | 48.0191
0.0728 | 4.39 | 11000 | 0.0800 | 0.7738 | 0.6374 | 0.7696 | 48.0191
0.0652 | 4.47 | 11200 | 0.0795 | 0.7747 | 0.6388 | 0.7708 | 48.0191
0.0706 | 4.55 | 11400 | 0.0798 | 0.7742 | 0.6388 | 0.7703 | 48.0191
0.0979 | 4.63 | 11600 | 0.0788 | 0.7748 | 0.6387 | 0.7708 | 48.0191
0.0771 | 4.71 | 11800 | 0.0797 | 0.775 | 0.6402 | 0.771 | 48.0191
0.1067 | 4.79 | 12000 | 0.0779 | 0.7757 | 0.6404 | 0.7717 | 48.0191
0.0773 | 4.87 | 12200 | 0.0783 | 0.7759 | 0.6411 | 0.7721 | 48.0191
0.0866 | 4.95 | 12400 | 0.0780 | 0.7773 | 0.6437 | 0.7734 | 48.0191
0.0611 | 5.03 | 12600 | 0.0785 | 0.7761 | 0.6418 | 0.7723 | 48.0191
0.0685 | 5.11 | 12800 | 0.0781 | 0.777 | 0.6421 | 0.773 | 48.0191
0.0501 | 5.19 | 13000 | 0.0788 | 0.7764 | 0.6411 | 0.7721 | 48.0191
0.0626 | 5.27 | 13200 | 0.0792 | 0.7762 | 0.6416 | 0.7721 | 48.0191
0.0708 | 5.35 | 13400 | 0.0795 | 0.7761 | 0.6408 | 0.772 | 48.0191
0.055 | 5.42 | 13600 | 0.0779 | 0.7773 | 0.642 | 0.7733 | 48.0191
0.0749 | 5.5 | 13800 | 0.0789 | 0.7783 | 0.6431 | 0.7742 | 48.0191
0.0771 | 5.58 | 14000 | 0.0779 | 0.778 | 0.6437 | 0.774 | 48.0191
0.0906 | 5.66 | 14200 | 0.0779 | 0.7781 | 0.6431 | 0.7742 | 48.0191
0.0679 | 5.74 | 14400 | 0.0778 | 0.7783 | 0.6449 | 0.7745 | 48.0191
0.0605 | 5.82 | 14600 | 0.0786 | 0.7778 | 0.6439 | 0.7738 | 48.0191
0.0647 | 5.9 | 14800 | 0.0781 | 0.7785 | 0.6445 | 0.7743 | 48.0191
0.058 | 5.98 | 15000 | 0.0775 | 0.7792 | 0.6448 | 0.7749 | 48.0191
0.0574 | 6.06 | 15200 | 0.0788 | 0.7793 | 0.6451 | 0.7752 | 48.0191
0.0545 | 6.14 | 15400 | 0.0778 | 0.7802 | 0.6464 | 0.7759 | 48.0191
0.079 | 6.22 | 15600 | 0.0781 | 0.7801 | 0.6466 | 0.7759 | 48.0191
0.0474 | 6.3 | 15800 | 0.0782 | 0.7809 | 0.6477 | 0.7768 | 48.0191
0.0517 | 6.38 | 16000 | 0.0788 | 0.7809 | 0.6481 | 0.7769 | 48.0191
0.0613 | 6.46 | 16200 | 0.0782 | 0.7814 | 0.6481 | 0.7773 | 48.0191
0.0517 | 6.54 | 16400 | 0.0785 | 0.7807 | 0.6468 | 0.7767 | 48.0191
0.0549 | 6.62 | 16600 | 0.0778 | 0.7817 | 0.6485 | 0.7777 | 48.0191
0.0727 | 6.7 | 16800 | 0.0774 | 0.7824 | 0.6493 | 0.7785 | 48.0191
0.0768 | 6.78 | 17000 | 0.0784 | 0.7826 | 0.6495 | 0.7785 | 48.0191
0.0612 | 6.86 | 17200 | 0.0772 | 0.7818 | 0.6485 | 0.7779 | 48.0191
0.0735 | 6.94 | 17400 | 0.0778 | 0.7817 | 0.6484 | 0.7777 | 48.0191
0.0662 | 7.02 | 17600 | 0.0780 | 0.7819 | 0.6483 | 0.7778 | 48.0191
0.0769 | 7.1 | 17800 | 0.0777 | 0.7823 | 0.6488 | 0.7784 | 48.0191
0.0649 | 7.18 | 18000 | 0.0775 | 0.7818 | 0.6482 | 0.7778 | 48.0191
0.0749 | 7.26 | 18200 | 0.0774 | 0.7822 | 0.6486 | 0.7781 | 48.0191
0.0568 | 7.34 | 18400 | 0.0772 | 0.7825 | 0.6488 | 0.7784 | 48.0191
0.0751 | 7.42 | 18600 | 0.0774 | 0.7822 | 0.6486 | 0.7783 | 48.0191
0.0564 | 7.5 | 18800 | 0.0773 | 0.7823 | 0.6487 | 0.7782 | 48.0191
0.0593 | 7.58 | 19000 | 0.0767 | 0.7826 | 0.6492 | 0.7786 | 48.0191
0.0563 | 7.66 | 19200 | 0.0773 | 0.7826 | 0.6497 | 0.7786 | 48.0191
0.0686 | 7.74 | 19400 | 0.0771 | 0.7828 | 0.6494 | 0.7789 | 48.0191
0.0728 | 7.82 | 19600 | 0.0772 | 0.7823 | 0.6494 | 0.7784 | 48.0191
0.06 | 7.9 | 19800 | 0.0772 | 0.7826 | 0.6491 | 0.7786 | 48.0191
0.0557 | 7.98 | 20000 | 0.0771 | 0.7828 | 0.6494 | 0.7787 | 48.0191
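
The headline metrics at the top of this card match the final row (step 20000), not necessarily the best checkpoint. As a small sketch of how the log can be inspected, the snippet below parses the last six rows of the table (copied verbatim) and picks the checkpoints with the lowest validation loss and the highest RougeL. The column names and the `parse_rows` helper are illustrative, not part of any library.

```python
COLUMNS = ["train_loss", "epoch", "step", "val_loss",
           "rouge1", "rouge2", "rougeL", "gen_len"]

# Last six rows of the training results table.
LOG = """\
0.0593 7.58 19000 0.0767 0.7826 0.6492 0.7786 48.0191
0.0563 7.66 19200 0.0773 0.7826 0.6497 0.7786 48.0191
0.0686 7.74 19400 0.0771 0.7828 0.6494 0.7789 48.0191
0.0728 7.82 19600 0.0772 0.7823 0.6494 0.7784 48.0191
0.06 7.9 19800 0.0772 0.7826 0.6491 0.7786 48.0191
0.0557 7.98 20000 0.0771 0.7828 0.6494 0.7787 48.0191
"""


def parse_rows(text):
    """Parse whitespace- or pipe-separated numeric rows into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.replace("|", " ").split()
        if len(fields) != len(COLUMNS):
            continue  # skips blank lines and the multi-word header
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # skips any non-numeric line
        row = dict(zip(COLUMNS, values))
        row["step"] = int(row["step"])
        rows.append(row)
    return rows


rows = parse_rows(LOG)
best_by_loss = min(rows, key=lambda row: row["val_loss"])
best_by_rouge_l = max(rows, key=lambda row: row["rougeL"])

# Rough steps-per-epoch estimate from the final row.
steps_per_epoch = rows[-1]["step"] / rows[-1]["epoch"]
```

In this excerpt the lowest validation loss (0.0767) occurs at step 19000 and the highest RougeL (0.7789) at step 19400, while the differences from the final checkpoint are within a few ten-thousandths.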

Framework versions

  • Transformers 4.36.2
  • PyTorch 2.1.1
  • Datasets 2.16.1
  • Tokenizers 0.15.0

Model size

  • Parameters: 582M
  • Tensor type: F32 (Safetensors)

Model tree for eglkan1/mt5-lithuanian-simplifier-full

  • Base model: google/mt5-base
  • This model: fine-tuned from the base model