synpre_mix_v3_1M_t5-small

This model is a fine-tuned version of t5-small on the tyzhu/synpre_mix_v3_1M dataset. It achieves the following results on the evaluation set:

  • Loss: 0.1201
  • Bleu: 93.5646
  • Gen Len: 87.5554
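A minimal loading sketch, assuming the checkpoint is published on the Hugging Face Hub as `tyzhu/synpre_mix_v3_1M_t5-small` and loads with the standard T5 sequence-to-sequence classes. The helper name `generate`, the `max_new_tokens=128` cap (chosen to sit above the reported Gen Len of about 87.6), and the input text are illustrative assumptions; the card does not document the expected input format.

```python
MODEL_ID = "tyzhu/synpre_mix_v3_1M_t5-small"  # assumed Hub repository id


def generate(text, max_new_tokens=128):
    """Run one input string through the fine-tuned model and return the decoded output.

    Hypothetical helper: the imports are deferred so that defining the function
    does not require transformers/torch or network access.
    """
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForSeq2SeqLM.from_pretrained(MODEL_ID)

    # Tokenize to PyTorch tensors and decode the first generated sequence.
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("<an input in the dataset's format>")` downloads the weights on first use. Because the model was trained on tyzhu/synpre_mix_v3_1M, inputs that do not follow that dataset's format may produce meaningless output.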

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: inverse_sqrt
  • lr_scheduler_warmup_steps: 10000
  • training_steps: 200000
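The learning-rate schedule implied by these settings can be sketched in plain Python. This assumes `inverse_sqrt` means linear warmup from 0 to the base rate over the warmup steps, followed by decay proportional to 1/sqrt(step) with a timescale equal to the warmup length, which is believed to match the Transformers default but is not confirmed by this card. The function name `lr_at` is hypothetical.

```python
import math

BASE_LR = 1e-4          # learning_rate
WARMUP_STEPS = 10_000   # lr_scheduler_warmup_steps
TOTAL_STEPS = 200_000   # training_steps


def lr_at(step, base_lr=BASE_LR, warmup=WARMUP_STEPS):
    """Learning rate at a given optimizer step (assumed schedule shape)."""
    if step < warmup:
        # Linear warmup from 0 up to base_lr.
        return base_lr * step / max(1, warmup)
    # Inverse square-root decay, equal to base_lr exactly at the end of warmup.
    return base_lr / math.sqrt(step / warmup)


for step in (5_000, 10_000, 40_000, TOTAL_STEPS):
    print(f"step {step:>7}: lr = {lr_at(step):.3e}")
```

Under these assumptions the rate peaks at 1e-4 at step 10000, halves by step 40000, and ends near 2.2e-5 at step 200000.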

Training results

Training Loss  Epoch  Step    Validation Loss  Bleu     Gen Len
8.1288         1.28   5000    8.2984           1.092    154.6861
7.7066         2.56   10000   7.8456           2.2572   120.3898
7.3474         3.84   15000   7.3513           4.6007   86.3161
6.5043         5.12   20000   6.2626           7.884    84.2931
2.0667         6.4    25000   2.5516           51.5595  104.0468
1.0434         7.68   30000   1.2005           79.5292  90.2053
0.7833         8.96   35000   0.8932           86.2323  85.2567
0.5221         10.24  40000   0.5357           83.7473  89.9205
0.4006         11.52  45000   0.4025           86.6179  88.674
0.3183         12.8   50000   0.3176           88.9508  87.5098
0.2735         14.08  55000   0.2720           88.7724  88.3669
0.2452         15.36  60000   0.2520           89.0626  88.4114
0.2142         16.64  65000   0.2355           91.2709  86.769
0.1888         17.92  70000   0.2139           91.2543  87.46
0.1757         19.2   75000   0.2058           91.7017  87.3256
0.1616         20.48  80000   0.2004           91.6796  87.2561
0.1562         21.76  85000   0.1837           92.2346  87.3002
0.1407         23.04  90000   0.1733           92.1041  87.9509
0.1356         24.32  95000   0.1715           93.4019  86.5713
0.1295         25.6   100000  0.1570           93.7442  86.7566
0.127          26.87  105000  0.1649           93.0466  87.1686
0.117          28.15  110000  0.1528           92.9589  87.5743
0.1152         29.43  115000  0.1499           93.7713  86.9094
0.1116         30.71  120000  0.1514           92.8724  87.6156
0.1067         31.99  125000  0.1432           92.7475  87.8559
0.1041         33.27  130000  0.1409           93.6111  87.2048
0.1001         34.55  135000  0.1430           92.8654  87.6548
0.0965         35.83  140000  0.1329           94.0062  87.1653
0.0949         37.11  145000  0.1313           94.3514  86.6624
0.0941         38.39  150000  0.1305           93.802   87.2185
0.0902         39.67  155000  0.1249           94.2611  86.9682
0.0906         40.95  160000  0.1251           94.1046  87.0009
0.0869         42.23  165000  0.1230           93.9196  87.1969
0.0849         43.51  170000  0.1279           94.1902  86.9505
0.0843         44.79  175000  0.1218           93.7524  87.3351
0.0769         46.07  180000  0.1191           93.8624  87.3325
0.078          47.35  185000  0.1139           94.7611  86.7778
0.0774         48.63  190000  0.1237           93.1841  87.7449
0.0786         49.91  195000  0.1135           94.3655  87.0559
0.0736         51.19  200000  0.1201           93.5646  87.5554
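The Epoch column can be related to the Step column with a short calculation. This sketch assumes the training split holds 1,000,000 examples (inferred from the "1M" in the dataset name, not stated in this card) and that each optimizer step consumes one batch of 256 with no gradient accumulation. The names `steps_per_epoch` and `epoch_at` are hypothetical.

```python
import math

NUM_TRAIN_EXAMPLES = 1_000_000  # assumption: inferred from the "1M" in the dataset name
TRAIN_BATCH_SIZE = 256          # train_batch_size from the hyperparameter list


def steps_per_epoch(num_examples=NUM_TRAIN_EXAMPLES, batch_size=TRAIN_BATCH_SIZE):
    """Optimizer steps needed to see every example once (last batch may be partial)."""
    return math.ceil(num_examples / batch_size)


def epoch_at(step):
    """Fractional epoch reached after a given number of optimizer steps."""
    return step / steps_per_epoch()


for step in (5_000, 105_000, 200_000):
    print(f"step {step:>7}: epoch = {epoch_at(step):.2f}")
```

Under these assumptions one epoch is 3907 steps, so the 200000 training steps correspond to roughly 51 passes over the data, in line with the last row of the table.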

Framework versions

  • Transformers 4.34.0
  • Pytorch 2.1.0+cu121
  • Datasets 2.14.5
  • Tokenizers 0.14.1