
t5-mt-en-ca

This model is a fine-tuned version of t5-small on the opus_books dataset. It achieves the following results on the evaluation set:

  • Loss: 2.2444
  • Bleu: 1.9924
  • Gen Len: 17.2964
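
A minimal usage sketch (the `translate English to Catalan: ` task prefix is an assumption following the standard T5 translation fine-tuning recipe; the card does not state which prefix, if any, was used during training):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "judithrosell/t5-mt-en-ca"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

# Assumed task prefix, as in the standard T5 translation recipe.
text = "translate English to Catalan: The house is wonderful."
inputs = tokenizer(text, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```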

Model description

t5-mt-en-ca is a t5-small checkpoint (a ~60M-parameter encoder-decoder Transformer) fine-tuned for English-to-Catalan machine translation on book translations from the opus_books corpus.

Intended uses & limitations

The model is intended for translating English text into Catalan, particularly literary text similar to the opus_books training data. Note that the final evaluation BLEU of roughly 2.0 is very low by machine-translation standards, so the model is best treated as a fine-tuning demonstration rather than a production-quality translator.

Training and evaluation data

The model was fine-tuned and evaluated on English-Catalan sentence pairs from the opus_books dataset, a parallel corpus of copyright-free books; evaluation used a held-out split of the same corpus, as in the sketch below.
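
A sketch of how the data might be loaded and split (assumptions: opus_books names language pairs alphabetically, so English-Catalan lives under the `ca-en` configuration, and an 80/20 train/test split, which is consistent with the 231 optimization steps per epoch at batch size 16 in the training log below):

```python
from datasets import load_dataset

# opus_books orders language pairs alphabetically, so English-Catalan is "ca-en".
books = load_dataset("opus_books", "ca-en")

# Assumed 80/20 split with the training seed; the card does not document the split.
books = books["train"].train_test_split(test_size=0.2, seed=42)
print(books["train"][0])  # {'id': ..., 'translation': {'ca': ..., 'en': ...}}
```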

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
  • mixed_precision_training: Native AMP
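
These values map directly onto `Seq2SeqTrainingArguments` in the standard Transformers `Seq2SeqTrainer` recipe; a sketch of that mapping (the `output_dir`, evaluation strategy, and `predict_with_generate` flag are assumptions, since the card lists only the values above):

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="t5-mt-en-ca",        # assumed
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=100,
    fp16=True,                       # Native AMP mixed-precision training
    evaluation_strategy="epoch",     # assumed; matches the per-epoch log below
    predict_with_generate=True,      # assumed; needed to compute Bleu / Gen Len
)
# Adam with betas=(0.9, 0.999) and epsilon=1e-08 is the Trainer's default optimizer.
```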

Training results

Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len
------------- | ----- | ----- | --------------- | ------ | -------
No log        | 1.0   | 231   | 3.9148          | 0.1683 | 17.2649
No log        | 2.0   | 462   | 3.6731          | 0.1568 | 17.6819
4.1865        | 3.0   | 693   | 3.5163          | 0.2006 | 17.7144
4.1865        | 4.0   | 924   | 3.3951          | 0.2983 | 17.5233
3.7413        | 5.0   | 1155  | 3.2961          | 0.3487 | 17.4517
3.7413        | 6.0   | 1386  | 3.2153          | 0.3698 | 17.4213
3.5136        | 7.0   | 1617  | 3.1464          | 0.4649 | 17.367
3.5136        | 8.0   | 1848  | 3.0885          | 0.528  | 17.3181
3.3438        | 9.0   | 2079  | 3.0353          | 0.5732 | 17.2638
3.3438        | 10.0  | 2310  | 2.9903          | 0.6168 | 17.24
3.226         | 11.0  | 2541  | 2.9470          | 0.6037 | 17.2476
3.226         | 12.0  | 2772  | 2.9100          | 0.6071 | 17.2856
3.1273        | 13.0  | 3003  | 2.8735          | 0.7135 | 17.2562
3.1273        | 14.0  | 3234  | 2.8400          | 0.7844 | 17.291
3.1273        | 15.0  | 3465  | 2.8125          | 0.7642 | 17.2649
3.0446        | 16.0  | 3696  | 2.7848          | 0.7874 | 17.2552
3.0446        | 17.0  | 3927  | 2.7594          | 0.7701 | 17.266
2.9717        | 18.0  | 4158  | 2.7335          | 0.8199 | 17.317
2.9717        | 19.0  | 4389  | 2.7096          | 0.8848 | 17.2812
2.9026        | 20.0  | 4620  | 2.6913          | 0.9185 | 17.2942
2.9026        | 21.0  | 4851  | 2.6728          | 0.9304 | 17.2997
2.8527        | 22.0  | 5082  | 2.6529          | 0.9424 | 17.2758
2.8527        | 23.0  | 5313  | 2.6350          | 0.9681 | 17.2801
2.8026        | 24.0  | 5544  | 2.6209          | 1.065  | 17.2856
2.8026        | 25.0  | 5775  | 2.6031          | 1.0636 | 17.2443
2.7559        | 26.0  | 6006  | 2.5882          | 1.0406 | 17.2476
2.7559        | 27.0  | 6237  | 2.5722          | 1.0967 | 17.241
2.7559        | 28.0  | 6468  | 2.5621          | 1.1424 | 17.2486
2.7094        | 29.0  | 6699  | 2.5472          | 1.1675 | 17.2226
2.7094        | 30.0  | 6930  | 2.5356          | 1.1882 | 17.2454
2.6703        | 31.0  | 7161  | 2.5226          | 1.1994 | 17.2747
2.6703        | 32.0  | 7392  | 2.5116          | 1.2601 | 17.266
2.6343        | 33.0  | 7623  | 2.5017          | 1.2126 | 17.2389
2.6343        | 34.0  | 7854  | 2.4905          | 1.2105 | 17.2432
2.6114        | 35.0  | 8085  | 2.4795          | 1.2356 | 17.2215
2.6114        | 36.0  | 8316  | 2.4713          | 1.2904 | 17.2497
2.5778        | 37.0  | 8547  | 2.4599          | 1.291  | 17.2193
2.5778        | 38.0  | 8778  | 2.4523          | 1.3017 | 17.2313
2.5475        | 39.0  | 9009  | 2.4413          | 1.3076 | 17.2389
2.5475        | 40.0  | 9240  | 2.4350          | 1.3536 | 17.2508
2.5475        | 41.0  | 9471  | 2.4277          | 1.3899 | 17.2182
2.5255        | 42.0  | 9702  | 2.4195          | 1.4112 | 17.2421
2.5255        | 43.0  | 9933  | 2.4117          | 1.4328 | 17.2562
2.4996        | 44.0  | 10164 | 2.4059          | 1.4373 | 17.2226
2.4996        | 45.0  | 10395 | 2.3974          | 1.4887 | 17.2204
2.4748        | 46.0  | 10626 | 2.3909          | 1.4829 | 17.2269
2.4748        | 47.0  | 10857 | 2.3863          | 1.5417 | 17.2682
2.4563        | 48.0  | 11088 | 2.3785          | 1.5502 | 17.2182
2.4563        | 49.0  | 11319 | 2.3717          | 1.609  | 17.2313
2.4363        | 50.0  | 11550 | 2.3661          | 1.576  | 17.2573
2.4363        | 51.0  | 11781 | 2.3628          | 1.61   | 17.2465
2.4182        | 52.0  | 12012 | 2.3568          | 1.6118 | 17.2476
2.4182        | 53.0  | 12243 | 2.3498          | 1.6268 | 17.2389
2.4182        | 54.0  | 12474 | 2.3430          | 1.5769 | 17.2519
2.4           | 55.0  | 12705 | 2.3404          | 1.6465 | 17.2432
2.4           | 56.0  | 12936 | 2.3363          | 1.6708 | 17.2508
2.3825        | 57.0  | 13167 | 2.3322          | 1.6851 | 17.2714
2.3825        | 58.0  | 13398 | 2.3273          | 1.6938 | 17.253
2.3689        | 59.0  | 13629 | 2.3229          | 1.729  | 17.2693
2.3689        | 60.0  | 13860 | 2.3187          | 1.7584 | 17.2519
2.3586        | 61.0  | 14091 | 2.3144          | 1.7604 | 17.2161
2.3586        | 62.0  | 14322 | 2.3101          | 1.7821 | 17.2204
2.3433        | 63.0  | 14553 | 2.3072          | 1.7585 | 17.2356
2.3433        | 64.0  | 14784 | 2.3027          | 1.7544 | 17.2269
2.3294        | 65.0  | 15015 | 2.3009          | 1.8058 | 17.2226
2.3294        | 66.0  | 15246 | 2.2964          | 1.7876 | 17.2182
2.3294        | 67.0  | 15477 | 2.2941          | 1.7765 | 17.2476
2.3129        | 68.0  | 15708 | 2.2898          | 1.747  | 17.2541
2.3129        | 69.0  | 15939 | 2.2878          | 1.7628 | 17.2486
2.3102        | 70.0  | 16170 | 2.2845          | 1.7721 | 17.2345
2.3102        | 71.0  | 16401 | 2.2829          | 1.803  | 17.2334
2.2949        | 72.0  | 16632 | 2.2786          | 1.7698 | 17.2161
2.2949        | 73.0  | 16863 | 2.2754          | 1.786  | 17.2302
2.2895        | 74.0  | 17094 | 2.2746          | 1.7973 | 17.2552
2.2895        | 75.0  | 17325 | 2.2710          | 1.7891 | 17.2747
2.2803        | 76.0  | 17556 | 2.2709          | 1.8304 | 17.2497
2.2803        | 77.0  | 17787 | 2.2682          | 1.822  | 17.2443
2.2697        | 78.0  | 18018 | 2.2653          | 1.819  | 17.2736
2.2697        | 79.0  | 18249 | 2.2634          | 1.8169 | 17.279
2.2697        | 80.0  | 18480 | 2.2619          | 1.8322 | 17.2747
2.2649        | 81.0  | 18711 | 2.2612          | 1.8546 | 17.2541
2.2649        | 82.0  | 18942 | 2.2582          | 1.868  | 17.2986
2.2582        | 83.0  | 19173 | 2.2575          | 1.9165 | 17.2856
2.2582        | 84.0  | 19404 | 2.2563          | 1.9389 | 17.2725
2.2556        | 85.0  | 19635 | 2.2543          | 1.9548 | 17.2834
2.2556        | 86.0  | 19866 | 2.2528          | 1.9543 | 17.2932
2.2516        | 87.0  | 20097 | 2.2512          | 1.9483 | 17.2856
2.2516        | 88.0  | 20328 | 2.2506          | 1.9439 | 17.2942
2.2475        | 89.0  | 20559 | 2.2499          | 1.9672 | 17.2801
2.2475        | 90.0  | 20790 | 2.2490          | 1.9569 | 17.2866
2.2373        | 91.0  | 21021 | 2.2479          | 1.9708 | 17.2671
2.2373        | 92.0  | 21252 | 2.2468          | 1.9655 | 17.2834
2.2373        | 93.0  | 21483 | 2.2461          | 1.9695 | 17.2845
2.2399        | 94.0  | 21714 | 2.2455          | 1.9703 | 17.2888
2.2399        | 95.0  | 21945 | 2.2453          | 1.9728 | 17.2877
2.2381        | 96.0  | 22176 | 2.2453          | 1.9734 | 17.2758
2.2381        | 97.0  | 22407 | 2.2447          | 1.9855 | 17.2921
2.237         | 98.0  | 22638 | 2.2444          | 1.9912 | 17.2975
2.237         | 99.0  | 22869 | 2.2445          | 1.9924 | 17.2964
2.2283        | 100.0 | 23100 | 2.2444          | 1.9924 | 17.2964
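
The Bleu and Gen Len columns come from generation-based evaluation. The exact metric code is not included on the card; a sketch following the standard Transformers translation example, using sacreBLEU via the `evaluate` library:

```python
import numpy as np
import evaluate
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("t5-small")
sacrebleu = evaluate.load("sacrebleu")

def compute_metrics(eval_preds):
    preds, labels = eval_preds
    # Labels use -100 as padding; swap it back before decoding.
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_preds = tokenizer.batch_decode(preds, skip_special_tokens=True)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)
    result = sacrebleu.compute(
        predictions=decoded_preds,
        references=[[label] for label in decoded_labels],
    )
    # Gen Len: mean number of non-padding tokens in the generated ids.
    gen_len = np.mean([np.count_nonzero(p != tokenizer.pad_token_id) for p in preds])
    return {"bleu": result["score"], "gen_len": gen_len}
```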

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0+cu118
  • Datasets 2.12.0
  • Tokenizers 0.13.3