# t5-mt-en-ca
This model is a fine-tuned version of t5-small for English-to-Catalan machine translation, trained on the opus_books dataset. It achieves the following results on the evaluation set:
- Loss: 2.2444
- Bleu: 1.9924
- Gen Len: 17.2964
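A minimal inference sketch with the transformers library. The checkpoint id below is a placeholder (shown with the t5-small base so the snippet runs as-is; substitute the path or Hub id of this fine-tuned model), and the task prefix is an assumption that should match however the fine-tuning data was preprocessed:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Placeholder: replace with the local path or Hub id of this checkpoint.
checkpoint = "t5-small"

tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSeq2SeqLM.from_pretrained(checkpoint)

# T5 models are prompted with a task prefix; this exact prefix is an
# assumption about how the training pairs were formatted.
text = "translate English to Catalan: The book is on the table."
inputs = tokenizer(text, return_tensors="pt")
outputs = model.generate(**inputs, max_length=40)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With the base t5-small checkpoint the output will not be Catalan; the fine-tuned weights are what supply the en-ca behavior.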
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed
## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 16
- eval_batch_size: 16
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
- mixed_precision_training: Native AMP
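The `linear` scheduler listed above decays the learning rate from 2e-05 to zero over the full run (231 optimizer steps per epoch for 100 epochs, i.e. 23,100 steps). A plain-Python sketch of that schedule, assuming zero warmup steps (no warmup is listed among the hyperparameters):

```python
def linear_lr(step, base_lr=2e-05, total_steps=23_100, warmup_steps=0):
    """Linear decay with optional warmup, mirroring the `linear`
    lr_scheduler_type used by the Hugging Face Trainer."""
    if step < warmup_steps:
        # Ramp up linearly during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to zero at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of training:
print(linear_lr(0))        # 2e-05
print(linear_lr(11_550))   # 1e-05 (halfway)
print(linear_lr(23_100))   # 0.0
```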
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|:---|:---|:---|:---|:---|:---|
No log | 1.0 | 231 | 3.9148 | 0.1683 | 17.2649 |
No log | 2.0 | 462 | 3.6731 | 0.1568 | 17.6819 |
4.1865 | 3.0 | 693 | 3.5163 | 0.2006 | 17.7144 |
4.1865 | 4.0 | 924 | 3.3951 | 0.2983 | 17.5233 |
3.7413 | 5.0 | 1155 | 3.2961 | 0.3487 | 17.4517 |
3.7413 | 6.0 | 1386 | 3.2153 | 0.3698 | 17.4213 |
3.5136 | 7.0 | 1617 | 3.1464 | 0.4649 | 17.367 |
3.5136 | 8.0 | 1848 | 3.0885 | 0.528 | 17.3181 |
3.3438 | 9.0 | 2079 | 3.0353 | 0.5732 | 17.2638 |
3.3438 | 10.0 | 2310 | 2.9903 | 0.6168 | 17.24 |
3.226 | 11.0 | 2541 | 2.9470 | 0.6037 | 17.2476 |
3.226 | 12.0 | 2772 | 2.9100 | 0.6071 | 17.2856 |
3.1273 | 13.0 | 3003 | 2.8735 | 0.7135 | 17.2562 |
3.1273 | 14.0 | 3234 | 2.8400 | 0.7844 | 17.291 |
3.1273 | 15.0 | 3465 | 2.8125 | 0.7642 | 17.2649 |
3.0446 | 16.0 | 3696 | 2.7848 | 0.7874 | 17.2552 |
3.0446 | 17.0 | 3927 | 2.7594 | 0.7701 | 17.266 |
2.9717 | 18.0 | 4158 | 2.7335 | 0.8199 | 17.317 |
2.9717 | 19.0 | 4389 | 2.7096 | 0.8848 | 17.2812 |
2.9026 | 20.0 | 4620 | 2.6913 | 0.9185 | 17.2942 |
2.9026 | 21.0 | 4851 | 2.6728 | 0.9304 | 17.2997 |
2.8527 | 22.0 | 5082 | 2.6529 | 0.9424 | 17.2758 |
2.8527 | 23.0 | 5313 | 2.6350 | 0.9681 | 17.2801 |
2.8026 | 24.0 | 5544 | 2.6209 | 1.065 | 17.2856 |
2.8026 | 25.0 | 5775 | 2.6031 | 1.0636 | 17.2443 |
2.7559 | 26.0 | 6006 | 2.5882 | 1.0406 | 17.2476 |
2.7559 | 27.0 | 6237 | 2.5722 | 1.0967 | 17.241 |
2.7559 | 28.0 | 6468 | 2.5621 | 1.1424 | 17.2486 |
2.7094 | 29.0 | 6699 | 2.5472 | 1.1675 | 17.2226 |
2.7094 | 30.0 | 6930 | 2.5356 | 1.1882 | 17.2454 |
2.6703 | 31.0 | 7161 | 2.5226 | 1.1994 | 17.2747 |
2.6703 | 32.0 | 7392 | 2.5116 | 1.2601 | 17.266 |
2.6343 | 33.0 | 7623 | 2.5017 | 1.2126 | 17.2389 |
2.6343 | 34.0 | 7854 | 2.4905 | 1.2105 | 17.2432 |
2.6114 | 35.0 | 8085 | 2.4795 | 1.2356 | 17.2215 |
2.6114 | 36.0 | 8316 | 2.4713 | 1.2904 | 17.2497 |
2.5778 | 37.0 | 8547 | 2.4599 | 1.291 | 17.2193 |
2.5778 | 38.0 | 8778 | 2.4523 | 1.3017 | 17.2313 |
2.5475 | 39.0 | 9009 | 2.4413 | 1.3076 | 17.2389 |
2.5475 | 40.0 | 9240 | 2.4350 | 1.3536 | 17.2508 |
2.5475 | 41.0 | 9471 | 2.4277 | 1.3899 | 17.2182 |
2.5255 | 42.0 | 9702 | 2.4195 | 1.4112 | 17.2421 |
2.5255 | 43.0 | 9933 | 2.4117 | 1.4328 | 17.2562 |
2.4996 | 44.0 | 10164 | 2.4059 | 1.4373 | 17.2226 |
2.4996 | 45.0 | 10395 | 2.3974 | 1.4887 | 17.2204 |
2.4748 | 46.0 | 10626 | 2.3909 | 1.4829 | 17.2269 |
2.4748 | 47.0 | 10857 | 2.3863 | 1.5417 | 17.2682 |
2.4563 | 48.0 | 11088 | 2.3785 | 1.5502 | 17.2182 |
2.4563 | 49.0 | 11319 | 2.3717 | 1.609 | 17.2313 |
2.4363 | 50.0 | 11550 | 2.3661 | 1.576 | 17.2573 |
2.4363 | 51.0 | 11781 | 2.3628 | 1.61 | 17.2465 |
2.4182 | 52.0 | 12012 | 2.3568 | 1.6118 | 17.2476 |
2.4182 | 53.0 | 12243 | 2.3498 | 1.6268 | 17.2389 |
2.4182 | 54.0 | 12474 | 2.3430 | 1.5769 | 17.2519 |
2.4 | 55.0 | 12705 | 2.3404 | 1.6465 | 17.2432 |
2.4 | 56.0 | 12936 | 2.3363 | 1.6708 | 17.2508 |
2.3825 | 57.0 | 13167 | 2.3322 | 1.6851 | 17.2714 |
2.3825 | 58.0 | 13398 | 2.3273 | 1.6938 | 17.253 |
2.3689 | 59.0 | 13629 | 2.3229 | 1.729 | 17.2693 |
2.3689 | 60.0 | 13860 | 2.3187 | 1.7584 | 17.2519 |
2.3586 | 61.0 | 14091 | 2.3144 | 1.7604 | 17.2161 |
2.3586 | 62.0 | 14322 | 2.3101 | 1.7821 | 17.2204 |
2.3433 | 63.0 | 14553 | 2.3072 | 1.7585 | 17.2356 |
2.3433 | 64.0 | 14784 | 2.3027 | 1.7544 | 17.2269 |
2.3294 | 65.0 | 15015 | 2.3009 | 1.8058 | 17.2226 |
2.3294 | 66.0 | 15246 | 2.2964 | 1.7876 | 17.2182 |
2.3294 | 67.0 | 15477 | 2.2941 | 1.7765 | 17.2476 |
2.3129 | 68.0 | 15708 | 2.2898 | 1.747 | 17.2541 |
2.3129 | 69.0 | 15939 | 2.2878 | 1.7628 | 17.2486 |
2.3102 | 70.0 | 16170 | 2.2845 | 1.7721 | 17.2345 |
2.3102 | 71.0 | 16401 | 2.2829 | 1.803 | 17.2334 |
2.2949 | 72.0 | 16632 | 2.2786 | 1.7698 | 17.2161 |
2.2949 | 73.0 | 16863 | 2.2754 | 1.786 | 17.2302 |
2.2895 | 74.0 | 17094 | 2.2746 | 1.7973 | 17.2552 |
2.2895 | 75.0 | 17325 | 2.2710 | 1.7891 | 17.2747 |
2.2803 | 76.0 | 17556 | 2.2709 | 1.8304 | 17.2497 |
2.2803 | 77.0 | 17787 | 2.2682 | 1.822 | 17.2443 |
2.2697 | 78.0 | 18018 | 2.2653 | 1.819 | 17.2736 |
2.2697 | 79.0 | 18249 | 2.2634 | 1.8169 | 17.279 |
2.2697 | 80.0 | 18480 | 2.2619 | 1.8322 | 17.2747 |
2.2649 | 81.0 | 18711 | 2.2612 | 1.8546 | 17.2541 |
2.2649 | 82.0 | 18942 | 2.2582 | 1.868 | 17.2986 |
2.2582 | 83.0 | 19173 | 2.2575 | 1.9165 | 17.2856 |
2.2582 | 84.0 | 19404 | 2.2563 | 1.9389 | 17.2725 |
2.2556 | 85.0 | 19635 | 2.2543 | 1.9548 | 17.2834 |
2.2556 | 86.0 | 19866 | 2.2528 | 1.9543 | 17.2932 |
2.2516 | 87.0 | 20097 | 2.2512 | 1.9483 | 17.2856 |
2.2516 | 88.0 | 20328 | 2.2506 | 1.9439 | 17.2942 |
2.2475 | 89.0 | 20559 | 2.2499 | 1.9672 | 17.2801 |
2.2475 | 90.0 | 20790 | 2.2490 | 1.9569 | 17.2866 |
2.2373 | 91.0 | 21021 | 2.2479 | 1.9708 | 17.2671 |
2.2373 | 92.0 | 21252 | 2.2468 | 1.9655 | 17.2834 |
2.2373 | 93.0 | 21483 | 2.2461 | 1.9695 | 17.2845 |
2.2399 | 94.0 | 21714 | 2.2455 | 1.9703 | 17.2888 |
2.2399 | 95.0 | 21945 | 2.2453 | 1.9728 | 17.2877 |
2.2381 | 96.0 | 22176 | 2.2453 | 1.9734 | 17.2758 |
2.2381 | 97.0 | 22407 | 2.2447 | 1.9855 | 17.2921 |
2.237 | 98.0 | 22638 | 2.2444 | 1.9912 | 17.2975 |
2.237 | 99.0 | 22869 | 2.2445 | 1.9924 | 17.2964 |
2.2283 | 100.0 | 23100 | 2.2444 | 1.9924 | 17.2964 |
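The step counts in the table imply 231 optimizer steps per epoch; with a train batch size of 16 and no gradient accumulation, that pins the training split to somewhere between 3,681 and 3,696 examples (the exact size is not stated in this card). The arithmetic, as a quick check:

```python
import math

steps_per_epoch = 231
batch_size = 16

# ceil(n / 16) == 231 holds exactly for this range of dataset sizes.
smallest = (steps_per_epoch - 1) * batch_size + 1   # 3681
largest = steps_per_epoch * batch_size              # 3696
assert all(math.ceil(n / batch_size) == steps_per_epoch
           for n in range(smallest, largest + 1))

total_steps = steps_per_epoch * 100  # 100 epochs
print(total_steps)  # 23100, matching the final row of the table
```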
### Framework versions
- Transformers 4.28.1
- Pytorch 2.0.0+cu118
- Datasets 2.12.0
- Tokenizers 0.13.3
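To reproduce this environment, the pinned versions above can be installed with pip; the `+cu118` PyTorch build assumes a CUDA 11.8 setup and uses the official PyTorch wheel index:

```shell
pip install transformers==4.28.1 datasets==2.12.0 tokenizers==0.13.3
pip install torch==2.0.0+cu118 --index-url https://download.pytorch.org/whl/cu118
```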