t5-small_6_3-en-hi_en_LinCE

This model was trained from scratch on an unspecified dataset (the auto-generated card records the dataset as None). It achieves the following results on the evaluation set:

  • Loss: 2.2034
  • Bleu: 7.8135
  • Gen Len: 39.5564

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
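The relationship between these values can be checked with a short sketch. It assumes a single training device (which is what 8 × 8 = 64 implies), no warmup steps (the card lists none), and a total of 4700 optimizer steps, taken from the last row of the training results table. The function names are illustrative and do not come from the training script.

```python
# Values copied from the hyperparameter list in this card.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRAD_ACCUM_STEPS = 8
# Assumption: total optimizer steps, read from the final row of the results table.
TOTAL_STEPS = 4700


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Number of examples contributing to each optimizer update."""
    return per_device * accum_steps * num_devices


def linear_lr(step: int,
              base_lr: float = LEARNING_RATE,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = 0) -> float:
    """Linear decay from base_lr to 0, with an optional linear warmup.

    Assumption: warmup_steps defaults to 0 because the card does not list one.
    """
    if warmup_steps > 0 and step < warmup_steps:
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    print("effective batch size:", effective_batch_size(TRAIN_BATCH_SIZE, GRAD_ACCUM_STEPS))
    for step in (0, 2350, 4700):
        print(f"step {step}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate falls to half its initial value at step 2350 and reaches zero at the final step.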

Training results

| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|---|---|---|---|---|---|
| No log | 0.99 | 94 | 3.5424 | 0.9187 | 16.7437 |
| No log | 1.99 | 188 | 3.1434 | 1.2886 | 16.8158 |
| No log | 2.99 | 282 | 2.9494 | 1.4577 | 16.7824 |
| No log | 3.99 | 376 | 2.8233 | 1.4745 | 16.8879 |
| No log | 4.99 | 470 | 2.7300 | 1.7116 | 16.6636 |
| 3.6303 | 5.99 | 564 | 2.6589 | 1.7857 | 16.6302 |
| 3.6303 | 6.99 | 658 | 2.6005 | 1.8572 | 16.4553 |
| 3.6303 | 7.99 | 752 | 2.5456 | 2.139 | 16.3925 |
| 3.6303 | 8.99 | 846 | 2.5023 | 2.3835 | 16.2911 |
| 3.6303 | 9.99 | 940 | 2.4725 | 2.5607 | 16.3271 |
| 2.9087 | 10.99 | 1034 | 2.4272 | 2.6614 | 16.3138 |
| 2.9087 | 11.99 | 1128 | 2.3977 | 2.9623 | 16.3338 |
| 2.9087 | 12.99 | 1222 | 2.3686 | 3.1248 | 16.2443 |
| 2.9087 | 13.99 | 1316 | 2.3438 | 3.3294 | 16.3458 |
| 2.9087 | 14.99 | 1410 | 2.3253 | 3.3885 | 16.3591 |
| 2.6588 | 15.99 | 1504 | 2.3028 | 3.3985 | 16.3124 |
| 2.6588 | 16.99 | 1598 | 2.2839 | 3.3772 | 16.3858 |
| 2.6588 | 17.99 | 1692 | 2.2704 | 3.5804 | 16.3872 |
| 2.6588 | 18.99 | 1786 | 2.2533 | 3.8751 | 16.2697 |
| 2.6588 | 19.99 | 1880 | 2.2378 | 4.0003 | 16.3271 |
| 2.6588 | 20.99 | 1974 | 2.2233 | 4.0271 | 16.3031 |
| 2.5079 | 21.99 | 2068 | 2.2160 | 4.1898 | 16.3057 |
| 2.5079 | 22.99 | 2162 | 2.2010 | 4.1216 | 16.3031 |
| 2.5079 | 23.99 | 2256 | 2.1935 | 4.1311 | 16.2644 |
| 2.5079 | 24.99 | 2350 | 2.1833 | 4.1373 | 16.3138 |
| 2.5079 | 25.99 | 2444 | 2.1725 | 4.3471 | 16.3057 |
| 2.4027 | 26.99 | 2538 | 2.1657 | 4.183 | 16.3298 |
| 2.4027 | 27.99 | 2632 | 2.1611 | 4.2867 | 16.3351 |
| 2.4027 | 28.99 | 2726 | 2.1531 | 4.2689 | 16.2737 |
| 2.4027 | 29.99 | 2820 | 2.1482 | 4.4802 | 16.2644 |
| 2.4027 | 30.99 | 2914 | 2.1443 | 4.469 | 16.231 |
| 2.3251 | 31.99 | 3008 | 2.1375 | 4.5295 | 16.227 |
| 2.3251 | 32.99 | 3102 | 2.1330 | 4.4799 | 16.2243 |
| 2.3251 | 33.99 | 3196 | 2.1307 | 4.7124 | 16.2417 |
| 2.3251 | 34.99 | 3290 | 2.1248 | 4.5954 | 16.3004 |
| 2.3251 | 35.99 | 3384 | 2.1215 | 4.7455 | 16.215 |
| 2.3251 | 36.99 | 3478 | 2.1166 | 4.6233 | 16.2016 |
| 2.2818 | 37.99 | 3572 | 2.1147 | 4.6843 | 16.219 |
| 2.2818 | 38.99 | 3666 | 2.1112 | 4.7068 | 16.2163 |
| 2.2818 | 39.99 | 3760 | 2.1071 | 4.684 | 16.223 |
| 2.2818 | 40.99 | 3854 | 2.1034 | 4.7323 | 16.2523 |
| 2.2818 | 41.99 | 3948 | 2.0998 | 4.6406 | 16.2016 |
| 2.2392 | 42.99 | 4042 | 2.1017 | 4.7609 | 16.1976 |
| 2.2392 | 43.99 | 4136 | 2.1021 | 4.7634 | 16.2069 |
| 2.2392 | 44.99 | 4230 | 2.0994 | 4.7854 | 16.1976 |
| 2.2392 | 45.99 | 4324 | 2.0980 | 4.7562 | 16.2243 |
| 2.2392 | 46.99 | 4418 | 2.0964 | 4.7921 | 16.219 |
| 2.2192 | 47.99 | 4512 | 2.0970 | 4.8029 | 16.2377 |
| 2.2192 | 48.99 | 4606 | 2.0967 | 4.7953 | 16.2176 |
| 2.2192 | 49.99 | 4700 | 2.0968 | 4.819 | 16.2457 |
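
The lowest validation loss and the highest BLEU score in this log do not fall on the same step, so the choice of "best" checkpoint depends on the metric. The sketch below illustrates this on the last five rows of the results above. The `EvalRow` type and the helper names are hypothetical and are not part of the original training code.

```python
from typing import List, NamedTuple


class EvalRow(NamedTuple):
    step: int
    validation_loss: float
    bleu: float


# Last five evaluation rows, copied from the training results above.
ROWS: List[EvalRow] = [
    EvalRow(4324, 2.0980, 4.7562),
    EvalRow(4418, 2.0964, 4.7921),
    EvalRow(4512, 2.0970, 4.8029),
    EvalRow(4606, 2.0967, 4.7953),
    EvalRow(4700, 2.0968, 4.8190),
]


def best_by_loss(rows: List[EvalRow]) -> EvalRow:
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r.validation_loss)


def best_by_bleu(rows: List[EvalRow]) -> EvalRow:
    """Row with the highest BLEU score."""
    return max(rows, key=lambda r: r.bleu)


if __name__ == "__main__":
    print("lowest validation loss:", best_by_loss(ROWS))
    print("highest BLEU:", best_by_bleu(ROWS))
```

On these rows, validation loss is lowest at step 4418 while BLEU peaks at step 4700.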

Framework versions

  • Transformers 4.20.0.dev0
  • PyTorch 1.8.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1