
t5-small_6_3-en-hi_en_bt

This model was trained from scratch; the training dataset is not specified in this card. It achieves the following results on the evaluation set:

  • Loss: 1.9293
  • Bleu: 8.9676
  • Gen Len: 33.391
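The BLEU figure above is a corpus-level score. As a reference for how such a score is computed, here is a minimal pure-Python sketch of the standard clipped n-gram precision formula with a brevity penalty (uniform weights, single reference, no smoothing). The score reported for this model was presumably produced by a library such as sacreBLEU, whose tokenization and smoothing may differ, so this is an illustration, not the exact metric used:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    # All contiguous n-grams of a token list, with multiplicities.
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def corpus_bleu(hypotheses, references, max_n=4):
    """Simplified corpus BLEU: clipped n-gram precisions (n = 1..max_n),
    geometric mean, brevity penalty. Single reference per hypothesis."""
    match = [0] * max_n   # clipped n-gram matches, per order
    total = [0] * max_n   # hypothesis n-gram counts, per order
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        h, r = hyp.split(), ref.split()
        hyp_len += len(h)
        ref_len += len(r)
        for n in range(1, max_n + 1):
            h_ng, r_ng = ngrams(h, n), ngrams(r, n)
            # Clip each hypothesis n-gram count by its count in the reference.
            match[n - 1] += sum(min(c, r_ng[g]) for g, c in h_ng.items())
            total[n - 1] += max(len(h) - n + 1, 0)
    if min(match) == 0:
        return 0.0  # unsmoothed BLEU is zero if any order has no match
    log_prec = sum(math.log(m / t) for m, t in zip(match, total)) / max_n
    # Brevity penalty: punish hypotheses shorter than the references.
    bp = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * bp * math.exp(log_prec)

print(corpus_bleu(["the cat sat on the mat"], ["the cat sat on the mat"]))  # 100.0
```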

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0001
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 50
  • mixed_precision_training: Native AMP
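The batch-size and step counts above are mutually consistent. A quick pure-Python sketch checks the arithmetic and shows the shape of the linear schedule; the warmup of 0 steps is an assumption, since the card does not state a warmup setting:

```python
# Hyperparameters copied from the card.
learning_rate = 1e-4
train_batch_size = 8
gradient_accumulation_steps = 8
num_epochs = 50

# Effective batch size = per-device batch size x gradient accumulation steps.
total_train_batch_size = train_batch_size * gradient_accumulation_steps  # 64

# The training-results table logs 526 optimizer steps per epoch.
steps_per_epoch = 526
total_steps = steps_per_epoch * num_epochs  # 26300, matching the final row

def linear_lr(step, base_lr=learning_rate, total=total_steps, warmup=0):
    """Linear schedule: ramp up over `warmup` steps, then decay to zero.
    (warmup=0 is an assumption; the card does not specify warmup.)"""
    if step < warmup:
        return base_lr * step / max(1, warmup)
    return base_lr * max(0.0, (total - step) / max(1, total - warmup))

print(total_train_batch_size)  # 64
print(linear_lr(13150))        # halfway through training -> 5e-05
```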

Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 3.7929        | 1.0   | 526   | 2.6759          | 1.5672 | 16.749  |
| 3.1151        | 2.0   | 1052  | 2.3843          | 2.2962 | 16.5287 |
| 2.8701        | 3.0   | 1578  | 2.2287          | 2.8811 | 16.4953 |
| 2.7121        | 4.0   | 2104  | 2.1302          | 3.3949 | 16.5247 |
| 2.5844        | 5.0   | 2630  | 2.0593          | 3.8161 | 16.4513 |
| 2.4917        | 6.0   | 3156  | 2.0063          | 3.9831 | 16.4272 |
| 2.4067        | 7.0   | 3682  | 1.9733          | 4.0511 | 16.3378 |
| 2.3395        | 8.0   | 4208  | 1.9399          | 4.3067 | 16.4112 |
| 2.2713        | 9.0   | 4734  | 1.9148          | 4.3195 | 16.3618 |
| 2.2217        | 10.0  | 5260  | 1.8961          | 4.3905 | 16.4112 |
| 2.1659        | 11.0  | 5786  | 1.8787          | 4.4548 | 16.3298 |
| 2.1267        | 12.0  | 6312  | 1.8651          | 4.5779 | 16.3618 |
| 2.0793        | 13.0  | 6838  | 1.8540          | 4.4863 | 16.2603 |
| 2.0473        | 14.0  | 7364  | 1.8444          | 4.556  | 16.3044 |
| 2.0082        | 15.0  | 7890  | 1.8353          | 4.5957 | 16.3124 |
| 1.9748        | 16.0  | 8416  | 1.8313          | 4.5593 | 16.3204 |
| 1.9456        | 17.0  | 8942  | 1.8259          | 4.4522 | 16.2764 |
| 1.9177        | 18.0  | 9468  | 1.8231          | 4.3345 | 16.3084 |
| 1.8871        | 19.0  | 9994  | 1.8177          | 4.48   | 16.3458 |
| 1.8422        | 20.0  | 10520 | 1.8123          | 4.5078 | 16.287  |
| 1.8161        | 21.0  | 11046 | 1.8106          | 4.3289 | 16.3405 |
| 1.7972        | 22.0  | 11572 | 1.8106          | 4.5204 | 16.3244 |
| 1.7785        | 23.0  | 12098 | 1.8117          | 4.4651 | 16.3605 |
| 1.7563        | 24.0  | 12624 | 1.8125          | 4.3938 | 16.3538 |
| 1.7444        | 25.0  | 13150 | 1.8089          | 4.5367 | 16.3792 |
| 1.7256        | 26.0  | 13676 | 1.8075          | 4.4212 | 16.3925 |
| 1.7021        | 27.0  | 14202 | 1.8080          | 4.5491 | 16.3992 |
| 1.6969        | 28.0  | 14728 | 1.8061          | 4.6568 | 16.3645 |
| 1.6766        | 29.0  | 15254 | 1.8063          | 4.6297 | 16.3738 |
| 1.6653        | 30.0  | 15780 | 1.8095          | 4.6167 | 16.2977 |
| 1.6543        | 31.0  | 16306 | 1.8085          | 4.5452 | 16.3538 |
| 1.6413        | 32.0  | 16832 | 1.8112          | 4.6667 | 16.3351 |
| 1.6293        | 33.0  | 17358 | 1.8126          | 4.6127 | 16.3351 |
| 1.6204        | 34.0  | 17884 | 1.8115          | 4.7196 | 16.3111 |
| 1.6082        | 35.0  | 18410 | 1.8134          | 4.7011 | 16.3324 |
| 1.6048        | 36.0  | 18936 | 1.8122          | 4.6429 | 16.2964 |
| 1.5911        | 37.0  | 19462 | 1.8143          | 4.6424 | 16.3124 |
| 1.5834        | 38.0  | 19988 | 1.8131          | 4.6254 | 16.3164 |
| 1.5742        | 39.0  | 20514 | 1.8154          | 4.6998 | 16.287  |
| 1.5623        | 40.0  | 21040 | 1.8147          | 4.6469 | 16.3471 |
| 1.5599        | 41.0  | 21566 | 1.8185          | 4.6654 | 16.3231 |
| 1.5516        | 42.0  | 22092 | 1.8173          | 4.6961 | 16.3471 |
| 1.5441        | 43.0  | 22618 | 1.8180          | 4.7176 | 16.3084 |
| 1.545         | 44.0  | 23144 | 1.8177          | 4.5571 | 16.275  |
| 1.5418        | 45.0  | 23670 | 1.8195          | 4.5927 | 16.3097 |
| 1.5329        | 46.0  | 24196 | 1.8187          | 4.7025 | 16.2724 |
| 1.5348        | 47.0  | 24722 | 1.8198          | 4.6575 | 16.3057 |
| 1.5362        | 48.0  | 25248 | 1.8197          | 4.6912 | 16.2991 |
| 1.5231        | 49.0  | 25774 | 1.8202          | 4.6752 | 16.2951 |
| 1.5314        | 50.0  | 26300 | 1.8208          | 4.6114 | 16.2937 |
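Validation loss bottoms out around epoch 28 (1.8061) and drifts slowly upward afterwards, while BLEU plateaus. A minimal sketch of picking the best checkpoint by validation loss, using an excerpt of the (epoch, validation loss) pairs from the table above:

```python
# Excerpt of (epoch, validation loss) pairs from the training-results table.
val_loss = {1: 2.6759, 10: 1.8961, 20: 1.8123, 28: 1.8061, 40: 1.8147, 50: 1.8208}

# Best checkpoint = epoch with the lowest validation loss.
best_epoch = min(val_loss, key=val_loss.get)
print(best_epoch, val_loss[best_epoch])  # 28 1.8061
```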

Framework versions

  • Transformers 4.20.0.dev0
  • Pytorch 1.8.0
  • Datasets 2.1.0
  • Tokenizers 0.12.1