sayanmandal/t5-small_6_3-en-hi_en_bt

+---
+tags:
+- translation
+- generated_from_trainer
+metrics:
+- bleu
+model-index:
+- name: t5-small_6_3-en-hi_en_bt
+  results: []
+---
+# t5-small_6_3-en-hi_en_bt
+This model was trained from scratch on an unspecified dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.9293
+- Bleu: 8.9676
+- Gen Len: 33.391
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 0.0001
+- train_batch_size: 8
+- eval_batch_size: 8
+- seed: 42
+- gradient_accumulation_steps: 8
+- total_train_batch_size: 64
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: linear
+- num_epochs: 50
+- mixed_precision_training: Native AMP
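The relationship between the batch-size settings and the linear schedule listed above can be checked with a little arithmetic. The following is a minimal sketch, not the training code: it assumes a single device and zero warmup steps (the card lists neither), and the helper names `effective_batch_size` and `linear_lr` are hypothetical. The steps-per-epoch figure is taken from the training results table.

```python
# Values copied from the hyperparameter list; helper names are illustrative only.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 8
NUM_EPOCHS = 50
STEPS_PER_EPOCH = 526  # step count reported at epoch 1.0 in the results table

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def effective_batch_size(per_device, accumulation, num_devices=1):
    """Examples consumed per optimizer step (assumes num_devices=1)."""
    return per_device * accumulation * num_devices


def linear_lr(step, total_steps, base_lr=LEARNING_RATE, warmup_steps=0):
    """Linear decay from base_lr to 0, with optional linear warmup.

    warmup_steps=0 is an assumption; the card does not report a warmup setting.
    """
    if warmup_steps and step < warmup_steps:
        return base_lr * (step / warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * (remaining / max(1, total_steps - warmup_steps))


if __name__ == "__main__":
    print("total_train_batch_size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS))
    print("total optimizer steps:", TOTAL_STEPS)
    for epoch in (0, 25, 50):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:>2}: lr = {linear_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions, 8 examples per step times 8 accumulation steps reproduces the reported `total_train_batch_size` of 64, and 50 epochs of 526 steps reproduces the final step count of 26300.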
+### Training results
+| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
+|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
+| 3.7929        | 1.0   | 526   | 2.6759          | 1.5672 | 16.749  |
+| 3.1151        | 2.0   | 1052  | 2.3843          | 2.2962 | 16.5287 |
+| 2.8701        | 3.0   | 1578  | 2.2287          | 2.8811 | 16.4953 |
+| 2.7121        | 4.0   | 2104  | 2.1302          | 3.3949 | 16.5247 |
+| 2.5844        | 5.0   | 2630  | 2.0593          | 3.8161 | 16.4513 |
+| 2.4917        | 6.0   | 3156  | 2.0063          | 3.9831 | 16.4272 |
+| 2.4067        | 7.0   | 3682  | 1.9733          | 4.0511 | 16.3378 |
+| 2.3395        | 8.0   | 4208  | 1.9399          | 4.3067 | 16.4112 |
+| 2.2713        | 9.0   | 4734  | 1.9148          | 4.3195 | 16.3618 |
+| 2.2217        | 10.0  | 5260  | 1.8961          | 4.3905 | 16.4112 |
+| 2.1659        | 11.0  | 5786  | 1.8787          | 4.4548 | 16.3298 |
+| 2.1267        | 12.0  | 6312  | 1.8651          | 4.5779 | 16.3618 |
+| 2.0793        | 13.0  | 6838  | 1.8540          | 4.4863 | 16.2603 |
+| 2.0473        | 14.0  | 7364  | 1.8444          | 4.556  | 16.3044 |
+| 2.0082        | 15.0  | 7890  | 1.8353          | 4.5957 | 16.3124 |
+| 1.9748        | 16.0  | 8416  | 1.8313          | 4.5593 | 16.3204 |
+| 1.9456        | 17.0  | 8942  | 1.8259          | 4.4522 | 16.2764 |
+| 1.9177        | 18.0  | 9468  | 1.8231          | 4.3345 | 16.3084 |
+| 1.8871        | 19.0  | 9994  | 1.8177          | 4.48   | 16.3458 |
+| 1.8422        | 20.0  | 10520 | 1.8123          | 4.5078 | 16.287  |
+| 1.8161        | 21.0  | 11046 | 1.8106          | 4.3289 | 16.3405 |
+| 1.7972        | 22.0  | 11572 | 1.8106          | 4.5204 | 16.3244 |
+| 1.7785        | 23.0  | 12098 | 1.8117          | 4.4651 | 16.3605 |
+| 1.7563        | 24.0  | 12624 | 1.8125          | 4.3938 | 16.3538 |
+| 1.7444        | 25.0  | 13150 | 1.8089          | 4.5367 | 16.3792 |
+| 1.7256        | 26.0  | 13676 | 1.8075          | 4.4212 | 16.3925 |
+| 1.7021        | 27.0  | 14202 | 1.8080          | 4.5491 | 16.3992 |
+| 1.6969        | 28.0  | 14728 | 1.8061          | 4.6568 | 16.3645 |
+| 1.6766        | 29.0  | 15254 | 1.8063          | 4.6297 | 16.3738 |
+| 1.6653        | 30.0  | 15780 | 1.8095          | 4.6167 | 16.2977 |
+| 1.6543        | 31.0  | 16306 | 1.8085          | 4.5452 | 16.3538 |
+| 1.6413        | 32.0  | 16832 | 1.8112          | 4.6667 | 16.3351 |
+| 1.6293        | 33.0  | 17358 | 1.8126          | 4.6127 | 16.3351 |
+| 1.6204        | 34.0  | 17884 | 1.8115          | 4.7196 | 16.3111 |
+| 1.6082        | 35.0  | 18410 | 1.8134          | 4.7011 | 16.3324 |
+| 1.6048        | 36.0  | 18936 | 1.8122          | 4.6429 | 16.2964 |
+| 1.5911        | 37.0  | 19462 | 1.8143          | 4.6424 | 16.3124 |
+| 1.5834        | 38.0  | 19988 | 1.8131          | 4.6254 | 16.3164 |
+| 1.5742        | 39.0  | 20514 | 1.8154          | 4.6998 | 16.287  |
+| 1.5623        | 40.0  | 21040 | 1.8147          | 4.6469 | 16.3471 |
+| 1.5599        | 41.0  | 21566 | 1.8185          | 4.6654 | 16.3231 |
+| 1.5516        | 42.0  | 22092 | 1.8173          | 4.6961 | 16.3471 |
+| 1.5441        | 43.0  | 22618 | 1.8180          | 4.7176 | 16.3084 |
+| 1.545         | 44.0  | 23144 | 1.8177          | 4.5571 | 16.275  |
+| 1.5418        | 45.0  | 23670 | 1.8195          | 4.5927 | 16.3097 |
+| 1.5329        | 46.0  | 24196 | 1.8187          | 4.7025 | 16.2724 |
+| 1.5348        | 47.0  | 24722 | 1.8198          | 4.6575 | 16.3057 |
+| 1.5362        | 48.0  | 25248 | 1.8197          | 4.6912 | 16.2991 |
+| 1.5231        | 49.0  | 25774 | 1.8202          | 4.6752 | 16.2951 |
+| 1.5314        | 50.0  | 26300 | 1.8208          | 4.6114 | 16.2937 |
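In the table above, the lowest validation loss (1.8061) occurs at epoch 28 and the highest BLEU (4.7196) at epoch 34, while the final epoch is slightly worse on both. The sketch below shows one way such a check could be scripted. It is an illustration rather than part of the original training setup: `parse_rows` is a hypothetical helper, and `TABLE_EXCERPT` contains only four rows copied from the table to keep the example short.

```python
# Four rows copied verbatim from the results table (an excerpt, not the full log).
TABLE_EXCERPT = """\
| Training Loss | Epoch | Step  | Validation Loss | Bleu   | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:------:|:-------:|
| 3.7929        | 1.0   | 526   | 2.6759          | 1.5672 | 16.749  |
| 1.6969        | 28.0  | 14728 | 1.8061          | 4.6568 | 16.3645 |
| 1.6204        | 34.0  | 17884 | 1.8115          | 4.7196 | 16.3111 |
| 1.5314        | 50.0  | 26300 | 1.8208          | 4.6114 | 16.2937 |
"""


def parse_rows(markdown):
    """Parse a Markdown table into a list of {column name: float} dicts."""
    lines = [ln.strip() for ln in markdown.splitlines()
             if ln.strip().startswith("|")]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[2:]:  # skip the header and the alignment row
        cells = [cell.strip() for cell in ln.strip("|").split("|")]
        rows.append({name: float(value) for name, value in zip(header, cells)})
    return rows


rows = parse_rows(TABLE_EXCERPT)
best_by_loss = min(rows, key=lambda r: r["Validation Loss"])
best_by_bleu = max(rows, key=lambda r: r["Bleu"])

if __name__ == "__main__":
    print("lowest validation loss at epoch", int(best_by_loss["Epoch"]))
    print("highest BLEU at epoch", int(best_by_bleu["Epoch"]))
```

Pasting the full table into `TABLE_EXCERPT` would apply the same selection to all 50 epochs.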
+### Framework versions
+- Transformers 4.20.0.dev0
+- PyTorch 1.8.0
+- Datasets 2.1.0
+- Tokenizers 0.12.1