---
license: mit
base_model: facebook/m2m100_418M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_418M-finetuned-en-to-hi
  results: []
---
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->
# m2m100_418M-finetuned-en-to-hi
This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on an unspecified dataset (the Trainer did not record a dataset name).
It achieves the following results on the evaluation set at the final checkpoint (step 15000):
- Loss: 1.0453
- Bleu: 17.4993
- Gen Len: 6.7284
## Model description
More information needed
## Intended uses & limitations
More information needed
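
Until this section is filled in, the following is a minimal inference sketch rather than documented usage. It assumes, from the model name only, that the checkpoint translates English to Hindi, and that it keeps the tokenizer and language codes (`en`, `hi`) of the base M2M100 model. The repository id in `MODEL_ID` and the helper name `translate_en_to_hi` are hypothetical and should be adjusted to wherever the weights are actually hosted.

```python
# Assumed repository id (not confirmed by this card); change it if needed.
MODEL_ID = "shubhambhawsar/m2m100_418M-finetuned-en-to-hi"


def translate_en_to_hi(text, model_id=MODEL_ID, max_new_tokens=64):
    """Translate one English sentence to Hindi with an M2M100 checkpoint."""
    # Imported lazily so that defining this helper does not require
    # transformers and torch to be installed.
    from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

    tokenizer = M2M100Tokenizer.from_pretrained(model_id)
    model = M2M100ForConditionalGeneration.from_pretrained(model_id)

    # M2M100 needs the source language set on the tokenizer and the target
    # language forced as the first generated token.
    tokenizer.src_lang = "en"
    encoded = tokenizer(text, return_tensors="pt")
    generated = model.generate(
        **encoded,
        forced_bos_token_id=tokenizer.get_lang_id("hi"),
        max_new_tokens=max_new_tokens,
    )
    return tokenizer.batch_decode(generated, skip_special_tokens=True)[0]
```

Calling `translate_en_to_hi("How are you?")` downloads the weights on first use. The reported average generation length is under 7 tokens, so the evaluation data appears to consist of short sentences, and quality on longer inputs is untested here.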
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
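
To make the `linear` scheduler concrete, here is a small sketch of how the learning rate would decay from 2e-05 to zero over training. It assumes no warmup, since no warmup setting is listed above, and the total step count is a rough estimate extrapolated from the results table (step 15000 at epoch 4.95), not a recorded value. The function name `linear_lr` is illustrative.

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at `step` under linear warmup followed by linear decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical estimate of the total number of optimizer steps for 5 epochs.
TOTAL_STEPS = 15_150

for step in (0, 5000, 10000, 15000):
    print(step, f"{linear_lr(step, TOTAL_STEPS):.2e}")
```

Under these assumptions the learning rate is already close to zero by the last evaluation steps, which is consistent with the validation loss flattening out near the end of training.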
### Training results
| Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.4274 | 0.16 | 500 | 2.1152 | 4.4935 | 6.8813 |
| 2.1915 | 0.33 | 1000 | 1.9722 | 5.8486 | 6.9727 |
| 2.1187 | 0.49 | 1500 | 1.8575 | 5.5802 | 6.9993 |
| 2.0151 | 0.66 | 2000 | 1.7686 | 8.8892 | 6.8233 |
| 1.9709 | 0.82 | 2500 | 1.6948 | 8.4082 | 6.8809 |
| 1.9376 | 0.99 | 3000 | 1.6341 | 10.0801 | 6.85 |
| 1.761 | 1.15 | 3500 | 1.5788 | 8.1916 | 6.8816 |
| 1.7269 | 1.32 | 4000 | 1.5380 | 10.2779 | 6.9447 |
| 1.7231 | 1.48 | 4500 | 1.4946 | 6.9244 | 6.9402 |
| 1.6925 | 1.65 | 5000 | 1.4456 | 13.7246 | 6.9018 |
| 1.6658 | 1.81 | 5500 | 1.4146 | 9.1181 | 6.9104 |
| 1.6673 | 1.98 | 6000 | 1.3727 | 8.6535 | 6.8682 |
| 1.5165 | 2.14 | 6500 | 1.3441 | 14.8146 | 6.9804 |
| 1.5111 | 2.31 | 7000 | 1.3101 | 11.192 | 6.92 |
| 1.4889 | 2.47 | 7500 | 1.2814 | 11.8364 | 6.9509 |
| 1.4903 | 2.64 | 8000 | 1.2510 | 16.8035 | 6.9316 |
| 1.4871 | 2.8 | 8500 | 1.2298 | 14.5766 | 6.9053 |
| 1.4854 | 2.97 | 9000 | 1.2051 | 14.2822 | 6.8438 |
| 1.3719 | 3.13 | 9500 | 1.1758 | 16.1779 | 6.8918 |
| 1.3481 | 3.3 | 10000 | 1.1612 | 20.1789 | 6.8138 |
| 1.3585 | 3.46 | 10500 | 1.1410 | 15.6937 | 6.8613 |
| 1.35 | 3.63 | 11000 | 1.1261 | 20.0808 | 6.832 |
| 1.3557 | 3.79 | 11500 | 1.1069 | 19.588 | 6.8242 |
| 1.3329 | 3.96 | 12000 | 1.0924 | 19.9913 | 6.796 |
| 1.2792 | 4.12 | 12500 | 1.0791 | 18.8275 | 6.7616 |
| 1.2568 | 4.29 | 13000 | 1.0701 | 16.7189 | 6.7676 |
| 1.2558 | 4.45 | 13500 | 1.0605 | 18.7687 | 6.7464 |
| 1.2533 | 4.62 | 14000 | 1.0541 | 19.1818 | 6.7693 |
| 1.2559 | 4.78 | 14500 | 1.0475 | 19.0462 | 6.738 |
| 1.2513 | 4.95 | 15000 | 1.0453 | 17.4993 | 6.7284 |
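
Validation loss decreases at every evaluation, but BLEU fluctuates and peaks before the final step. The sketch below picks out the best rows by each metric; the `(step, validation_loss, bleu)` tuples are copied from the table above, and the helper name `best_by` is illustrative. Whether intermediate checkpoints were saved is not stated in this card.

```python
# (step, validation_loss, bleu) copied from the training results table.
ROWS = [
    (500, 2.1152, 4.4935), (1000, 1.9722, 5.8486), (1500, 1.8575, 5.5802),
    (2000, 1.7686, 8.8892), (2500, 1.6948, 8.4082), (3000, 1.6341, 10.0801),
    (3500, 1.5788, 8.1916), (4000, 1.5380, 10.2779), (4500, 1.4946, 6.9244),
    (5000, 1.4456, 13.7246), (5500, 1.4146, 9.1181), (6000, 1.3727, 8.6535),
    (6500, 1.3441, 14.8146), (7000, 1.3101, 11.192), (7500, 1.2814, 11.8364),
    (8000, 1.2510, 16.8035), (8500, 1.2298, 14.5766), (9000, 1.2051, 14.2822),
    (9500, 1.1758, 16.1779), (10000, 1.1612, 20.1789), (10500, 1.1410, 15.6937),
    (11000, 1.1261, 20.0808), (11500, 1.1069, 19.588), (12000, 1.0924, 19.9913),
    (12500, 1.0791, 18.8275), (13000, 1.0701, 16.7189), (13500, 1.0605, 18.7687),
    (14000, 1.0541, 19.1818), (14500, 1.0475, 19.0462), (15000, 1.0453, 17.4993),
]


def best_by(rows, column, highest=True):
    """Return the row with the highest (or lowest) value in `column`."""
    pick = max if highest else min
    return pick(rows, key=lambda row: row[column])


best_bleu = best_by(ROWS, column=2, highest=True)
lowest_loss = best_by(ROWS, column=1, highest=False)

print("best BLEU:", best_bleu)
print("lowest validation loss:", lowest_loss)
```

The highest BLEU (20.1789) occurs at step 10000, while the lowest validation loss (1.0453) is at the final step 15000, where BLEU is 17.4993.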
### Framework versions
- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0