---
license: mit
base_model: facebook/m2m100_418M
tags:
- generated_from_trainer
metrics:
- bleu
model-index:
- name: m2m100_418M-finetuned-en-to-hi
  results: []
---

# m2m100_418M-finetuned-en-to-hi

This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) for English-to-Hindi translation; the fine-tuning dataset is not recorded in the training metadata.
It achieves the following results on the evaluation set:
- Loss: 1.0453
- Bleu: 17.4993
- Gen Len: 6.7284

## Model description

M2M100 is a multilingual encoder-decoder (sequence-to-sequence) model trained for many-to-many translation across 100 languages; the 418M variant has roughly 418 million parameters. This checkpoint specializes it for a single direction, English (`en`) to Hindi (`hi`).

## Intended uses & limitations

The model is intended for translating English text into Hindi, using the standard M2M100 generation API with the target language forced via `forced_bos_token_id` (see the sketch below). Because the fine-tuning data is undocumented, domain coverage is unknown; the low average generation length (~6.7 tokens) suggests training on short segments, so quality on long inputs is unverified.
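
A minimal inference sketch, assuming the standard M2M100 API from `transformers`; the repository id below is a placeholder, so substitute the namespace this checkpoint is actually hosted under:

```python
from transformers import M2M100ForConditionalGeneration, M2M100Tokenizer

# Placeholder id: replace with the real <namespace>/m2m100_418M-finetuned-en-to-hi.
model_id = "your-username/m2m100_418M-finetuned-en-to-hi"

tokenizer = M2M100Tokenizer.from_pretrained(model_id)
model = M2M100ForConditionalGeneration.from_pretrained(model_id)

tokenizer.src_lang = "en"  # tell the tokenizer the source language is English
inputs = tokenizer("How are you today?", return_tensors="pt")

# Force the decoder to start with the Hindi language token so the model
# generates Hindi output.
output_ids = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.get_lang_id("hi"),
    max_new_tokens=64,
)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True)[0])
```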

## Training and evaluation data

The parallel data used for fine-tuning and evaluation is not documented. The metrics above were computed on the evaluation split that the Trainer scored every 500 steps (see the results table below).

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a `Seq2SeqTrainingArguments` sketch reproducing them follows the list):
- learning_rate: 2e-05
- train_batch_size: 48
- eval_batch_size: 48
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 5
- mixed_precision_training: Native AMP
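
A hedged sketch of these settings as `Seq2SeqTrainingArguments` for `transformers` 4.36; `output_dir`, the step-based evaluation cadence, and `predict_with_generate` are assumptions inferred from the results table, not recorded values:

```python
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="m2m100_418M-finetuned-en-to-hi",  # assumed output directory
    learning_rate=2e-5,
    per_device_train_batch_size=48,
    per_device_eval_batch_size=48,
    seed=42,
    lr_scheduler_type="linear",   # the Adam betas/epsilon listed above are the defaults
    num_train_epochs=5,
    fp16=True,                    # "Native AMP" mixed precision
    evaluation_strategy="steps",  # assumption: the table logs an eval every 500 steps
    eval_steps=500,
    predict_with_generate=True,   # assumption: required for the Bleu / Gen Len columns
)
```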

### Training results

| Training Loss | Epoch | Step  | Validation Loss | Bleu    | Gen Len |
|:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
| 2.4274        | 0.16  | 500   | 2.1152          | 4.4935  | 6.8813  |
| 2.1915        | 0.33  | 1000  | 1.9722          | 5.8486  | 6.9727  |
| 2.1187        | 0.49  | 1500  | 1.8575          | 5.5802  | 6.9993  |
| 2.0151        | 0.66  | 2000  | 1.7686          | 8.8892  | 6.8233  |
| 1.9709        | 0.82  | 2500  | 1.6948          | 8.4082  | 6.8809  |
| 1.9376        | 0.99  | 3000  | 1.6341          | 10.0801 | 6.85    |
| 1.761         | 1.15  | 3500  | 1.5788          | 8.1916  | 6.8816  |
| 1.7269        | 1.32  | 4000  | 1.5380          | 10.2779 | 6.9447  |
| 1.7231        | 1.48  | 4500  | 1.4946          | 6.9244  | 6.9402  |
| 1.6925        | 1.65  | 5000  | 1.4456          | 13.7246 | 6.9018  |
| 1.6658        | 1.81  | 5500  | 1.4146          | 9.1181  | 6.9104  |
| 1.6673        | 1.98  | 6000  | 1.3727          | 8.6535  | 6.8682  |
| 1.5165        | 2.14  | 6500  | 1.3441          | 14.8146 | 6.9804  |
| 1.5111        | 2.31  | 7000  | 1.3101          | 11.192  | 6.92    |
| 1.4889        | 2.47  | 7500  | 1.2814          | 11.8364 | 6.9509  |
| 1.4903        | 2.64  | 8000  | 1.2510          | 16.8035 | 6.9316  |
| 1.4871        | 2.8   | 8500  | 1.2298          | 14.5766 | 6.9053  |
| 1.4854        | 2.97  | 9000  | 1.2051          | 14.2822 | 6.8438  |
| 1.3719        | 3.13  | 9500  | 1.1758          | 16.1779 | 6.8918  |
| 1.3481        | 3.3   | 10000 | 1.1612          | 20.1789 | 6.8138  |
| 1.3585        | 3.46  | 10500 | 1.1410          | 15.6937 | 6.8613  |
| 1.35          | 3.63  | 11000 | 1.1261          | 20.0808 | 6.832   |
| 1.3557        | 3.79  | 11500 | 1.1069          | 19.588  | 6.8242  |
| 1.3329        | 3.96  | 12000 | 1.0924          | 19.9913 | 6.796   |
| 1.2792        | 4.12  | 12500 | 1.0791          | 18.8275 | 6.7616  |
| 1.2568        | 4.29  | 13000 | 1.0701          | 16.7189 | 6.7676  |
| 1.2558        | 4.45  | 13500 | 1.0605          | 18.7687 | 6.7464  |
| 1.2533        | 4.62  | 14000 | 1.0541          | 19.1818 | 6.7693  |
| 1.2559        | 4.78  | 14500 | 1.0475          | 19.0462 | 6.738   |
| 1.2513        | 4.95  | 15000 | 1.0453          | 17.4993 | 6.7284  |
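
The `Bleu` column appears to be on the usual 0-100 sacrebleu scale. A minimal sketch of how such a score can be computed with the `evaluate` library (the exact metric configuration the Trainer used is an assumption; sacrebleu is the default in the Hugging Face translation examples):

```python
import evaluate

# Assumption: BLEU here is sacrebleu, as in the standard translation fine-tuning scripts.
metric = evaluate.load("sacrebleu")

predictions = ["आज आप कैसे हैं?"]   # decoded model outputs
references = [["आप आज कैसे हैं?"]]  # one list of references per prediction

result = metric.compute(predictions=predictions, references=references)
print(round(result["score"], 4))  # same 0-100 scale as the Bleu column above
```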


### Framework versions

- Transformers 4.36.2
- Pytorch 2.1.2+cu121
- Datasets 2.16.1
- Tokenizers 0.15.0