shubhambhawsar committed on
Commit
326f64c
1 Parent(s): fde79f1

End of training

README.md CHANGED
@@ -1,3 +1,91 @@
- ---
- license: unknown
- ---
+ ---
+ license: mit
+ base_model: facebook/m2m100_418M
+ tags:
+ - generated_from_trainer
+ metrics:
+ - bleu
+ model-index:
+ - name: m2m100_418M-finetuned-en-to-hi
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # m2m100_418M-finetuned-en-to-hi
+
+ This model is a fine-tuned version of [facebook/m2m100_418M](https://huggingface.co/facebook/m2m100_418M) on the None dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.0453
+ - Bleu: 17.4993
+ - Gen Len: 6.7284
+
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 2e-05
+ - train_batch_size: 48
+ - eval_batch_size: 48
+ - seed: 42
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: linear
+ - num_epochs: 5
+ - mixed_precision_training: Native AMP
+
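Reviewer note: the hyperparameter list above gives `learning_rate: 2e-05` with `lr_scheduler_type: linear`. The sketch below illustrates what a linear schedule does to the learning rate over a run. It is not the Trainer's own code. `linear_lr`, `TOTAL_STEPS`, and the zero-warmup default are assumptions made for illustration, since the card lists no warmup setting and the exact total step count is not stated (the results table ends at step 15000).

```python
def linear_lr(step: int, total_steps: int,
              base_lr: float = 2e-5, warmup_steps: int = 0) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    if step < warmup_steps:
        # Ramp up from 0 to base_lr during warmup (unused when warmup_steps == 0).
        return base_lr * step / max(1, warmup_steps)
    # Decay linearly from base_lr down to 0 at total_steps.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Hypothetical total: the last logged step in the results table, not a stated value.
TOTAL_STEPS = 15_000

for step in (0, 7_500, 15_000):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the rate starts at the configured 2e-05, is halved midway through training, and reaches zero at the final step.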
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Bleu | Gen Len |
+ |:-------------:|:-----:|:-----:|:---------------:|:-------:|:-------:|
+ | 2.4274 | 0.16 | 500 | 2.1152 | 4.4935 | 6.8813 |
+ | 2.1915 | 0.33 | 1000 | 1.9722 | 5.8486 | 6.9727 |
+ | 2.1187 | 0.49 | 1500 | 1.8575 | 5.5802 | 6.9993 |
+ | 2.0151 | 0.66 | 2000 | 1.7686 | 8.8892 | 6.8233 |
+ | 1.9709 | 0.82 | 2500 | 1.6948 | 8.4082 | 6.8809 |
+ | 1.9376 | 0.99 | 3000 | 1.6341 | 10.0801 | 6.85 |
+ | 1.761 | 1.15 | 3500 | 1.5788 | 8.1916 | 6.8816 |
+ | 1.7269 | 1.32 | 4000 | 1.5380 | 10.2779 | 6.9447 |
+ | 1.7231 | 1.48 | 4500 | 1.4946 | 6.9244 | 6.9402 |
+ | 1.6925 | 1.65 | 5000 | 1.4456 | 13.7246 | 6.9018 |
+ | 1.6658 | 1.81 | 5500 | 1.4146 | 9.1181 | 6.9104 |
+ | 1.6673 | 1.98 | 6000 | 1.3727 | 8.6535 | 6.8682 |
+ | 1.5165 | 2.14 | 6500 | 1.3441 | 14.8146 | 6.9804 |
+ | 1.5111 | 2.31 | 7000 | 1.3101 | 11.192 | 6.92 |
+ | 1.4889 | 2.47 | 7500 | 1.2814 | 11.8364 | 6.9509 |
+ | 1.4903 | 2.64 | 8000 | 1.2510 | 16.8035 | 6.9316 |
+ | 1.4871 | 2.8 | 8500 | 1.2298 | 14.5766 | 6.9053 |
+ | 1.4854 | 2.97 | 9000 | 1.2051 | 14.2822 | 6.8438 |
+ | 1.3719 | 3.13 | 9500 | 1.1758 | 16.1779 | 6.8918 |
+ | 1.3481 | 3.3 | 10000 | 1.1612 | 20.1789 | 6.8138 |
+ | 1.3585 | 3.46 | 10500 | 1.1410 | 15.6937 | 6.8613 |
+ | 1.35 | 3.63 | 11000 | 1.1261 | 20.0808 | 6.832 |
+ | 1.3557 | 3.79 | 11500 | 1.1069 | 19.588 | 6.8242 |
+ | 1.3329 | 3.96 | 12000 | 1.0924 | 19.9913 | 6.796 |
+ | 1.2792 | 4.12 | 12500 | 1.0791 | 18.8275 | 6.7616 |
+ | 1.2568 | 4.29 | 13000 | 1.0701 | 16.7189 | 6.7676 |
+ | 1.2558 | 4.45 | 13500 | 1.0605 | 18.7687 | 6.7464 |
+ | 1.2533 | 4.62 | 14000 | 1.0541 | 19.1818 | 6.7693 |
+ | 1.2559 | 4.78 | 14500 | 1.0475 | 19.0462 | 6.738 |
+ | 1.2513 | 4.95 | 15000 | 1.0453 | 17.4993 | 6.7284 |
+
+
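Reviewer note: the card leaves the dataset unspecified, but the Epoch and Step columns of the table above allow a rough estimate of its size. The sketch below is a back-of-the-envelope calculation, not a reported figure. It assumes one optimizer step per batch of 48 examples (single device, no gradient accumulation), which the card does not confirm, and the two-decimal rounding of the Epoch column limits its precision.

```python
# (epoch, step) pairs copied from the training results table.
checkpoints = [
    (0.99, 3000),
    (1.98, 6000),
    (2.97, 9000),
    (3.96, 12000),
    (4.95, 15000),
]

TRAIN_BATCH_SIZE = 48  # train_batch_size from the hyperparameter list

# The last row has the most significant digits in the Epoch column,
# so it gives the tightest estimate.
last_epoch, last_step = checkpoints[-1]
est_steps_per_epoch = last_step / last_epoch

# Assumption: each step consumes exactly one batch of TRAIN_BATCH_SIZE examples.
est_train_examples = est_steps_per_epoch * TRAIN_BATCH_SIZE

print(f"steps per epoch   ~ {est_steps_per_epoch:.0f}")
print(f"training examples ~ {est_train_examples:.0f}")
```

If the run used several devices or gradient accumulation, the example count would scale up by the corresponding factor.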
+ ### Framework versions
+
+ - Transformers 4.36.2
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.16.1
+ - Tokenizers 0.15.0
generation_config.json ADDED
@@ -0,0 +1,10 @@
+ {
+   "bos_token_id": 0,
+   "decoder_start_token_id": 2,
+   "early_stopping": true,
+   "eos_token_id": 2,
+   "max_length": 200,
+   "num_beams": 5,
+   "pad_token_id": 1,
+   "transformers_version": "4.36.2"
+ }
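Reviewer note: the file added above sets the decoding defaults for this checkpoint (beam search with 5 beams, early stopping, at most 200 tokens). The sketch below uses only the standard library to separate the decoding options from the version metadata. The names `GENERATION_CONFIG` and `generate_kwargs` are illustrative and not part of the commit.

```python
import json

# Contents of generation_config.json as added in this commit.
GENERATION_CONFIG = """
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "max_length": 200,
  "num_beams": 5,
  "pad_token_id": 1,
  "transformers_version": "4.36.2"
}
"""

config = json.loads(GENERATION_CONFIG)

# "transformers_version" records which library wrote the file;
# everything else is a decoding default.
generate_kwargs = {
    key: value for key, value in config.items()
    if key != "transformers_version"
}

for key, value in sorted(generate_kwargs.items()):
    print(f"{key} = {value}")
```

In Transformers these values are normally picked up from `generation_config.json` when the model is loaded, so they rarely need to be passed by hand. For M2M100 checkpoints the target language is selected separately, typically via `forced_bos_token_id=tokenizer.get_lang_id("hi")` in the `generate` call. Whether this fine-tune was evaluated that way is not stated in the card.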
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:bc6b3b56d84d820a140d31c9ce1a661507d53977bce63311ea6fe85582971021
+ oid sha256:f1299ae279d94807bb6e6b74b3bad8946aa6a8c734be76e2236d0d0c153d0113
  size 1935681888
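Reviewer note: the `model.safetensors` change above is a Git LFS pointer, not the weights themselves. Only the `oid` differs, so the new weights have the same byte size as the old ones. The sketch below shows one way to read such a pointer. `parse_lfs_pointer` is a hypothetical helper written for this note, not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse the `key value` lines of a Git LFS pointer file."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    # The oid field has the form "<hash algorithm>:<hex digest>".
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size_bytes": int(fields["size"]),
    }


# New pointer contents from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:f1299ae279d94807bb6e6b74b3bad8946aa6a8c734be76e2236d0d0c153d0113
size 1935681888
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["digest"][:12])
print(f"{info['size_bytes'] / 1e9:.2f} GB")
```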
runs/Jun03_15-43-14_WellsFargo/events.out.tfevents.1717409598.WellsFargo.3750839.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:ede67b3f30f78ff68e8264bc9fd8d561ca8f1c03bc066edf7bc76d7336a84ffe
- size 19045
+ oid sha256:71b13bd2cc2106cddbe4787840be7db0fd28b77f699cd94631b7ad40153c107a
+ size 20980