Model save

Files changed (4)

README.md CHANGED

@@ -3,19 +3,22 @@ license: mit
 base_model: MubarakB/m2m100-lg-to-en-v2
 tags:
 - generated_from_trainer
 model-index:
-- name: m2m100-lg-to-en-v4
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
-# m2m100-lg-to-en-v4
 This model is a fine-tuned version of [MubarakB/m2m100-lg-to-en-v2](https://huggingface.co/MubarakB/m2m100-lg-to-en-v2) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.5477
 ## Model description
@@ -34,9 +37,9 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.0001
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
@@ -45,13 +48,14 @@ The following hyperparameters were used during training:
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss |
-|:-------------:|:-----:|:----:|:---------------:|
-| 2.8978        | 1.0   | 119  | 0.5258          |
-| 0.4898        | 2.0   | 238  | 0.5082          |
-| 0.4377        | 3.0   | 357  | 0.5151          |
-| 0.367         | 4.0   | 476  | 0.5277          |
-| 0.3086        | 5.0   | 595  | 0.5477          |
 ### Framework versions

 base_model: MubarakB/m2m100-lg-to-en-v2
 tags:
 - generated_from_trainer
+metrics:
+- bleu
 model-index:
+- name: m2m100-lg-to-en-v5
   results: []
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 should probably proofread and complete it, then remove this comment. -->
+# m2m100-lg-to-en-v5
 This model is a fine-tuned version of [MubarakB/m2m100-lg-to-en-v2](https://huggingface.co/MubarakB/m2m100-lg-to-en-v2) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 0.5196
+- Bleu: 0.1371
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 2e-05
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Bleu   |
+|:-------------:|:-----:|:----:|:---------------:|:------:|
+| 2.6476        | 1.0   | 237  | 0.7225          | 0.3822 |
+| 0.5312        | 2.0   | 474  | 0.5011          | 0.0973 |
+| 0.4211        | 3.0   | 711  | 0.4924          | 0.3545 |
+| 0.4015        | 4.0   | 948  | 0.4957          | 0.1006 |
+| 0.3431        | 5.0   | 1185 | 0.5096          | 0.1953 |
+| 0.2992        | 6.0   | 1422 | 0.5196          | 0.1371 |
 ### Framework versions
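
The headline metrics in the v5 card (Loss 0.5196, Bleu 0.1371) match the last row of the training results table, not the best one. A minimal sketch for checking which epoch scored best on each metric, assuming the rows are copied by hand from the table above (the tuple layout and function names are illustrative, not part of the Trainer output):

```python
# Rows copied from the v5 "Training results" table in this diff:
# (training_loss, epoch, step, validation_loss, bleu)
ROWS = [
    (2.6476, 1.0, 237, 0.7225, 0.3822),
    (0.5312, 2.0, 474, 0.5011, 0.0973),
    (0.4211, 3.0, 711, 0.4924, 0.3545),
    (0.4015, 4.0, 948, 0.4957, 0.1006),
    (0.3431, 5.0, 1185, 0.5096, 0.1953),
    (0.2992, 6.0, 1422, 0.5196, 0.1371),
]


def best_by_validation_loss(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[3])


def best_by_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda row: row[4])


best_loss_row = best_by_validation_loss(ROWS)
best_bleu_row = best_by_bleu(ROWS)
final_row = ROWS[-1]

print("lowest validation loss:", best_loss_row[3], "at epoch", best_loss_row[1])
print("highest BLEU:", best_bleu_row[4], "at epoch", best_bleu_row[1])
print("reported (final epoch):", final_row[3], final_row[4])
```

On these numbers, validation loss bottoms out at epoch 3 and rises afterwards while training loss keeps falling, which may indicate overfitting in the later epochs. The BLEU column swings widely from epoch to epoch, so it is probably worth checking how the metric was computed before relying on it.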

generation_config.json CHANGED

@@ -1,6 +1,10 @@
 {
   "early_stopping": true,
   "max_length": 200,
   "num_beams": 5,
   "transformers_version": "4.41.1"
 }

 {
+  "bos_token_id": 0,
+  "decoder_start_token_id": 2,
   "early_stopping": true,
+  "eos_token_id": 2,
   "max_length": 200,
   "num_beams": 5,
+  "pad_token_id": 1,
   "transformers_version": "4.41.1"
 }
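
To make the change to generation_config.json easier to read, the two versions can be compared key by key. This is a small sketch using only the standard library; the JSON literals are copied from the diff above, and the helper names `added_keys` and `changed_keys` are hypothetical:

```python
import json

# Old and new file contents, copied from the diff.
OLD = json.loads("""
{
  "early_stopping": true,
  "max_length": 200,
  "num_beams": 5,
  "transformers_version": "4.41.1"
}
""")

NEW = json.loads("""
{
  "bos_token_id": 0,
  "decoder_start_token_id": 2,
  "early_stopping": true,
  "eos_token_id": 2,
  "max_length": 200,
  "num_beams": 5,
  "pad_token_id": 1,
  "transformers_version": "4.41.1"
}
""")


def added_keys(old, new):
    """Keys present in the new config but missing from the old one."""
    return {key: new[key] for key in new if key not in old}


def changed_keys(old, new):
    """Keys present in both configs whose values differ, as (old, new) pairs."""
    return {key: (old[key], new[key]) for key in old if key in new and old[key] != new[key]}


added = added_keys(OLD, NEW)
changed = changed_keys(OLD, NEW)

print("added:", added)
print("changed:", changed)
```

The commit only adds the four special-token ids (`bos_token_id`, `decoder_start_token_id`, `eos_token_id`, `pad_token_id`); the beam search settings and `max_length` are unchanged.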

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:89dc4b1e2ae1b6bb837e39da28b6688cc82c6cc001c67d5f4bea873cb9713df5
 size 1935681888

 version https://git-lfs.github.com/spec/v1
+oid sha256:731a22f81641566cbb87b884398fc58f42d4073215deaa927ccb1522b357da7e
 size 1935681888
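
The model.safetensors entries above are Git LFS pointer files, not the weights themselves. A minimal sketch of parsing the two pointers shown in this diff and comparing them, assuming each line is a key followed by a single space and a value (the function name `parse_lfs_pointer` is hypothetical):

```python
OLD_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:89dc4b1e2ae1b6bb837e39da28b6688cc82c6cc001c67d5f4bea873cb9713df5
size 1935681888
"""

NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:731a22f81641566cbb87b884398fc58f42d4073215deaa927ccb1522b357da7e
size 1935681888
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line of a pointer file into a dict entry."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    fields["size"] = int(fields["size"])  # size is a byte count
    return fields


old = parse_lfs_pointer(OLD_POINTER)
new = parse_lfs_pointer(NEW_POINTER)

print("same size:", old["size"] == new["size"])
print("same content hash:", old["oid"] == new["oid"])
```

The byte size is identical while the SHA-256 changes, which is what one would expect when weights are retrained without changing the architecture or dtype.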

runs/Jun03_04-22-12_049d933e36e0/events.out.tfevents.1717388534.049d933e36e0.34.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:b9bef89cae778de95885057d5c5889abb03f5cfbc0fd81acd407db572aaedfdf
-size 10138

 version https://git-lfs.github.com/spec/v1
+oid sha256:b70741bc5102aadc745e945988de358bbc7a96821b201442463af91299dfeb19
+size 10492