Training complete

Files changed (4)

README.md +19 -17
adapter_config.json +2 -2
adapter_model.bin +1 -1
training_args.bin +1 -1

README.md CHANGED

@@ -17,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.7738
-- Bleu: 67.4647
-- Gen Len: 75.9455
 ## Model description
@@ -39,27 +39,29 @@ More information needed
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
-- train_batch_size: 16
-- eval_batch_size: 16
 - seed: 42
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 10
 ### Training results
-| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len |
-|:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|
-| No log        | 1.0   | 67   | 2.6896          | 29.0389 | 96.5455 |
-| No log        | 2.0   | 134  | 1.6534          | 30.4693 | 96.6727 |
-| No log        | 3.0   | 201  | 1.2046          | 55.0467 | 76.7455 |
-| No log        | 4.0   | 268  | 1.0048          | 59.5519 | 76.9091 |
-| No log        | 5.0   | 335  | 0.9176          | 64.2229 | 75.5455 |
-| No log        | 6.0   | 402  | 0.8610          | 65.8311 | 73.6909 |
-| No log        | 7.0   | 469  | 0.8160          | 65.5771 | 76.4727 |
-| 1.5731        | 8.0   | 536  | 0.7968          | 67.9558 | 74.7636 |
-| 1.5731        | 9.0   | 603  | 0.7794          | 67.5994 | 75.8    |
-| 1.5731        | 10.0  | 670  | 0.7738          | 67.4647 | 75.9455 |
 ### Framework versions

 This model is a fine-tuned version of [facebook/nllb-200-distilled-600M](https://huggingface.co/facebook/nllb-200-distilled-600M) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1142
+- Bleu: 58.3679
+- Gen Len: 74.4727
 ## Model description
 The following hyperparameters were used during training:
 - learning_rate: 0.0001
+- train_batch_size: 8
+- eval_batch_size: 8
 - seed: 42
+- gradient_accumulation_steps: 4
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 10
 ### Training results
+| Training Loss | Epoch | Step | Validation Loss | Bleu    | Gen Len  |
+|:-------------:|:-----:|:----:|:---------------:|:-------:|:--------:|
+| No log        | 0.99  | 33   | 3.5500          | 28.0752 | 98.4545  |
+| No log        | 2.0   | 67   | 2.6889          | 28.4762 | 97.7273  |
+| No log        | 2.99  | 100  | 2.1016          | 13.9425 | 131.9636 |
+| No log        | 4.0   | 134  | 1.6955          | 20.9551 | 114.3091 |
+| No log        | 4.99  | 167  | 1.4578          | 44.5358 | 83.4     |
+| No log        | 6.0   | 201  | 1.2986          | 53.9615 | 75.0545  |
+| No log        | 6.99  | 234  | 1.2113          | 56.6086 | 77.4182  |
+| No log        | 8.0   | 268  | 1.1550          | 57.2346 | 73.8364  |
+| No log        | 8.99  | 301  | 1.1222          | 58.1529 | 74.2     |
+| No log        | 9.85  | 330  | 1.1142          | 58.3679 | 74.4727  |
 ### Framework versions
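
The new run halves the per-device batch size (16 to 8) and adds `gradient_accumulation_steps: 4`, which gives the reported `total_train_batch_size: 32` and roughly halves the number of optimizer steps (670 to 330). A minimal sketch of that arithmetic follows. The dataset size of 1072 examples is a hypothetical value chosen only because it is consistent with 67 steps per epoch at batch size 16; it does not appear in the diff. The floor division mirrors how the step count appears to be derived here, and single-device training is assumed.

```python
import math


def effective_batch_size(per_device_batch, grad_accum_steps, num_devices=1):
    """Examples consumed per optimizer update."""
    return per_device_batch * grad_accum_steps * num_devices


def total_optimizer_steps(num_examples, per_device_batch, grad_accum_steps, num_epochs):
    """Optimizer updates over a whole run (single device assumed)."""
    # Number of mini-batches the dataloader yields per epoch.
    batches_per_epoch = math.ceil(num_examples / per_device_batch)
    # One update per `grad_accum_steps` mini-batches; leftovers are floored.
    updates_per_epoch = max(batches_per_epoch // grad_accum_steps, 1)
    return updates_per_epoch * num_epochs


# Hypothetical dataset size, NOT taken from the diff.
NUM_EXAMPLES = 1072

old_steps = total_optimizer_steps(NUM_EXAMPLES, 16, 1, 10)
new_steps = total_optimizer_steps(NUM_EXAMPLES, 8, 4, 10)
print(effective_batch_size(8, 4), old_steps, new_steps)
```

With the same learning rate and fewer updates, a higher final validation loss (1.1142 versus 0.7738) is at least plausible, though the diff alone does not establish the cause.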

adapter_config.json CHANGED

@@ -20,8 +20,8 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "v_proj",
-    "q_proj"
   ],
   "task_type": "SEQ_2_SEQ_LM",
   "use_dora": false,

   "rank_pattern": {},
   "revision": null,
   "target_modules": [
+    "q_proj",
+    "v_proj"
   ],
   "task_type": "SEQ_2_SEQ_LM",
   "use_dora": false,
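
The `target_modules` hunk only swaps the order of `q_proj` and `v_proj`. Assuming the adapter library matches module names without regard to list order (which is the usual behavior, but is an assumption here), the change should have no effect on which layers receive adapters. A small sketch of an order-insensitive comparison, with the helper name `same_target_modules` being hypothetical:

```python
def same_target_modules(old, new):
    """Return True when both lists name the same modules, ignoring order."""
    return set(old) == set(new)


# Values copied from the two sides of the diff.
old_target_modules = ["v_proj", "q_proj"]
new_target_modules = ["q_proj", "v_proj"]

print(same_target_modules(old_target_modules, new_target_modules))
```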

adapter_model.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d95e076dd55aa2db5b8d4f5e24176646443ea33ff2c8cbff3ca1d8b09db968d2
 size 9490378

 version https://git-lfs.github.com/spec/v1
+oid sha256:57fe5c5a9627945acff7a4c3b2533a16c27cb0237c2a7359e9fb3624bba63c68
 size 9490378

training_args.bin CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:9568172fb4f6db188aab6fa3d5fb9b6a67ac546b3ef822e05a6a5d92fcc3da1c
 size 4664

 version https://git-lfs.github.com/spec/v1
+oid sha256:1a706b21a1a550b35711b310fd213517ab949b1fbc24ae6e053bf6cb0d2a55fb
 size 4664
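
Both binary files are stored as Git LFS pointers: the `size` stays the same while the `oid` changes, meaning the contents differ even though the byte count does not. A minimal sketch of reading such a pointer and comparing two versions is shown below; the function names are hypothetical, and the pointer text is copied from the `adapter_model.bin` hunk.

```python
def parse_lfs_pointer(text):
    """Parse 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields


def content_changed(old_pointer, new_pointer):
    """True when the stored object differs (the sha256 oid changed)."""
    return parse_lfs_pointer(old_pointer)["oid"] != parse_lfs_pointer(new_pointer)["oid"]


old_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:d95e076dd55aa2db5b8d4f5e24176646443ea33ff2c8cbff3ca1d8b09db968d2
size 9490378
"""
new_pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:57fe5c5a9627945acff7a4c3b2533a16c27cb0237c2a7359e9fb3624bba63c68
size 9490378
"""

old_fields = parse_lfs_pointer(old_pointer)
new_fields = parse_lfs_pointer(new_pointer)
print(content_changed(old_pointer, new_pointer), old_fields["size"] == new_fields["size"])
```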