lilferrit/ft-wmt14

@@ -17,9 +17,9 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.7469
-- Bleu: 23.6596
-- Gen Len: 27.526
 ## Model description
@@ -44,7 +44,7 @@ The following hyperparameters were used during training:
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 16
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - training_steps: 100000
@@ -52,16 +52,16 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch  | Step   | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
-| 1.7738        | 0.2778 | 10000  | 1.9146          | 20.1598 | 28.1563 |
-| 1.6498        | 0.5556 | 20000  | 1.8550          | 21.4167 | 27.853  |
-| 1.5903        | 0.8333 | 30000  | 1.8277          | 22.604  | 27.7613 |
-| 1.5151        | 1.1111 | 40000  | 1.8128          | 22.1273 | 27.3187 |
-| 1.4866        | 1.3889 | 50000  | 1.7999          | 22.8295 | 27.419  |
-| 1.4696        | 1.6667 | 60000  | 1.7810          | 22.9923 | 27.7387 |
-| 1.4508        | 1.9444 | 70000  | 1.7654          | 23.1046 | 27.7057 |
-| 1.4053        | 2.2222 | 80000  | 1.7587          | 23.5079 | 27.643  |
-| 1.3956        | 2.5    | 90000  | 1.7525          | 23.3848 | 27.6637 |
-| 1.3903        | 2.7778 | 100000 | 1.7469          | 23.6596 | 27.526  |
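The Epoch and Step columns above make it possible to estimate how many optimizer steps make up one epoch. The following is a minimal sketch of that arithmetic. It assumes the Epoch column is simply step divided by steps per epoch, rounded to four decimals, and that `total_train_batch_size` counts examples per optimizer step; neither assumption is stated in the card.

```python
# Values copied from the first row of the training log table.
step, epoch = 10_000, 0.2778
total_train_batch_size = 16

# The logged epoch is rounded, so round the estimate to the nearest thousand.
steps_per_epoch = int(round(step / epoch, -3))

# Rough number of training examples seen per epoch under the assumptions above.
examples_per_epoch = steps_per_epoch * total_train_batch_size

# Sanity check against the last row of the table (step 100000, epoch 2.7778).
final_epoch = round(100_000 / steps_per_epoch, 4)

print(steps_per_epoch, examples_per_epoch, final_epoch)
```

Under these assumptions, the 100000 training steps cover a little under three passes over the training data.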
 ### Framework versions

 This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on an unknown dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.7607
+- Bleu: 23.421
+- Gen Len: 27.6243
 ## Model description
 - seed: 42
 - gradient_accumulation_steps: 2
 - total_train_batch_size: 16
+- optimizer: Adafactor
 - lr_scheduler_type: linear
 - training_steps: 100000
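The hyperparameters listed here could be expressed as arguments to the Hugging Face `Seq2SeqTrainingArguments` class, roughly as sketched below. This is not the author's training script. The `output_dir` value is hypothetical, and `per_device_train_batch_size=8` is an assumption chosen so that, on a single device with 2 accumulation steps, the total batch size comes to 16.

```python
# Hyperparameters from the card, plus two labeled assumptions.
training_kwargs = {
    "output_dir": "ft-wmt14-adafactor",   # hypothetical path, not from the source
    "per_device_train_batch_size": 8,     # assumed: 8 * 2 accumulation steps = 16
    "gradient_accumulation_steps": 2,
    "max_steps": 100_000,
    "lr_scheduler_type": "linear",
    "seed": 42,
    "optim": "adafactor",                 # the optimizer named in this revision
}

def build_training_args():
    """Build the arguments object; requires the transformers package."""
    from transformers import Seq2SeqTrainingArguments
    return Seq2SeqTrainingArguments(**training_kwargs)

# Effective batch size per optimizer step on one device.
effective_batch = (
    training_kwargs["per_device_train_batch_size"]
    * training_kwargs["gradient_accumulation_steps"]
)
print(effective_batch)
```

The import is kept inside `build_training_args` so the dictionary can be inspected without `transformers` installed.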
 | Training Loss | Epoch  | Step   | Validation Loss | Bleu    | Gen Len |
 |:-------------:|:------:|:------:|:---------------:|:-------:|:-------:|
+| 1.7882        | 0.2778 | 10000  | 1.9278          | 19.7853 | 28.4147 |
+| 1.6619        | 0.5556 | 20000  | 1.8710          | 21.3803 | 27.667  |
+| 1.6007        | 0.8333 | 30000  | 1.8397          | 22.2715 | 27.317  |
+| 1.5269        | 1.1111 | 40000  | 1.8205          | 21.9329 | 27.704  |
+| 1.498         | 1.3889 | 50000  | 1.8134          | 22.4836 | 27.63   |
+| 1.4801        | 1.6667 | 60000  | 1.7941          | 22.727  | 27.582  |
+| 1.462         | 1.9444 | 70000  | 1.7766          | 23.0372 | 27.5903 |
+| 1.4182        | 2.2222 | 80000  | 1.7724          | 23.6231 | 27.4233 |
+| 1.4079        | 2.5    | 90000  | 1.7663          | 23.2604 | 27.7623 |
+| 1.4037        | 2.7778 | 100000 | 1.7607          | 23.421  | 27.6243 |
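To compare this run with the previous Adam run, the two result tables in this diff can be put side by side. The sketch below copies the step, validation loss, and BLEU values from both tables and computes the difference at the final step and the best-BLEU checkpoint of each run. The variable and function names are illustrative only.

```python
# (step, validation loss, BLEU) rows copied from the two tables in this diff.
adam = [
    (10000, 1.9146, 20.1598), (20000, 1.8550, 21.4167), (30000, 1.8277, 22.604),
    (40000, 1.8128, 22.1273), (50000, 1.7999, 22.8295), (60000, 1.7810, 22.9923),
    (70000, 1.7654, 23.1046), (80000, 1.7587, 23.5079), (90000, 1.7525, 23.3848),
    (100000, 1.7469, 23.6596),
]
adafactor = [
    (10000, 1.9278, 19.7853), (20000, 1.8710, 21.3803), (30000, 1.8397, 22.2715),
    (40000, 1.8205, 21.9329), (50000, 1.8134, 22.4836), (60000, 1.7941, 22.727),
    (70000, 1.7766, 23.0372), (80000, 1.7724, 23.6231), (90000, 1.7663, 23.2604),
    (100000, 1.7607, 23.421),
]

def best_bleu(rows):
    """Return the row with the highest BLEU score."""
    return max(rows, key=lambda row: row[2])

# Differences at the final checkpoint (Adafactor minus Adam).
final_loss_delta = round(adafactor[-1][1] - adam[-1][1], 4)
final_bleu_delta = round(adafactor[-1][2] - adam[-1][2], 4)

print(final_loss_delta, final_bleu_delta)
print(best_bleu(adam)[0], best_bleu(adafactor)[0])
```

On these numbers the Adafactor run ends with slightly higher validation loss and slightly lower BLEU than the Adam run, and its highest BLEU occurs at step 80000 rather than at the final step.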
 ### Framework versions

model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:38d7e52f2b62e40978f99e6e678503e2e57871d6f06102e2cbb84f57ab58ebd7
 size 241984552

 version https://git-lfs.github.com/spec/v1
+oid sha256:126657589c9bed87d7d908daa53833a4e1cdc2e808eaee0c62189876a1169c78
 size 241984552
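The two Git LFS pointers above differ only in their `oid` line. A minimal sketch of checking this programmatically follows; `parse_lfs_pointer` is a hypothetical helper written for this example, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.strip().partition(" ")
        fields[key] = value
    return fields

old = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:38d7e52f2b62e40978f99e6e678503e2e57871d6f06102e2cbb84f57ab58ebd7
size 241984552
""")
new = parse_lfs_pointer("""
version https://git-lfs.github.com/spec/v1
oid sha256:126657589c9bed87d7d908daa53833a4e1cdc2e808eaee0c62189876a1169c78
size 241984552
""")

weights_changed = old["oid"] != new["oid"]   # content hash differs
same_size = old["size"] == new["size"]       # byte size is identical

print(weights_changed, same_size)
```

A changed hash with an unchanged size is consistent with retraining the same architecture, since the stored tensors keep their shapes while their values change.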