oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6

@@ -18,7 +18,7 @@ model-index:
     metrics:
     - name: Rouge1
       type: rouge
-      value: 0.0
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -26,14 +26,14 @@ should probably proofread and complete it, then remove this comment. -->
 # pegasus-newsroom-cnn_full-adafactor-bs6
-This model was trained from scratch on the cnn_dailymail dataset.
 It achieves the following results on the evaluation set:
-- Loss: nan
-- Rouge1: 0.0
-- Rouge2: 0.0
-- Rougel: 0.0
-- Rougelsum: 0.0
-- Gen Len: 1.0
 ## Model description
@@ -52,15 +52,15 @@ More information needed
 ### Training hyperparameters
 The following hyperparameters were used during training:
-- learning_rate: 0.00016
-- train_batch_size: 6
-- eval_batch_size: 6
 - seed: 42
-- gradient_accumulation_steps: 16
-- total_train_batch_size: 96
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
-- lr_scheduler_warmup_steps: 598
 - num_epochs: 1
 - mixed_precision_training: Native AMP
 - label_smoothing_factor: 0.1
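The `total_train_batch_size` entries in this diff are consistent with the per-device batch size multiplied by the gradient accumulation steps. The sketch below reproduces that arithmetic for both sides of the diff. It assumes a single device, which the card does not state but which matches the reported totals; the function name and the `num_devices` parameter are illustrative, not part of the Trainer API.

```python
def total_train_batch_size(per_device_batch_size, gradient_accumulation_steps, num_devices=1):
    """Number of examples contributing to one optimizer step.

    num_devices=1 is an assumption; the model card does not report a device count.
    """
    return per_device_batch_size * gradient_accumulation_steps * num_devices


# Values taken from the removed (-) and added (+) hyperparameter lines.
old_run = total_train_batch_size(per_device_batch_size=6, gradient_accumulation_steps=16)
new_run = total_train_batch_size(per_device_batch_size=4, gradient_accumulation_steps=64)

print(old_run)  # 96
print(new_run)  # 256
```

Under that assumption, the new run takes each optimizer step over roughly 2.7 times as many examples as the old one, with a smaller per-device batch.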
@@ -69,11 +69,8 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 3.2894        | 0.1   | 299  | 2.9464          | 39.4079 | 18.3064 | 28.093  | 36.5182   | 64.6904 |
-| 3.0427        | 0.2   | 598  | 2.9307          | 39.4265 | 18.2924 | 28.247  | 36.6382   | 60.5696 |
-| 3.1017        | 0.3   | 897  | 2.9891          | 39.0977 | 17.9198 | 27.9078 | 36.2363   | 58.5172 |
-| 3.2891        | 0.4   | 1196 | 3.5756          | 29.5555 | 11.7552 | 22.4675 | 27.2432   | 45.0232 |
-| 637.0317      | 0.5   | 1495 | nan             | 0.0     | 0.0     | 0.0     | 0.0       | 1.0     |
 ### Framework versions

     metrics:
     - name: Rouge1
       type: rouge
+      value: 44.1026
 ---
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
 # pegasus-newsroom-cnn_full-adafactor-bs6
+This model is a fine-tuned version of [oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6](https://huggingface.co/oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6) on the cnn_dailymail dataset.
 It achieves the following results on the evaluation set:
+- Loss: 2.8671
+- Rouge1: 44.1026
+- Rouge2: 21.4261
+- Rougel: 31.2033
+- Rougelsum: 41.0324
+- Gen Len: 72.0839
 ## Model description
 ### Training hyperparameters
 The following hyperparameters were used during training:
+- learning_rate: 6.4e-05
+- train_batch_size: 4
+- eval_batch_size: 4
 - seed: 42
+- gradient_accumulation_steps: 64
+- total_train_batch_size: 256
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
+- lr_scheduler_warmup_steps: 500
 - num_epochs: 1
 - mixed_precision_training: Native AMP
 - label_smoothing_factor: 0.1
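For readers unfamiliar with `lr_scheduler_type: linear`, the following is a minimal sketch of a linear warmup followed by a linear decay to zero, using the new run's `learning_rate` and `lr_scheduler_warmup_steps`. The total step count is an assumption: 1120 is simply the last step logged in the results table, and the card does not state the scheduler's actual horizon. The function is a plain-Python illustration, not the Trainer's own implementation.

```python
def linear_schedule_lr(step, base_lr, warmup_steps, total_steps):
    """Learning rate at `step`: linear warmup from 0, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 6.4e-05      # learning_rate from the added (+) lines
WARMUP_STEPS = 500     # lr_scheduler_warmup_steps from the added (+) lines
TOTAL_STEPS = 1120     # assumption: last logged step, not stated as the total

for step in (0, 250, 500, 560, 1120):
    print(step, linear_schedule_lr(step, BASE_LR, WARMUP_STEPS, TOTAL_STEPS))
```

Under these assumptions the rate peaks at step 500 and is already decaying at the first evaluation (step 560).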
 | Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
+| 2.9343        | 0.5   | 560  | 2.8733          | 44.1226 | 21.4087 | 31.2431 | 41.0683   | 69.367  |
+| 2.9855        | 1.0   | 1120 | 2.8671          | 44.1026 | 21.4261 | 31.2033 | 41.0324   | 72.0839 |
 ### Framework versions