oMateos2020 committed
Commit e12758d
1 parent: cdddec8

update model card README.md

Files changed (1)
  1. README.md +16 -19
README.md CHANGED
@@ -18,7 +18,7 @@ model-index:
  metrics:
  - name: Rouge1
  type: rouge
- value: 0.0
+ value: 44.1026
  ---
 
  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -26,14 +26,14 @@ should probably proofread and complete it, then remove this comment. -->
 
  # pegasus-newsroom-cnn_full-adafactor-bs6
 
- This model was trained from scratch on the cnn_dailymail dataset.
+ This model is a fine-tuned version of [oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6](https://huggingface.co/oMateos2020/pegasus-newsroom-cnn_full-adafactor-bs6) on the cnn_dailymail dataset.
  It achieves the following results on the evaluation set:
- - Loss: nan
- - Rouge1: 0.0
- - Rouge2: 0.0
- - Rougel: 0.0
- - Rougelsum: 0.0
- - Gen Len: 1.0
+ - Loss: 2.8671
+ - Rouge1: 44.1026
+ - Rouge2: 21.4261
+ - Rougel: 31.2033
+ - Rougelsum: 41.0324
+ - Gen Len: 72.0839
 
  ## Model description
 
@@ -52,15 +52,15 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 0.00016
- - train_batch_size: 6
- - eval_batch_size: 6
+ - learning_rate: 6.4e-05
+ - train_batch_size: 4
+ - eval_batch_size: 4
  - seed: 42
- - gradient_accumulation_steps: 16
- - total_train_batch_size: 96
+ - gradient_accumulation_steps: 64
+ - total_train_batch_size: 256
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - lr_scheduler_warmup_steps: 598
+ - lr_scheduler_warmup_steps: 500
  - num_epochs: 1
  - mixed_precision_training: Native AMP
  - label_smoothing_factor: 0.1
@@ -69,11 +69,8 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
  |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
- | 3.2894 | 0.1 | 299 | 2.9464 | 39.4079 | 18.3064 | 28.093 | 36.5182 | 64.6904 |
- | 3.0427 | 0.2 | 598 | 2.9307 | 39.4265 | 18.2924 | 28.247 | 36.6382 | 60.5696 |
- | 3.1017 | 0.3 | 897 | 2.9891 | 39.0977 | 17.9198 | 27.9078 | 36.2363 | 58.5172 |
- | 3.2891 | 0.4 | 1196 | 3.5756 | 29.5555 | 11.7552 | 22.4675 | 27.2432 | 45.0232 |
- | 637.0317 | 0.5 | 1495 | nan | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 |
+ | 2.9343 | 0.5 | 560 | 2.8733 | 44.1226 | 21.4087 | 31.2431 | 41.0683 | 69.367 |
+ | 2.9855 | 1.0 | 1120 | 2.8671 | 44.1026 | 21.4261 | 31.2033 | 41.0324 | 72.0839 |
 
 
  ### Framework versions
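
Note on the hyperparameter hunk: the `total_train_batch_size` values in the card appear to be the product of `train_batch_size` and `gradient_accumulation_steps`. The sketch below checks that arithmetic for the old and new settings. The `BatchConfig` class, the `effective_batch_size` helper, and the single-device default are illustrative assumptions; the card does not state how many devices were used.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchConfig:
    """Batch settings copied from the model card (hypothetical container)."""
    train_batch_size: int
    gradient_accumulation_steps: int
    num_devices: int = 1  # assumption: device count is not given in the card


def effective_batch_size(cfg: BatchConfig) -> int:
    """Examples consumed per optimizer step under the stated assumptions."""
    return cfg.train_batch_size * cfg.gradient_accumulation_steps * cfg.num_devices


# Values taken from the removed (-) and added (+) lines of the diff.
old = BatchConfig(train_batch_size=6, gradient_accumulation_steps=16)
new = BatchConfig(train_batch_size=4, gradient_accumulation_steps=64)

print(effective_batch_size(old))  # 96, matches the old total_train_batch_size
print(effective_batch_size(new))  # 256, matches the new total_train_batch_size
```

Under the single-device assumption both products agree with the totals reported in the card, so the commit lowers the per-step batch size while raising the number of examples per optimizer step.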