Ahmed235 committed on
Commit
868b950
1 Parent(s): 3c285de

End of training

Files changed (1)
  1. README.md +16 -16
README.md CHANGED
@@ -3,8 +3,6 @@ license: apache-2.0
  base_model: google-t5/t5-small
  tags:
  - generated_from_trainer
- metrics:
- - rouge
  model-index:
  - name: summarize
  results: []
@@ -17,12 +15,9 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 2.7414
- - Rouge1: 0.1691
- - Rouge2: 0.0572
- - Rougel: 0.1342
- - Rougelsum: 0.1342
- - Gen Len: 19.0

  ## Model description

@@ -47,18 +42,23 @@ The following hyperparameters were used during training:
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
- - num_epochs: 5
  - mixed_precision_training: Native AMP

  ### Training results

- | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
- |:-------------:|:-----:|:----:|:---------------:|:------:|:------:|:------:|:---------:|:-------:|
- | 3.1704 | 1.0 | 500 | 2.8278 | 0.1619 | 0.0527 | 0.1283 | 0.1283 | 19.0 |
- | 2.9742 | 2.0 | 1000 | 2.7769 | 0.1666 | 0.0553 | 0.1322 | 0.1322 | 19.0 |
- | 2.9285 | 3.0 | 1500 | 2.7561 | 0.1674 | 0.0562 | 0.1326 | 0.1326 | 19.0 |
- | 2.903 | 4.0 | 2000 | 2.7452 | 0.1679 | 0.0562 | 0.1329 | 0.1329 | 19.0 |
- | 2.8917 | 5.0 | 2500 | 2.7414 | 0.1691 | 0.0572 | 0.1342 | 0.1342 | 19.0 |


  ### Framework versions
 
@@ -3,8 +3,6 @@ license: apache-2.0
  base_model: google-t5/t5-small
  tags:
  - generated_from_trainer
  model-index:
  - name: summarize
  results: []
@@ -17,12 +15,9 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [google-t5/t5-small](https://huggingface.co/google-t5/t5-small) on the None dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 2.6935
+ - Evaluation: {'evaluation_runtime': 28.518348455429077, 'samples_per_second': 33.3118869588378, 'steps_per_second': 33.3118869588378}
+ - Rounded Rouge: {'rouge1': 0.1705, 'rouge2': 0.0588, 'rougeL': 0.1354, 'rougeLsum': 0.1355}
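
As a sanity check on the new `Evaluation` entry, runtime multiplied by throughput should recover the number of evaluation examples. The sketch below uses only the values reported in that entry; the helper name `eval_examples` is hypothetical, and reading the product as a sample count is an assumption about how these fields were computed.

```python
# Values copied from the final `Evaluation` entry in this commit.
evaluation = {
    "evaluation_runtime": 28.518348455429077,
    "samples_per_second": 33.3118869588378,
    "steps_per_second": 33.3118869588378,
}


def eval_examples(metrics: dict) -> int:
    """Estimate evaluation-set size as runtime (seconds) times samples per second."""
    return round(metrics["evaluation_runtime"] * metrics["samples_per_second"])


n_examples = eval_examples(evaluation)
# Check whether the two throughput fields carry the same number.
same_rate = evaluation["steps_per_second"] == evaluation["samples_per_second"]
print(n_examples, same_rate)
```

The product comes out to about 950 examples. `steps_per_second` matches `samples_per_second` exactly, which would be consistent with one sample per evaluation step or with the same value being logged under both keys; the commit does not say which.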
 
 
 

  ## Model description

@@ -47,18 +42,23 @@ The following hyperparameters were used during training:
  - seed: 42
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
+ - num_epochs: 10
  - mixed_precision_training: Native AMP
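
Changing `num_epochs` also changes the shape of a linear learning-rate schedule, because the decay is stretched over the total number of steps. The following is a minimal sketch of that decay, not the training code from this repository. It assumes zero warmup steps (warmup is not shown in this hunk) and returns a multiplier rather than a learning rate, since the base learning rate is not visible here. `linear_lr_factor` is a hypothetical helper; the 500 steps per epoch come from the Step column of the training results table.

```python
def linear_lr_factor(step: int, total_steps: int, warmup_steps: int = 0) -> float:
    """Multiplier applied to the base learning rate at a given optimizer step.

    Rises linearly during warmup (assumed to be 0 steps here), then decays
    linearly to zero at `total_steps`.
    """
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


STEPS_PER_EPOCH = 500  # from the Step column: 500 optimizer steps per epoch
OLD_TOTAL = STEPS_PER_EPOCH * 5   # previous run: num_epochs = 5
NEW_TOTAL = STEPS_PER_EPOCH * 10  # this commit: num_epochs = 10

# Compare the multiplier at the end of epoch 5 under both schedules.
old_factor_at_epoch_5 = linear_lr_factor(5 * STEPS_PER_EPOCH, OLD_TOTAL)
new_factor_at_epoch_5 = linear_lr_factor(5 * STEPS_PER_EPOCH, NEW_TOTAL)
print(old_factor_at_epoch_5, new_factor_at_epoch_5)
```

Under these assumptions the 10-epoch run is still at half its base learning rate at the end of epoch 5, while the 5-epoch run has already decayed to zero. That may contribute to the two runs reporting different validation losses at epoch 5.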
47
 
48
  ### Training results
49
 
50
+ | Training Loss | Epoch | Step | Validation Loss | Evaluation | Rounded Rouge |
51
+ |:-------------:|:-----:|:----:|:---------------:|:----------------------------------------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------:|
52
+ | 3.1701 | 1.0 | 500 | 2.8229 | {'evaluation_runtime': 30.270989179611206, 'samples_per_second': 31.383183230756966, 'steps_per_second': 31.383183230756966} | {'rouge1': 0.1615, 'rouge2': 0.0525, 'rougeL': 0.128, 'rougeLsum': 0.1281} |
53
+ | 2.9661 | 2.0 | 1000 | 2.7672 | {'evaluation_runtime': 28.879830598831177, 'samples_per_second': 32.894929793613414, 'steps_per_second': 32.894929793613414} | {'rouge1': 0.1676, 'rouge2': 0.0567, 'rougeL': 0.1326, 'rougeLsum': 0.1327} |
54
+ | 2.9128 | 3.0 | 1500 | 2.7414 | {'evaluation_runtime': 28.787310361862183, 'samples_per_second': 33.00065160858421, 'steps_per_second': 33.00065160858421} | {'rouge1': 0.1693, 'rouge2': 0.0575, 'rougeL': 0.1342, 'rougeLsum': 0.1343} |
55
+ | 2.8783 | 4.0 | 2000 | 2.7240 | {'evaluation_runtime': 28.755173683166504, 'samples_per_second': 33.03753301814126, 'steps_per_second': 33.03753301814126} | {'rouge1': 0.1694, 'rouge2': 0.0581, 'rougeL': 0.1343, 'rougeLsum': 0.1344} |
56
+ | 2.8548 | 5.0 | 2500 | 2.7137 | {'evaluation_runtime': 30.050004959106445, 'samples_per_second': 31.613971488284534, 'steps_per_second': 31.613971488284534} | {'rouge1': 0.171, 'rouge2': 0.0591, 'rougeL': 0.1354, 'rougeLsum': 0.1354} |
57
+ | 2.8353 | 6.0 | 3000 | 2.7047 | {'evaluation_runtime': 29.376569986343384, 'samples_per_second': 32.33869714679546, 'steps_per_second': 32.33869714679546} | {'rouge1': 0.1703, 'rouge2': 0.0587, 'rougeL': 0.135, 'rougeLsum': 0.135} |
58
+ | 2.8229 | 7.0 | 3500 | 2.6996 | {'evaluation_runtime': 27.381307363510132, 'samples_per_second': 34.69520236517353, 'steps_per_second': 34.69520236517353} | {'rouge1': 0.1714, 'rouge2': 0.0592, 'rougeL': 0.1357, 'rougeLsum': 0.1357} |
59
+ | 2.8154 | 8.0 | 4000 | 2.6958 | {'evaluation_runtime': 27.409220457077026, 'samples_per_second': 34.65986934899169, 'steps_per_second': 34.65986934899169} | {'rouge1': 0.17, 'rouge2': 0.0587, 'rougeL': 0.1351, 'rougeLsum': 0.1352} |
60
+ | 2.8068 | 9.0 | 4500 | 2.6943 | {'evaluation_runtime': 27.376741409301758, 'samples_per_second': 34.7009889086807, 'steps_per_second': 34.7009889086807} | {'rouge1': 0.1702, 'rouge2': 0.0588, 'rougeL': 0.1352, 'rougeLsum': 0.1353} |
61
+ | 2.8 | 10.0 | 5000 | 2.6935 | {'evaluation_runtime': 28.518348455429077, 'samples_per_second': 33.3118869588378, 'steps_per_second': 33.3118869588378} | {'rouge1': 0.1705, 'rouge2': 0.0588, 'rougeL': 0.1354, 'rougeLsum': 0.1355} |
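
The `Evaluation` and `Rounded Rouge` columns each hold a whole dictionary, which makes the table hard to scan compared with the per-metric columns it replaces. One possible remedy is to flatten nested metric dictionaries before they are reported, so each value gets its own key. The sketch below is illustrative only: `flatten_metrics` is a hypothetical helper, not part of any library, the key names `loss` and `rounded_rouge` are made up for the example, and this commit does not show how the training script builds its metrics.

```python
def flatten_metrics(metrics: dict, sep: str = "_") -> dict:
    """Return a copy of `metrics` with nested dicts expanded into prefixed scalar keys."""
    flat = {}
    for key, value in metrics.items():
        if isinstance(value, dict):
            # Recurse so that arbitrarily nested dicts end up as scalar entries.
            for sub_key, sub_value in flatten_metrics(value, sep).items():
                flat[f"{key}{sep}{sub_key}"] = sub_value
        else:
            flat[key] = value
    return flat


# Final-epoch values copied from the last row of the table.
final_epoch = {
    "loss": 2.6935,
    "rounded_rouge": {
        "rouge1": 0.1705,
        "rouge2": 0.0588,
        "rougeL": 0.1354,
        "rougeLsum": 0.1355,
    },
}

flat = flatten_metrics(final_epoch)
for name, value in flat.items():
    print(f"{name}: {value}")
```

With scalar values only, each metric can be rendered as its own table column, as in the removed version of the table.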


  ### Framework versions