andreaparker committed
Commit 427c014
1 Parent(s): 8a1acf1
Files changed (3)
  1. README.md +14 -33
  2. generation_config.json +1 -1
  3. pytorch_model.bin +1 -1
README.md CHANGED
@@ -1,4 +1,6 @@
 ---
+language:
+- en
 license: apache-2.0
 tags:
 - generated_from_trainer
@@ -21,7 +23,7 @@ model-index:
     metrics:
     - name: Rouge1
       type: rouge
-      value: 47.4798
+      value: 47.4485
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,10 +34,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the samsum dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.3772
-- Rouge1: 47.4798
-- Rouge2: 23.9756
-- Rougel: 40.0392
-- Rougelsum: 43.6545
+- Rouge1: 47.4485
+- Rouge2: 23.938
+- Rougel: 40.0491
+- Rougelsum: 43.6954
 - Gen Len: 17.3162
 
 ## Model description
@@ -67,37 +69,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1  | Rouge2  | Rougel  | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 1.4403        | 1.0   | 1842 | 1.3829          | 46.5346 | 23.1326 | 39.4401 | 42.8272   | 17.0977 |
-| 1.3534        | 2.0   | 3684 | 1.3732          | 47.0911 | 23.5074 | 39.5951 | 43.2279   | 17.4554 |
-| 1.2795        | 3.0   | 5526 | 1.3709          | 46.8895 | 23.3243 | 39.5909 | 43.1286   | 17.2027 |
-| 1.2313        | 4.0   | 7368 | 1.3736          | 47.4946 | 23.7802 | 39.9999 | 43.5903   | 17.2198 |
-| 1.1934        | 5.0   | 9210 | 1.3772          | 47.4798 | 23.9756 | 40.0392 | 43.6545   | 17.3162 |
+| 1.4264        | 1.0   | 1842 | 1.3829          | 46.5615 | 23.1026 | 39.4012 | 42.9128   | 17.0977 |
+| 1.3527        | 2.0   | 3684 | 1.3732          | 47.1096 | 23.4582 | 39.5488 | 43.2577   | 17.4554 |
+| 1.2554        | 3.0   | 5526 | 1.3709          | 46.9079 | 23.29   | 39.5731 | 43.1779   | 17.2027 |
+| 1.2503        | 4.0   | 7368 | 1.3736          | 47.4506 | 23.7238 | 39.9803 | 43.5976   | 17.2198 |
+| 1.1675        | 5.0   | 9210 | 1.3772          | 47.4485 | 23.938  | 40.0491 | 43.6954   | 17.3162 |
 
 
 ### Framework versions
 
-- Transformers 4.26.0
+- Transformers 4.26.1
 - Pytorch 1.13.1+cu116
-- Datasets 2.9.0
+- Datasets 2.10.1
 - Tokenizers 0.13.2
-
-### Papers With Code Results
-
-As of 2 February 2023 the Papers with Code page for this task has the following leaderboard (see imgur link).
-
-Our score (Rouge 1 score of 47.4798) puts this model's performance between fourth and fifth place on the leaderboard: https://i.imgur.com/Nea77uL.jpg
-
-## Model Recycling
-
-[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=9.04&mnli_lp=nan&20_newsgroup=3.55&ag_news=1.66&amazon_reviews_multi=0.19&anli=14.53&boolq=16.60&cb=24.91&cola=10.35&copa=25.50&dbpedia=5.73&esnli=5.31&financial_phrasebank=19.96&imdb=0.05&isear=0.59&mnli=11.74&mrpc=15.89&multirc=5.99&poem_sentiment=23.27&qnli=3.93&qqp=5.54&rotten_tomatoes=3.54&rte=23.90&sst2=-0.14&sst_5bins=5.12&stsb=20.58&trec_coarse=4.15&trec_fine=10.93&tweet_ev_emoji=12.87&tweet_ev_emotion=6.02&tweet_ev_hate=-0.04&tweet_ev_irony=7.12&tweet_ev_offensive=2.16&tweet_ev_sentiment=-0.00&wic=12.03&wnli=9.44&wsc=9.37&yahoo_answers=3.04&model_name=andreaparker%2Fflan-t5-base-samsum&base_name=google%2Ft5-v1_1-base) using andreaparker/flan-t5-base-samsum as a base model yields average score of 77.86 in comparison to 68.82 by google/t5-v1_1-base.
-
-The model is ranked 2nd among all tested models for the google/t5-v1_1-base architecture as of 07/February/2023
-Results:
-
-| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
-|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|-------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|-------:|--------:|----------------:|
-| 86.4312 | 89.8333 | 67.1 | 52.5937 | 82.1713 | 80.3571 | 80.5369 | 66 | 76.5 | 90.8897 | 86.7 | 93.044 | 71.6428 | 87.2457 | 88.7255 | 62.1287 | 91.3462 | 93.3004 | 89.1393 | 89.5872 | 84.4765 | 93.578 | 56.9683 | 89.3674 | 97.4 | 93 | 46.334 | 81.6327 | 51.4815 | 74.7449 | 84.7674 | 69.8795 | 67.8683 | 56.338 | 57.6923 | 72.3 |
-
-
-For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
-
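For context on what the updated checkpoint does, here is a minimal usage sketch with the transformers summarization pipeline. It is not part of this commit: the repo id andreaparker/flan-t5-base-samsum is inferred from this repository, and the dialogue is an invented SAMSum-style example.

```python
# Minimal sketch; assumes the checkpoint is published as
# "andreaparker/flan-t5-base-samsum". The dialogue below is invented.
from transformers import pipeline

summarizer = pipeline("summarization", model="andreaparker/flan-t5-base-samsum")

dialogue = (
    "Anna: Are we still on for lunch tomorrow?\n"
    "Ben: Yes, 12:30 at the usual place.\n"
    "Anna: Great, see you there!"
)

# The evaluation Gen Len above averages ~17 tokens, so a modest max_length is plenty.
print(summarizer(dialogue, max_length=60, min_length=5)[0]["summary_text"])
```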
generation_config.json CHANGED
@@ -3,5 +3,5 @@
   "decoder_start_token_id": 0,
   "eos_token_id": 1,
   "pad_token_id": 0,
-  "transformers_version": "4.26.0"
+  "transformers_version": "4.26.1"
 }
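This change only updates the recorded transformers_version; the decoder/eos/pad token ids are untouched. As a rough, non-authoritative sketch of how the file is consumed (repo id assumed as above), transformers reads it back through GenerationConfig:

```python
# Sketch only: loads the generation defaults shipped in generation_config.json.
# The transformers_version field records which library version wrote the file;
# it does not change decoding behaviour.
from transformers import GenerationConfig

gen_config = GenerationConfig.from_pretrained("andreaparker/flan-t5-base-samsum")
print(gen_config.decoder_start_token_id, gen_config.eos_token_id, gen_config.pad_token_id)
# Expected from the file above: 0 1 0
```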
pytorch_model.bin CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:6967554cd20ce84ece990620be96d288d88ac16d7952849c2e8c873f5a279769
+oid sha256:a70e77be9b6270e9ec28accaee7fb761fe6627fbc979d38c205591fd4bb33bfa
 size 990408885
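pytorch_model.bin is stored through Git LFS, so the commit only swaps the pointer's sha256 oid while the size stays at 990408885 bytes. If you want to check that a locally downloaded copy matches the new pointer, a small sketch (the local path is an assumption) could look like this:

```python
import hashlib

# New oid from the LFS pointer above.
EXPECTED_SHA256 = "a70e77be9b6270e9ec28accaee7fb761fe6627fbc979d38c205591fd4bb33bfa"

def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash the file in 1 MiB chunks so the ~990 MB checkpoint never sits in memory whole."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# "pytorch_model.bin" is assumed to be the downloaded checkpoint in the current directory.
assert sha256_of("pytorch_model.bin") == EXPECTED_SHA256, "checkpoint does not match the LFS pointer"
```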