Commit · 427c014
1 Parent(s): 8a1acf1
en
- README.md +14 -33
- generation_config.json +1 -1
- pytorch_model.bin +1 -1
README.md
CHANGED
@@ -1,4 +1,6 @@
 ---
+language:
+- en
 license: apache-2.0
 tags:
 - generated_from_trainer
@@ -21,7 +23,7 @@ model-index:
 metrics:
 - name: Rouge1
 type: rouge
-value: 47.
+value: 47.4485
 ---
 
 <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -32,10 +34,10 @@ should probably proofread and complete it, then remove this comment. -->
 This model is a fine-tuned version of [google/flan-t5-base](https://huggingface.co/google/flan-t5-base) on the samsum dataset.
 It achieves the following results on the evaluation set:
 - Loss: 1.3772
-- Rouge1: 47.
-- Rouge2: 23.
-- Rougel: 40.
-- Rougelsum: 43.
+- Rouge1: 47.4485
+- Rouge2: 23.938
+- Rougel: 40.0491
+- Rougelsum: 43.6954
 - Gen Len: 17.3162
 
 ## Model description
@@ -67,37 +69,16 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
 |:-------------:|:-----:|:----:|:---------------:|:-------:|:-------:|:-------:|:---------:|:-------:|
-| 1.
-| 1.
-| 1.
-| 1.
-| 1.
+| 1.4264 | 1.0 | 1842 | 1.3829 | 46.5615 | 23.1026 | 39.4012 | 42.9128 | 17.0977 |
+| 1.3527 | 2.0 | 3684 | 1.3732 | 47.1096 | 23.4582 | 39.5488 | 43.2577 | 17.4554 |
+| 1.2554 | 3.0 | 5526 | 1.3709 | 46.9079 | 23.29 | 39.5731 | 43.1779 | 17.2027 |
+| 1.2503 | 4.0 | 7368 | 1.3736 | 47.4506 | 23.7238 | 39.9803 | 43.5976 | 17.2198 |
+| 1.1675 | 5.0 | 9210 | 1.3772 | 47.4485 | 23.938 | 40.0491 | 43.6954 | 17.3162 |
 
 
 ### Framework versions
 
-- Transformers 4.26.
+- Transformers 4.26.1
 - Pytorch 1.13.1+cu116
-- Datasets 2.
+- Datasets 2.10.1
 - Tokenizers 0.13.2
-
-### Papers With Code Results
-
-As of 2 February 2023 the Papers with Code page for this task has the following leaderboard (see imgur link).
-
-Our score (Rouge 1 score of 47.4798) puts this model's performance between fourth and fifth place on the leaderboard: https://i.imgur.com/Nea77uL.jpg
-
-## Model Recycling
-
-[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=9.04&mnli_lp=nan&20_newsgroup=3.55&ag_news=1.66&amazon_reviews_multi=0.19&anli=14.53&boolq=16.60&cb=24.91&cola=10.35&copa=25.50&dbpedia=5.73&esnli=5.31&financial_phrasebank=19.96&imdb=0.05&isear=0.59&mnli=11.74&mrpc=15.89&multirc=5.99&poem_sentiment=23.27&qnli=3.93&qqp=5.54&rotten_tomatoes=3.54&rte=23.90&sst2=-0.14&sst_5bins=5.12&stsb=20.58&trec_coarse=4.15&trec_fine=10.93&tweet_ev_emoji=12.87&tweet_ev_emotion=6.02&tweet_ev_hate=-0.04&tweet_ev_irony=7.12&tweet_ev_offensive=2.16&tweet_ev_sentiment=-0.00&wic=12.03&wnli=9.44&wsc=9.37&yahoo_answers=3.04&model_name=andreaparker%2Fflan-t5-base-samsum&base_name=google%2Ft5-v1_1-base) using andreaparker/flan-t5-base-samsum as a base model yields average score of 77.86 in comparison to 68.82 by google/t5-v1_1-base.
-
-The model is ranked 2nd among all tested models for the google/t5-v1_1-base architecture as of 07/February/2023
-Results:
-
-| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
-|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|-------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|-------:|--------:|----------------:|
-| 86.4312 | 89.8333 | 67.1 | 52.5937 | 82.1713 | 80.3571 | 80.5369 | 66 | 76.5 | 90.8897 | 86.7 | 93.044 | 71.6428 | 87.2457 | 88.7255 | 62.1287 | 91.3462 | 93.3004 | 89.1393 | 89.5872 | 84.4765 | 93.578 | 56.9683 | 89.3674 | 97.4 | 93 | 46.334 | 81.6327 | 51.4815 | 74.7449 | 84.7674 | 69.8795 | 67.8683 | 56.338 | 57.6923 | 72.3 |
-
-
-For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
-
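As a reading aid for the training-results rows added to README.md in this commit, the sketch below parses those five rows and reports which epoch scores best on a given column. The helper names `parse_table` and `best_epoch` are illustrative assumptions, not part of the model card or of the Trainer.

```python
# Rows copied from the lines this commit adds to the README.md results table.
TABLE = """\
| Training Loss | Epoch | Step | Validation Loss | Rouge1 | Rouge2 | Rougel | Rougelsum | Gen Len |
| 1.4264 | 1.0 | 1842 | 1.3829 | 46.5615 | 23.1026 | 39.4012 | 42.9128 | 17.0977 |
| 1.3527 | 2.0 | 3684 | 1.3732 | 47.1096 | 23.4582 | 39.5488 | 43.2577 | 17.4554 |
| 1.2554 | 3.0 | 5526 | 1.3709 | 46.9079 | 23.29 | 39.5731 | 43.1779 | 17.2027 |
| 1.2503 | 4.0 | 7368 | 1.3736 | 47.4506 | 23.7238 | 39.9803 | 43.5976 | 17.2198 |
| 1.1675 | 5.0 | 9210 | 1.3772 | 47.4485 | 23.938 | 40.0491 | 43.6954 | 17.3162 |
"""


def parse_table(text):
    """Turn the pipe-delimited table into a list of {column: float} dicts."""
    lines = [ln.strip() for ln in text.strip().splitlines()]
    header = [cell.strip() for cell in lines[0].strip("|").split("|")]
    rows = []
    for ln in lines[1:]:
        cells = [float(cell) for cell in ln.strip("|").split("|")]
        rows.append(dict(zip(header, cells)))
    return rows


def best_epoch(rows, column, lower_is_better=False):
    """Return the epoch whose value in `column` is best (hypothetical helper)."""
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: row[column])["Epoch"]


rows = parse_table(TABLE)
print("best Rouge1 epoch:", best_epoch(rows, "Rouge1"))
print("lowest validation loss epoch:",
      best_epoch(rows, "Validation Loss", lower_is_better=True))
```

The headline metrics in the card (Loss 1.3772, Rouge1 47.4485) match the epoch 5 row, which suggests the final checkpoint was reported rather than the best-scoring one. That reading is an inference from the table, not something the commit states.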
generation_config.json
CHANGED
@@ -3,5 +3,5 @@
 "decoder_start_token_id": 0,
 "eos_token_id": 1,
 "pad_token_id": 0,
-"transformers_version": "4.26.
+"transformers_version": "4.26.1"
 }
pytorch_model.bin
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:a70e77be9b6270e9ec28accaee7fb761fe6627fbc979d38c205591fd4bb33bfa
 size 990408885
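The `pytorch_model.bin` entry is a Git LFS pointer, so the diff shows only the new object's SHA-256 digest and byte size, not the weights themselves. A minimal sketch of checking a local file against such a pointer follows. `parse_pointer` and `matches_pointer` are hypothetical helper names, and the demo hashes a small temporary file instead of the real 990408885-byte checkpoint.

```python
import hashlib
import os
import tempfile

# New-side pointer text from this commit.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:a70e77be9b6270e9ec28accaee7fb761fe6627fbc979d38c205591fd4bb33bfa
size 990408885
"""


def parse_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer (hypothetical helper)."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the pointer's size and digest."""
    if os.path.getsize(path) != pointer["size"]:
        return False  # cheap size check before hashing
    hasher = hashlib.new(pointer["algo"])
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


pointer = parse_pointer(POINTER)

# Demo on a stand-in file: build a pointer for known bytes, then verify it.
payload = b"stand-in for model weights"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_pointer = {
    "version": pointer["version"],
    "algo": "sha256",
    "digest": hashlib.sha256(payload).hexdigest(),
    "size": len(payload),
}
demo_ok = matches_pointer(tmp.name, demo_pointer)
demo_bad = matches_pointer(tmp.name, pointer)  # real pointer, wrong file
os.unlink(tmp.name)
print(demo_ok, demo_bad)
```

To check a real download, pass the path of the fetched `pytorch_model.bin` and `pointer` to `matches_pointer`. The file size is unchanged between the two sides of the diff, so the digest is the only field that distinguishes the old and new weights.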