Evaluation results for andreaparker/flan-t5-base-samsum model as a base model for other tasks

by eladven - opened Feb 7, 2023

base: refs/heads/main ← from: refs/pr/1


README.md +14 -0

	@@ -91,3 +91,17 @@ Our score (Rouge 1 score of 47.4798) puts this model's performance between fourt
 ![PwC leaderboard](https://i.imgur.com/Nea77uL.jpg)
+## Model Recycling
+[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=9.04&mnli_lp=nan&20_newsgroup=3.55&ag_news=1.66&amazon_reviews_multi=0.19&anli=14.53&boolq=16.60&cb=24.91&cola=10.35&copa=25.50&dbpedia=5.73&esnli=5.31&financial_phrasebank=19.96&imdb=0.05&isear=0.59&mnli=11.74&mrpc=15.89&multirc=5.99&poem_sentiment=23.27&qnli=3.93&qqp=5.54&rotten_tomatoes=3.54&rte=23.90&sst2=-0.14&sst_5bins=5.12&stsb=20.58&trec_coarse=4.15&trec_fine=10.93&tweet_ev_emoji=12.87&tweet_ev_emotion=6.02&tweet_ev_hate=-0.04&tweet_ev_irony=7.12&tweet_ev_offensive=2.16&tweet_ev_sentiment=-0.00&wic=12.03&wnli=9.44&wsc=9.37&yahoo_answers=3.04&model_name=andreaparker%2Fflan-t5-base-samsum&base_name=google%2Ft5-v1_1-base) using andreaparker/flan-t5-base-samsum as a base model yields an average score of 77.86, compared to 68.82 for google/t5-v1_1-base.
+The model is ranked 2nd among all tested models for the google/t5-v1_1-base architecture as of Feb 7, 2023.
+Results:
+|   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |   sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |   wnli |     wsc |   yahoo_answers |
+|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|-------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|-------:|--------:|----------------:|
+|        86.4312 |   89.8333 |                   67.1 | 52.5937 | 82.1713 | 80.3571 | 80.5369 |     66 |      76.5 | 90.8897 |                   86.7 | 93.044 | 71.6428 | 87.2457 | 88.7255 |   62.1287 |          91.3462 | 93.3004 | 89.1393 |           89.5872 | 84.4765 | 93.578 |     56.9683 | 89.3674 |          97.4 |          93 |           46.334 |            81.6327 |         51.4815 |          74.7449 |              84.7674 |              69.8795 | 67.8683 | 56.338 | 57.6923 |            72.3 |
+For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
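
The reported average and the gain over the baseline can be cross-checked from the Results table. The snippet below is a minimal sketch, not part of the PR: the `scores` dictionary is transcribed by hand from the table, `BASELINE_AVG` is the 68.82 figure quoted for google/t5-v1_1-base, and it assumes the reported average is a plain unweighted mean over the 36 datasets.

```python
from statistics import mean

# Per-dataset scores transcribed from the Results table (36 datasets).
scores = {
    "20_newsgroup": 86.4312, "ag_news": 89.8333, "amazon_reviews_multi": 67.1,
    "anli": 52.5937, "boolq": 82.1713, "cb": 80.3571, "cola": 80.5369,
    "copa": 66, "dbpedia": 76.5, "esnli": 90.8897,
    "financial_phrasebank": 86.7, "imdb": 93.044, "isear": 71.6428,
    "mnli": 87.2457, "mrpc": 88.7255, "multirc": 62.1287,
    "poem_sentiment": 91.3462, "qnli": 93.3004, "qqp": 89.1393,
    "rotten_tomatoes": 89.5872, "rte": 84.4765, "sst2": 93.578,
    "sst_5bins": 56.9683, "stsb": 89.3674, "trec_coarse": 97.4,
    "trec_fine": 93, "tweet_ev_emoji": 46.334, "tweet_ev_emotion": 81.6327,
    "tweet_ev_hate": 51.4815, "tweet_ev_irony": 74.7449,
    "tweet_ev_offensive": 84.7674, "tweet_ev_sentiment": 69.8795,
    "wic": 67.8683, "wnli": 56.338, "wsc": 57.6923, "yahoo_answers": 72.3,
}

# Average reported in the PR text for the google/t5-v1_1-base baseline.
BASELINE_AVG = 68.82

# Assumption: the reported average is an unweighted mean across datasets.
avg = mean(scores.values())
gain = avg - BASELINE_AVG

print(f"datasets: {len(scores)}")   # datasets: 36
print(f"average:  {avg:.2f}")       # average:  77.86
print(f"gain:     {gain:.2f}")      # gain:     9.04
```

The computed gain matches the `avg=9.04` parameter in the chart URL, which suggests the per-dataset values in that URL are differences against the baseline rather than absolute scores.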