Evaluation results for ibm/ColD-Fusion-bert-base-uncased-itr24-seed0 model as a base model for other tasks
#1 opened by eladven

README.md CHANGED
@@ -51,6 +51,20 @@ output = model(encoded_input)
 ```
 
 ## Evaluation results
+
+## Model Recycling
+
+[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=3.35&mnli_lp=nan&20_newsgroup=2.02&ag_news=-0.49&amazon_reviews_multi=0.06&anli=1.55&boolq=5.48&cb=12.41&cola=-0.33&copa=12.55&dbpedia=0.41&esnli=0.74&financial_phrasebank=13.07&imdb=0.44&isear=0.62&mnli=0.11&mrpc=4.53&multirc=0.20&poem_sentiment=17.93&qnli=0.15&qqp=0.27&rotten_tomatoes=4.92&rte=18.36&sst2=1.49&sst_5bins=4.40&stsb=3.26&trec_coarse=0.54&trec_fine=13.07&tweet_ev_emoji=-0.06&tweet_ev_emotion=1.72&tweet_ev_hate=0.82&tweet_ev_irony=-0.03&tweet_ev_offensive=-0.37&tweet_ev_sentiment=-0.03&wic=2.89&wnli=-2.68&wsc=1.35&yahoo_answers=-0.72&model_name=ibm%2FColD-Fusion-bert-base-uncased-itr24-seed0&base_name=bert-base-uncased) using ibm/ColD-Fusion-bert-base-uncased-itr24-seed0 as a base model yields an average score of 75.55, compared to 72.20 by bert-base-uncased.
+
+The model is ranked 2nd among all tested models for the bert-base-uncased architecture as of 09/01/2023.
+Results:
+
+| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
+|---------------:|----------:|-----------------------:|-------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
+| 85.0637 | 89.1 | 65.98 | 48.5 | 74.4343 | 76.7857 | 81.4957 | 62 | 78.5667 | 90.4418 | 81.6 | 92.02 | 69.6871 | 83.8385 | 86.5196 | 60.1691 | 84.6154 | 90.0238 | 90.5466 | 89.7749 | 78.3394 | 93.4633 | 57.1946 | 89.1234 | 96.6 | 81.4 | 35.944 | 81.6327 | 53.67 | 67.7296 | 85 | 69.4481 | 66.1442 | 47.8873 | 63.4615 | 71.6 |
+
+
+For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)
 See full evaluation results of this model and many more [here](https://ibm.github.io/model-recycling/roberta-base_table.html)
 When fine-tuned on downstream tasks, this model achieves the following results:
 
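The added section describes using this checkpoint as a drop-in replacement for bert-base-uncased when fine-tuning on a downstream task. As a minimal sketch (not part of the PR; the binary-classification setup, `num_labels=2`, and the example sentence are illustrative assumptions), loading it with the Hugging Face transformers API looks the same as for the original base model:

```python
# Minimal sketch: load the ColD Fusion checkpoint in place of bert-base-uncased
# and attach a fresh classification head for a downstream task.
# num_labels=2 and the example sentence are illustrative assumptions.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "ibm/ColD-Fusion-bert-base-uncased-itr24-seed0"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Fine-tuning then proceeds as usual (e.g. with transformers.Trainer);
# only the base encoder weights differ from bert-base-uncased.
encoded_input = tokenizer("This movie was great!", return_tensors="pt")
output = model(**encoded_input)
print(output.logits.shape)  # torch.Size([1, 2])
```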