eladven committed on
Commit b44784e · 1 Parent(s): 316363d

Evaluation results for mwong/roberta-base-climate-evidence-related model as a base model for other tasks


As part of a research effort to identify high-quality models on Hugging Face that can serve as base models for further fine-tuning, we evaluated this model by fine-tuning it on 36 datasets. The model ranks 3rd among all tested models for the roberta-base architecture as of 21/12/2022.
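If you would like to try this checkpoint as a base model yourself, here is a minimal sketch of loading it for further fine-tuning with the Hugging Face transformers library. This loading recipe is an illustration, not part of the model-recycling evaluation pipeline, and the `num_labels` value is a placeholder for whatever downstream task you choose.

```python
# Sketch: reuse this checkpoint as a starting point for a new classification task.
# Assumes the `transformers` library; num_labels below is a placeholder.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

base = "mwong/roberta-base-climate-evidence-related"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(
    base,
    num_labels=2,                  # placeholder: label count of your target task
    ignore_mismatched_sizes=True,  # reinitialize the head if the label count differs
)
```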


To share this information with others in your model card, please add the following evaluation results to your README.md page.

For more information, please see https://ibm.github.io/model-recycling/ or contact me.

Best regards,
Elad Venezian
eladv@il.ibm.com
IBM Research AI

Files changed (1)
  1. README.md +14 -1
README.md CHANGED
@@ -15,4 +15,17 @@ metrics: f1
 
 # ClimateRoberta
 
-ClimateRoberta is a classifier model that predicts if climate related evidence is related to query claim. The model achieved F1 score of 80.13% with test dataset "mwong/climate-evidence-related". Using pretrained roberta-base model, the classifier head is trained on Fever dataset and adapted to climate domain using ClimateFever dataset.
+ClimateRoberta is a classifier model that predicts if climate related evidence is related to query claim. The model achieved F1 score of 80.13% with test dataset "mwong/climate-evidence-related". Using pretrained roberta-base model, the classifier head is trained on Fever dataset and adapted to climate domain using ClimateFever dataset.
+
+## Model Recycling
+
+[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=0.98&mnli_lp=nan&20_newsgroup=-0.15&ag_news=0.16&amazon_reviews_multi=-0.04&anli=-0.13&boolq=-6.29&cb=9.93&cola=-0.31&copa=35.90&dbpedia=0.41&esnli=-1.35&financial_phrasebank=-0.51&imdb=0.09&isear=0.67&mnli=0.14&mrpc=2.09&multirc=25.91&poem_sentiment=-0.29&qnli=-0.11&qqp=-0.78&rotten_tomatoes=0.51&rte=-0.20&sst2=0.95&sst_5bins=-1.97&stsb=-16.78&trec_coarse=-0.31&trec_fine=-0.36&tweet_ev_emoji=0.27&tweet_ev_emotion=-0.40&tweet_ev_hate=-1.24&tweet_ev_irony=-0.13&tweet_ev_offensive=0.56&tweet_ev_sentiment=-0.69&wic=-10.55&wnli=0.14&wsc=0.19&yahoo_answers=-0.00&model_name=mwong%2Froberta-base-climate-evidence-related&base_name=roberta-base) using mwong/roberta-base-climate-evidence-related as a base model yields average score of 77.21 in comparison to 76.22 by roberta-base.
+
+The model is ranked 3rd among all tested models for the roberta-base architecture as of 21/12/2022
+Results:
+
+| 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
+|---------------:|----------:|-----------------------:|--------:|--------:|-----:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
+| 85.1301 | 89.9333 | 66.54 | 50.2188 | 72.4 | 77.7 | 83.2215 | 84.6 | 77.7 | 89.6478 | 84.6 | 93.988 | 73.1421 | 87.1237 | 89.9576 | 87.1237 | 83.6538 | 92.2936 | 89.9333 | 88.9306 | 72.2022 | 95.0688 | 54.7059 | 73.1421 | 96.8 | 87.4 | 46.572 | 81.4215 | 51.6498 | 71.4286 | 85.1163 | 70.3354 | 54.9296 | 54.9296 | 63.4615 | 72.4 |
+
+
+For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)