---
license: apache-2.0
language: en
tags:
- deberta-v3-base
- text-classification
- nli
- natural-language-inference
- multitask
- extreme-mtl
pipeline_tag: zero-shot-classification
datasets:
- hellaswag
- ag_news
- pietrolesci/nli_fever
- numer_sense
- go_emotions
- Ericwang/promptProficiency
- poem_sentiment
- pietrolesci/robust_nli_is_sd
- sileod/probability_words_nli
- social_i_qa
- trec
- pietrolesci/gen_debiased_nli
- snips_built_in_intents
- metaeval/imppres
- metaeval/crowdflower
- tals/vitaminc
- dream
- metaeval/babi_nli
- Ericwang/promptSpoke
- metaeval/ethics
- art
- ai2_arc
- discovery
- Ericwang/promptGrammar
- code_x_glue_cc_clone_detection_big_clone_bench
- prajjwal1/discosense
- pietrolesci/joci
- Anthropic/model-written-evals
- utilitarianism
- emo
- tweets_hate_speech_detection
- piqa
- blog_authorship_corpus
- SpeedOfMagic/ontonotes_english
- circa
- app_reviews
- anli
- Ericwang/promptSentiment
- codah
- definite_pronoun_resolution
- health_fact
- tweet_eval
- hate_speech18
- glue
- hendrycks_test
- paws
- bigbench
- hate_speech_offensive
- blimp
- sick
- turingbench/TuringBench
- martn-nguyen/contrast_nli
- Anthropic/hh-rlhf
- openbookqa
- species_800
- alisawuffles/WANLI
- ethos
- pietrolesci/mpe
- wiki_hop
- pietrolesci/glue_diagnostics
- mc_taco
- quarel
- PiC/phrase_similarity
- strombergnlp/rumoureval_2019
- quail
- acronym_identification
- pietrolesci/robust_nli
- quora
- wnut_17
- dynabench/dynasent
- pietrolesci/gpt3_nli
- truthful_qa
- pietrolesci/add_one_rte
- pietrolesci/breaking_nli
- copenlu/scientific-exaggeration-detection
- medical_questions_pairs
- rotten_tomatoes
- scicite
- scitail
- pietrolesci/dialogue_nli
- code_x_glue_cc_defect_detection
- nightingal3/fig-qa
- pietrolesci/conj_nli
- liar
- sciq
- head_qa
- pietrolesci/dnc
- quartz
- wiqa
- code_x_glue_cc_code_refinement
- Ericwang/promptCoherence
- joey234/nan-nli
- hope_edi
- jnlpba
- yelp_review_full
- pietrolesci/recast_white
- swag
- banking77
- cosmos_qa
- financial_phrasebank
- hans
- pietrolesci/fracas
- math_qa
- conll2003
- qasc
- ncbi_disease
- mwong/fever-evidence-related
- YaHi/EffectiveFeedbackStudentWriting
- ade_corpus_v2
- amazon_polarity
- pietrolesci/robust_nli_li_ts
- super_glue
- adv_glue
- Ericwang/promptNLI
- cos_e
- launch/open_question_type
- lex_glue
- has_part
- pragmeval
- sem_eval_2010_task_8
- imdb
- humicroedit
- sms_spam
- dbpedia_14
- commonsense_qa
- hlgd
- snli
- hyperpartisan_news_detection
- google_wellformed_query
- raquiba/Sarcasm_News_Headline
- metaeval/recast
- winogrande
- relbert/lexical_relation_classification
- metaeval/linguisticprobing
metrics:
- accuracy
library_name: transformers
---

# Model Card for DeBERTa-v3-base-tasksource-nli

DeBERTa model jointly fine-tuned on 444 tasks of the tasksource collection: https://github.com/sileod/tasksource/
This is the model with the MNLI classifier on top. Its encoder was trained on many datasets, including bigbench and Anthropic/hh-rlhf, alongside many NLI and classification tasks, with SequenceClassification heads on top of a single shared encoder.

Each task had a specific CLS embedding, which was dropped 10% of the time so that the model can also be used without it. All multiple-choice models used the same classification layers. For classification tasks, models shared weights if their labels matched.
The number of examples per task was capped to 64. The model was trained for 20k steps with a batch size of 384 and a peak learning rate of 2e-5.
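
The two sharing rules above (one head per matching label set, and a task-specific CLS embedding that is dropped 10% of the time) can be illustrated with a minimal, framework-free sketch. This is not the tasknet implementation: the function names, the dictionary-based "heads", and the choice to match labels by their exact ordered list are all assumptions made for illustration.

```python
import random
from typing import Dict, Sequence, Tuple


def get_shared_head(heads: Dict[Tuple[str, ...], dict], labels: Sequence[str]) -> dict:
    """Return the head registered for this label list, creating it on first use.

    Assumption: two tasks "match" when their ordered label lists are identical.
    The dict stored here is only a stand-in for a real classification layer.
    """
    key = tuple(labels)
    if key not in heads:
        heads[key] = {"labels": key, "num_outputs": len(key)}
    return heads[key]


def pick_cls_embedding(task, task_embeddings, default_embedding, rng, drop_prob=0.1):
    """Pick the CLS embedding for one training example.

    With probability `drop_prob` the task-specific embedding is dropped and a
    shared default is used instead, so the encoder also learns to work without
    any task embedding (hypothetical mechanism, see the lead-in).
    """
    if rng.random() < drop_prob:
        return default_embedding
    return task_embeddings[task]


heads = {}
nli_a = get_shared_head(heads, ["entailment", "neutral", "contradiction"])
nli_b = get_shared_head(heads, ["entailment", "neutral", "contradiction"])
sentiment = get_shared_head(heads, ["negative", "positive"])
print(nli_a is nli_b, nli_a is sentiment, len(heads))
```

Keying the registry on the label tuple means that any number of NLI-style tasks with the same labels reuse one head, while a task with a different label set gets its own.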

You can fine-tune this model for multiple-choice or any classification task (e.g. NLI), like any debertav2 model.
This model has strong validation performance on many tasks (e.g. 70% on WNLI).
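
Because this checkpoint carries the MNLI classifier and the card's `pipeline_tag` is `zero-shot-classification`, it can presumably be used without fine-tuning through the `transformers` zero-shot pipeline. The sketch below assumes the Hub id `sileod/deberta-v3-base-tasksource-nli`, inferred from the card title and not stated in this card; the helper names are hypothetical.

```python
# Assumption: Hub repository id inferred from the card title; adjust if it differs.
MODEL_ID = "sileod/deberta-v3-base-tasksource-nli"


def build_zero_shot_classifier(model_id: str = MODEL_ID):
    """Load a zero-shot classification pipeline (downloads weights on first use)."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_id)


def top_label(result: dict) -> tuple:
    """Return the best (label, score) pair from a zero-shot pipeline result."""
    # The pipeline returns "labels" and "scores" sorted from most to least likely.
    return result["labels"][0], result["scores"][0]


def classify(text: str, candidate_labels: list, classifier=None) -> tuple:
    """Classify `text` against `candidate_labels`, loading the model if needed."""
    classifier = classifier or build_zero_shot_classifier()
    return top_label(classifier(text, candidate_labels=candidate_labels))
```

Calling `classify("The match ended 2-1 after extra time.", ["sports", "politics", "finance"])` loads the model on first use and returns the highest-scoring label with its score.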

The list of tasks is available in `tasks.md`.

Code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing

### Software

https://github.com/sileod/tasknet/
Training took 3 days on a 24GB GPU.

## Model Recycling

[Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.41&mnli_lp=nan&20_newsgroup=0.63&ag_news=0.46&amazon_reviews_multi=-0.40&anli=0.94&boolq=2.55&cb=10.71&cola=0.49&copa=10.60&dbpedia=0.10&esnli=-0.25&financial_phrasebank=1.31&imdb=-0.17&isear=0.63&mnli=0.42&mrpc=-0.23&multirc=1.73&poem_sentiment=0.77&qnli=0.12&qqp=-0.05&rotten_tomatoes=0.67&rte=2.13&sst2=0.01&sst_5bins=-0.02&stsb=1.39&trec_coarse=0.24&trec_fine=0.18&tweet_ev_emoji=0.62&tweet_ev_emotion=0.43&tweet_ev_hate=1.84&tweet_ev_irony=1.43&tweet_ev_offensive=0.17&tweet_ev_sentiment=0.08&wic=-1.78&wnli=3.03&wsc=9.95&yahoo_answers=0.17&model_name=sileod%2Fdeberta-v3-base_tasksource-420&base_name=microsoft%2Fdeberta-v3-base) using sileod/deberta-v3-base_tasksource-420 as a base model yields an average score of 80.45, compared with 79.04 for microsoft/deberta-v3-base.

An earlier (weaker) version of this model was ranked 1st among all tested models for the microsoft/deberta-v3-base architecture as of 10/01/2023.
Results:

|   20_newsgroup |   ag_news |   amazon_reviews_multi |    anli |   boolq |      cb |    cola |   copa |   dbpedia |   esnli |   financial_phrasebank |   imdb |   isear |    mnli |    mrpc |   multirc |   poem_sentiment |    qnli |     qqp |   rotten_tomatoes |     rte |    sst2 |   sst_5bins |    stsb |   trec_coarse |   trec_fine |   tweet_ev_emoji |   tweet_ev_emotion |   tweet_ev_hate |   tweet_ev_irony |   tweet_ev_offensive |   tweet_ev_sentiment |     wic |    wnli |     wsc |   yahoo_answers |
|---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
|         87.042 |      90.9 |                  66.46 | 59.7188 | 85.5352 | 85.7143 | 87.0566 |     69 |   79.5333 | 91.6735 |                   85.8 | 94.324 | 72.4902 | 90.2055 | 88.9706 |   63.9851 |             87.5 | 93.6299 | 91.7363 |           91.0882 | 84.4765 | 95.0688 |     56.9683 | 91.6654 |            98 |        91.2 |           46.814 |            84.3772 |         58.0471 |            81.25 |              85.2326 |              71.8821 | 69.4357 | 73.2394 | 74.0385 |            72.2 |
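
As a quick consistency check, the reported average can be recomputed from the per-dataset scores in the table. The short sketch below copies the 36 values from the row above and compares their mean with the 79.04 baseline quoted for microsoft/deberta-v3-base; the variable names are illustrative only.

```python
from statistics import mean

# Per-dataset scores copied from the results table above.
scores = {
    "20_newsgroup": 87.042, "ag_news": 90.9, "amazon_reviews_multi": 66.46,
    "anli": 59.7188, "boolq": 85.5352, "cb": 85.7143, "cola": 87.0566,
    "copa": 69, "dbpedia": 79.5333, "esnli": 91.6735,
    "financial_phrasebank": 85.8, "imdb": 94.324, "isear": 72.4902,
    "mnli": 90.2055, "mrpc": 88.9706, "multirc": 63.9851,
    "poem_sentiment": 87.5, "qnli": 93.6299, "qqp": 91.7363,
    "rotten_tomatoes": 91.0882, "rte": 84.4765, "sst2": 95.0688,
    "sst_5bins": 56.9683, "stsb": 91.6654, "trec_coarse": 98,
    "trec_fine": 91.2, "tweet_ev_emoji": 46.814, "tweet_ev_emotion": 84.3772,
    "tweet_ev_hate": 58.0471, "tweet_ev_irony": 81.25,
    "tweet_ev_offensive": 85.2326, "tweet_ev_sentiment": 71.8821,
    "wic": 69.4357, "wnli": 73.2394, "wsc": 74.0385, "yahoo_answers": 72.2,
}

BASELINE_AVG = 79.04  # average reported above for microsoft/deberta-v3-base

average = mean(scores.values())
gain = average - BASELINE_AVG
weakest = min(scores, key=scores.get)
strongest = max(scores, key=scores.get)

print(f"datasets: {len(scores)}")
print(f"average:  {average:.2f}")
print(f"gain:     {gain:+.2f}")
print(f"weakest:  {weakest} ({scores[weakest]})")
print(f"strongest: {strongest} ({scores[strongest]})")
```

The recomputed mean rounds to the 80.45 quoted above, and the difference from the baseline matches the `avg=1.41` value in the evaluation link.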


For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)

# Citation

**BibTeX:**

```bib
@misc{sileod23-tasksource,
  author = {Sileo, Damien},
  doi = {10.5281/zenodo.7473446},
  month = {01},
  title = {{tasksource: preprocessings for reproducibility and multitask-learning}},
  url = {https://github.com/sileod/tasksource},
  version = {1.5.0},
  year = {2023}}
```


# Model Card Contact

damien.sileo@inria.fr


</details>