sileod committed
Commit 607f59c
1 Parent(s): b0d57dd

Update README.md

Files changed (1):
  1. README.md +44 -174
README.md CHANGED
@@ -1,4 +1,14 @@
  ---
  datasets:
  - hellaswag
  - ag_news
@@ -134,204 +144,64 @@ datasets:
  - winogrande
  - relbert/lexical_relation_classification
  - metaeval/linguisticprobing
  ---

- # Model Card for Model ID

- <!-- Provide a quick summary of what the model is/does. -->

- # Table of Contents

- 1. [Model Details](#model-details)
- 2. [Uses](#uses)
- 3. [Bias, Risks, and Limitations](#bias-risks-and-limitations)
- 4. [Training Details](#training-details)
- 5. [Evaluation](#evaluation)
- 6. [Model Examination](#model-examination-optional)
- 7. [Environmental Impact](#environmental-impact)
- 8. [Technical Specifications](#technical-specifications-optional)
- 9. [Citation](#citation-optional)
- 10. [Glossary](#glossary-optional)
- 11. [More Information](#more-information-optional)
- 12. [Model Card Authors](#model-card-authors-optional)
- 13. [Model Card Contact](#model-card-contact)
- 14. [How To Get Started With the Model](#how-to-get-started-with-the-model)

- # Model Details

- ## Model Description
-
- <!-- Provide a longer summary of what this model is. -->
-
-
-
- - **Developed by:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
- - **Model type:** [More Information Needed]
- - **Language(s) (NLP):** [More Information Needed]
- - **License:** [More Information Needed]
- - **Related Models [optional]:** [More Information Needed]
- - **Parent Model [optional]:** [More Information Needed]
- - **Resources for more information:** [More Information Needed]
-
- # Uses
-
- <!-- Address questions around how the model is intended to be used, including the foreseeable users of the model and those affected by the model. -->
-
- ## Direct Use
-
- <!-- This section is for the model use without fine-tuning or plugging into a larger ecosystem/app. -->
-
- [More Information Needed]
-
- ## Downstream Use [optional]
-
- <!-- This section is for the model use when fine-tuned for a task, or when plugged into a larger ecosystem/app -->
-
- [More Information Needed]
-
- ## Out-of-Scope Use
-
- <!-- This section addresses misuse, malicious use, and uses that the model will not work well for. -->
-
- [More Information Needed]
-
- # Bias, Risks, and Limitations
-
- <!-- This section is meant to convey both technical and sociotechnical limitations. -->
-
- [More Information Needed]
-
- ## Recommendations
-
- <!-- This section is meant to convey recommendations with respect to the bias, risk, and technical limitations. -->
-
- Users (both direct and downstream) should be made aware of the risks, biases and limitations of the model. More information needed for further recomendations.
-
- # Training Details
-
- ## Training Data
-
- <!-- This should link to a Data Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
-
- [More Information Needed]
-
- ## Training Procedure [optional]
-
- <!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
-
- ### Preprocessing
-
- [More Information Needed]
-
- ### Speeds, Sizes, Times
-
- <!-- This section provides information about throughput, start/end time, checkpoint size if relevant, etc. -->
-
- [More Information Needed]
-
- # Evaluation
-
- <!-- This section describes the evaluation protocols and provides the results. -->
-
- ## Testing Data, Factors & Metrics
-
- ### Testing Data
-
- <!-- This should link to a Data Card if possible. -->
-
- [More Information Needed]
-
- ### Factors
-
- <!-- These are the things the evaluation is disaggregating by, e.g., subpopulations or domains. -->
-
- [More Information Needed]
-
- ### Metrics
-
- <!-- These are the evaluation metrics being used, ideally with a description of why. -->
-
- [More Information Needed]
-
- ## Results
-
- [More Information Needed]
-
- # Model Examination [optional]
-
- <!-- Relevant interpretability work for the model goes here -->
-
- [More Information Needed]
-
- # Environmental Impact
-
- <!-- Total emissions (in grams of CO2eq) and additional considerations, such as electricity usage, go here. Edit the suggested text below accordingly -->
-
- Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700).
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- # Technical Specifications [optional]
-
- ## Model Architecture and Objective

- [More Information Needed]

- ## Compute Infrastructure

- [More Information Needed]

- ### Hardware

- [More Information Needed]

- ### Software

- [More Information Needed]

  # Citation [optional]

- <!-- If there is a paper or blog post introducing the model, the APA and Bibtex information for that should go in this section. -->
-
  **BibTeX:**

- [More Information Needed]
-
- **APA:**
-
- [More Information Needed]
-
- # Glossary [optional]

- <!-- If relevant, include terms and calculations in this section that can help readers understand the model or model card. -->
-
- [More Information Needed]
-
- # More Information [optional]
-
- [More Information Needed]
-
- # Model Card Authors [optional]
-
- [More Information Needed]

  # Model Card Contact

- [More Information Needed]
-
- # How to Get Started with the Model
-
- Use the code below to get started with the model.
-
- <details>
- <summary> Click to expand </summary>

- [More Information Needed]

  </details>
 
  ---
+ license: apache-2.0
+ language: en
+ tags:
+ - deberta-v3-base
+ - text-classification
+ - nli
+ - natural-language-inference
+ - multitask
+ - extreme-mtl
+ pipeline_tag: zero-shot-classification
  datasets:
  - hellaswag
  - ag_news

  - winogrande
  - relbert/lexical_relation_classification
  - metaeval/linguisticprobing
+ metrics:
+ - accuracy
+ library_name: transformers
  ---

+ # Model Card for DeBERTa-v3-base-tasksource-nli

+ DeBERTa model jointly fine-tuned on 444 tasks of the tasksource collection: https://github.com/sileod/tasksource/
+ This is the model with the MNLI classifier on top. Its encoder was trained on many datasets, including bigbench and Anthropic/hh-rlhf, alongside many NLI and classification tasks, using SequenceClassification heads on a single shared encoder.
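
Since the released head is the MNLI classifier, the model can score premise/hypothesis pairs directly. A minimal sketch (assuming the hub id `sileod/deberta-v3-base-tasksource-nli`; label names are read from the checkpoint config rather than hardcoded):

```python
# Hedged sketch: score an NLI pair with the MNLI-style head.
# Assumption: the hub id "sileod/deberta-v3-base-tasksource-nli".
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "sileod/deberta-v3-base-tasksource-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

premise = "A soccer game with multiple males playing."
hypothesis = "Some men are playing a sport."
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
with torch.no_grad():
    probs = model(**inputs).logits.softmax(-1).squeeze(0)
# Label order comes from the checkpoint config, not hardcoded here.
for idx, p in enumerate(probs.tolist()):
    print(model.config.id2label[idx], round(p, 3))
```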

+ Each task had a task-specific CLS embedding, which was dropped 10% of the time to facilitate using the model without it. All multiple-choice tasks used the same classification layers. Classification tasks shared head weights when their label sets matched.
+ The number of examples per task was capped to 64. The model was trained for 20k steps with a batch size of 384 and a peak learning rate of 2e-5.
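
As an architecture illustration only (all names and sizes below are simplifying assumptions, not tasksource's actual code), the shared-encoder setup described above might be sketched like this:

```python
# Hedged sketch of the described multi-task setup: one shared encoder,
# a per-task CLS embedding dropped 10% of the time, and heads shared
# between tasks. Simplification: heads are grouped by label count here,
# whereas the card shares weights between tasks whose labels match.
import torch
import torch.nn as nn
from transformers import AutoModel

class SharedEncoderMTL(nn.Module):
    def __init__(self, task_num_labels, cls_drop_prob=0.1):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("microsoft/deberta-v3-base")
        hidden = self.encoder.config.hidden_size
        self.task_cls = nn.Embedding(len(task_num_labels), hidden)
        self.cls_drop_prob = cls_drop_prob
        self.task_num_labels = task_num_labels
        # One head per distinct label count, shared between matching tasks.
        self.heads = nn.ModuleDict(
            {str(n): nn.Linear(hidden, n) for n in set(task_num_labels)})

    def forward(self, input_ids, attention_mask, task_id: int):
        embeds = self.encoder.embeddings.word_embeddings(input_ids).clone()
        # Swap the task-specific CLS embedding in 90% of the time.
        if torch.rand(()) >= self.cls_drop_prob:
            embeds[:, 0] = self.task_cls.weight[task_id]
        out = self.encoder(inputs_embeds=embeds, attention_mask=attention_mask)
        return self.heads[str(self.task_num_labels[task_id])](
            out.last_hidden_state[:, 0])
```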

+ You can fine-tune this model for multiple-choice or any classification task (e.g. NLI), as with any DeBERTa-v2 model.
+ This model has strong validation performance on many tasks (e.g. 70% on WNLI).
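
Given the `pipeline_tag: zero-shot-classification` declared in the front matter, usage through the standard transformers pipeline would look roughly like this (a sketch, again assuming the hub id):

```python
# Hedged sketch: zero-shot classification via the NLI head.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="sileod/deberta-v3-base-tasksource-nli",  # assumed hub id
)
result = classifier(
    "The central bank raised interest rates by 50 basis points.",
    candidate_labels=["economics", "sports", "politics"],
)
print(result["labels"][0], round(result["scores"][0], 3))
```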

+ The list of tasks is available in tasks.md.

+ Training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
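
The notebook above contains the actual training code. As a rough sketch of fine-tuning this checkpoint like any DeBERTa-v2 model, a generic single-task run could look like the following (the GLUE/RTE dataset and every hyperparameter here are illustrative assumptions, not the settings used for this model):

```python
# Hedged sketch: generic fine-tuning on a new 2-label task.
from datasets import load_dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

name = "sileod/deberta-v3-base-tasksource-nli"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(name)
# Replace the 3-way NLI head with a fresh 2-label head.
model = AutoModelForSequenceClassification.from_pretrained(
    name, num_labels=2, ignore_mismatched_sizes=True)

dataset = load_dataset("glue", "rte").map(
    lambda x: tokenizer(x["sentence1"], x["sentence2"], truncation=True),
    batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="out", learning_rate=2e-5,
                           per_device_train_batch_size=16,
                           num_train_epochs=3),
    train_dataset=dataset["train"],
    eval_dataset=dataset["validation"],
    tokenizer=tokenizer,
)
trainer.train()
```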

+ ### Software

+ https://github.com/sileod/tasknet/
+ Training took 3 days on a 24GB GPU.

+ ## Model Recycling

+ [Evaluation on 36 datasets](https://ibm.github.io/model-recycling/model_gain_chart?avg=1.41&mnli_lp=nan&20_newsgroup=0.63&ag_news=0.46&amazon_reviews_multi=-0.40&anli=0.94&boolq=2.55&cb=10.71&cola=0.49&copa=10.60&dbpedia=0.10&esnli=-0.25&financial_phrasebank=1.31&imdb=-0.17&isear=0.63&mnli=0.42&mrpc=-0.23&multirc=1.73&poem_sentiment=0.77&qnli=0.12&qqp=-0.05&rotten_tomatoes=0.67&rte=2.13&sst2=0.01&sst_5bins=-0.02&stsb=1.39&trec_coarse=0.24&trec_fine=0.18&tweet_ev_emoji=0.62&tweet_ev_emotion=0.43&tweet_ev_hate=1.84&tweet_ev_irony=1.43&tweet_ev_offensive=0.17&tweet_ev_sentiment=0.08&wic=-1.78&wnli=3.03&wsc=9.95&yahoo_answers=0.17&model_name=sileod%2Fdeberta-v3-base_tasksource-420&base_name=microsoft%2Fdeberta-v3-base) using sileod/deberta-v3-base_tasksource-420 as a base model yields an average score of 80.45, compared to 79.04 for microsoft/deberta-v3-base.

+ An earlier (weaker) version of this model ranked 1st among all tested models for the microsoft/deberta-v3-base architecture as of 10/01/2023.
+ Results:

+ | 20_newsgroup | ag_news | amazon_reviews_multi | anli | boolq | cb | cola | copa | dbpedia | esnli | financial_phrasebank | imdb | isear | mnli | mrpc | multirc | poem_sentiment | qnli | qqp | rotten_tomatoes | rte | sst2 | sst_5bins | stsb | trec_coarse | trec_fine | tweet_ev_emoji | tweet_ev_emotion | tweet_ev_hate | tweet_ev_irony | tweet_ev_offensive | tweet_ev_sentiment | wic | wnli | wsc | yahoo_answers |
+ |---------------:|----------:|-----------------------:|--------:|--------:|--------:|--------:|-------:|----------:|--------:|-----------------------:|-------:|--------:|--------:|--------:|----------:|-----------------:|--------:|--------:|------------------:|--------:|--------:|------------:|--------:|--------------:|------------:|-----------------:|-------------------:|----------------:|-----------------:|---------------------:|---------------------:|--------:|--------:|--------:|----------------:|
+ | 87.042 | 90.9 | 66.46 | 59.7188 | 85.5352 | 85.7143 | 87.0566 | 69 | 79.5333 | 91.6735 | 85.8 | 94.324 | 72.4902 | 90.2055 | 88.9706 | 63.9851 | 87.5 | 93.6299 | 91.7363 | 91.0882 | 84.4765 | 95.0688 | 56.9683 | 91.6654 | 98 | 91.2 | 46.814 | 84.3772 | 58.0471 | 81.25 | 85.2326 | 71.8821 | 69.4357 | 73.2394 | 74.0385 | 72.2 |

+ For more information, see: [Model Recycling](https://ibm.github.io/model-recycling/)

  # Citation [optional]

  **BibTeX:**

+ ```bib
+ @misc{sileod23-tasksource,
+   author = {Sileo, Damien},
+   doi = {10.5281/zenodo.7473446},
+   month = {01},
+   title = {{tasksource: preprocessings for reproducibility and multitask-learning}},
+   url = {https://github.com/sileod/tasksource},
+   version = {1.5.0},
+   year = {2023}}
+ ```

  # Model Card Contact

+ damien.sileo@inria.fr

  </details>