Update README.md
# Model Card for DeBERTa-v3-base-tasksource-nli

DeBERTa-v3-base jointly fine-tuned on 444 tasks of the [tasksource collection](https://github.com/sileod/tasksource/).
You can fine-tune this model to use it for any classification or multiple-choice task.
This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI).
The untuned model's CLS embedding also has strong linear-probing performance (90% on MNLI), due to the multitask training.
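As a minimal sketch of the zero-shot usage described above, the checkpoint can be driven through the standard `transformers` zero-shot-classification pipeline (the Hub id below is assumed to match this card; candidate labels are illustrative):

```python
# Zero-shot classification via the model's NLI head, using the
# standard transformers pipeline API.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="sileod/deberta-v3-base-tasksource-nli",
)

result = classifier(
    "The team released a new open-source NLP library today.",
    candidate_labels=["technology", "sports", "cooking"],
)
# result["labels"] is sorted by score; result["scores"] holds the probabilities.
print(result["labels"][0])
```

The pipeline converts each candidate label into an NLI hypothesis and ranks labels by the entailment probability, which is why an NLI-tuned checkpoint like this one works off the shelf.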

This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets, including bigbench and Anthropic/hh-rlhf, alongside many NLI and classification tasks, with a SequenceClassification head per task and a single shared encoder.
The number of examples per task was capped to 64k. The model was trained for 20k
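The linear-probing setup mentioned above (training a linear classifier on frozen CLS embeddings) can be sketched as follows; the Hub id and the use of `AutoModel` to load just the shared encoder are assumptions based on the standard `transformers` API:

```python
# Extracting frozen CLS embeddings from the shared encoder, e.g. as
# features for a linear probe (logistic regression, etc.).
import torch
from transformers import AutoModel, AutoTokenizer

name = "sileod/deberta-v3-base-tasksource-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)  # drops the classification head

batch = tokenizer(
    ["A man is playing guitar.", "Someone is making music."],
    padding=True,
    return_tensors="pt",
)
with torch.no_grad():
    hidden = encoder(**batch).last_hidden_state  # (batch, seq_len, 768)
cls_embeddings = hidden[:, 0]  # first token = CLS, one vector per input
print(cls_embeddings.shape)  # torch.Size([2, 768])
```

These frozen vectors can then be fed to any off-the-shelf linear classifier to reproduce the probing setup.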

The list of tasks is available in tasks.md.

tasksource training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing
### Software

https://github.com/sileod/tasknet/

Training took 7 days on an RTX6000 24GB GPU.

## Model Recycling

An earlier (weaker) version of this model is ranked 1st among all models with the microsoft/deberta-v3-base architecture as of 10/01/2023.