sileod committed
Commit cfe4502
1 Parent(s): 66dc4b6

Update README.md

Files changed (1): README.md (+5 -5)
README.md CHANGED
@@ -156,9 +156,9 @@ library_name: transformers
 # Model Card for DeBERTa-v3-base-tasksource-nli

- DeBERTa pretrained model jointly fine-tuned on 444 tasks of the [tasksource collection](https://github.com/sileod/tasksource/)
- You can fine-tune this model to use it for any classification or multiple-choice task, like any deberta model.
- This model has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI).
 The untuned model CLS embedding also has strong linear probing performance (90% on MNLI), due to the multitask training.

 This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets (including bigbench and Anthropic/hh-rlhf) alongside many NLI and classification tasks, with SequenceClassification heads but only one shared encoder.
@@ -167,12 +167,12 @@ The number of examples per task was capped to 64k. The model was trained for 20k
 The list of tasks is available in tasks.md

- code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing

 ### Software

 https://github.com/sileod/tasknet/
- Training took 7 days on 24GB gpu.

 ## Model Recycling
 An earlier (weaker) version of this model is ranked 1st among all models with the microsoft/deberta-v3-base architecture as of 10/01/2023.
 
 # Model Card for DeBERTa-v3-base-tasksource-nli

+ DeBERTa-v3-base jointly fine-tuned on 444 tasks of the [tasksource collection](https://github.com/sileod/tasksource/)
+ You can fine-tune this model for any classification or multiple-choice task.
+ This checkpoint has strong zero-shot validation performance on many tasks (e.g. 70% on WNLI).
 The untuned model CLS embedding also has strong linear probing performance (90% on MNLI), due to the multitask training.

 This is the shared model with the MNLI classifier on top. Its encoder was trained on many datasets (including bigbench and Anthropic/hh-rlhf) alongside many NLI and classification tasks, with SequenceClassification heads but only one shared encoder.
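Because the MNLI classifier sits on top of the shared encoder, this checkpoint can be loaded with the standard `zero-shot-classification` pipeline from `transformers`. A minimal sketch, assuming the checkpoint is published on the Hub as `sileod/deberta-v3-base-tasksource-nli` (the example sentence and candidate labels are illustrative):

```python
# Sketch: zero-shot classification via the NLI head.
# Assumption: the Hub id "sileod/deberta-v3-base-tasksource-nli" points at this checkpoint.
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="sileod/deberta-v3-base-tasksource-nli",
)

result = classifier(
    "The new GPU cut training time from ten days to two.",
    candidate_labels=["hardware", "cooking", "politics"],
)
# result["labels"] is sorted by score; result["scores"] are softmax-normalized.
print(result["labels"][0])
```

Under the hood the pipeline scores each label by framing "This example is {label}." as an NLI hypothesis against the input premise.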

 The list of tasks is available in tasks.md

+ tasksource training code: https://colab.research.google.com/drive/1iB4Oxl9_B5W3ZDzXoWJN-olUbqLBxgQS?usp=sharing

 ### Software

 https://github.com/sileod/tasknet/
+ Training took 7 days on an RTX 6000 24GB GPU.

 ## Model Recycling
 An earlier (weaker) version of this model is ranked 1st among all models with the microsoft/deberta-v3-base architecture as of 10/01/2023.
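The CLS linear-probing setup the card mentions can be sketched by loading the encoder without its classification head (`AutoModel`) and taking the first token's hidden state; the checkpoint id and example sentence are assumptions, and any probe (e.g. scikit-learn logistic regression) can then be trained on the resulting vectors:

```python
# Sketch: extract the untuned CLS embedding for linear probing.
# Assumption: the Hub id "sileod/deberta-v3-base-tasksource-nli" points at this checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

name = "sileod/deberta-v3-base-tasksource-nli"
tokenizer = AutoTokenizer.from_pretrained(name)
encoder = AutoModel.from_pretrained(name)  # shared encoder only, no SequenceClassification head
encoder.eval()

with torch.no_grad():
    batch = tokenizer(["A soccer game with multiple males playing."], return_tensors="pt")
    # First position is the [CLS] token; shape (batch, hidden_size).
    cls_embedding = encoder(**batch).last_hidden_state[:, 0]
```

Freezing the encoder and fitting only a linear classifier on `cls_embedding` is what "linear probing" refers to above.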