MoritzLaurer/mDeBERTa-v3-base-mnli-xnli

Zero-Shot Classification

text-classification

MoritzLaurer HF staff committed on Mar 12, 2022

Commit 614e448 • 1 Parent(s): 7de8833

Update README.md

Files changed (1)

README.md +1 -1

@@ -64,7 +64,7 @@ print(prediction)
 This model was trained on the XNLI development dataset and the MNLI train dataset. The XNLI development set consists of 5010 professionally translated texts for each of 15 languages (see [this paper](https://arxiv.org/pdf/1809.05053.pdf)). Note that the XNLI contains a training set of 15 machine translated versions of the MNLI dataset for 15 languages, but due to quality issues with these machine translations, this model was only trained on the professional translations from the XNLI development set and the original English MNLI training set (392 702 texts). Not using machine translated texts can avoid overfitting the model to the 15 languages; avoids catastrophic forgetting of the other 85 languages mDeBERTa was pre-trained on; and significantly reduces training costs.
 ### Training procedure
-DeBERTa-v3-base-mnli was trained using the Hugging Face trainer with the following hyperparameters.
+mDeBERTa-v3-base-mnli-xnli was trained using the Hugging Face trainer with the following hyperparameters.
 ```
 training_args = TrainingArguments(
     num_train_epochs=2,              # total number of training epochs