MoritzLaurer/deberta-v3-base-zeroshot-v2.0-c

@@ -7,6 +7,9 @@ tags:
 pipeline_tag: zero-shot-classification
 library_name: transformers
 license: mit
 ---
 # Model description:  deberta-v3-base-zeroshot-v2.0
@@ -61,7 +64,7 @@ print(output)
 The model was evaluated on 28 different text classification tasks with the [balanced_accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html) metric.
 The main reference point is `facebook/bart-large-mnli` which is at the time of writing (27.03.24) the most used commercially-friendly 0-shot classifier.
-The different `...zeroshot-v2.0` models were all trained with the same data and the only difference the the underlying foundation model.
 ![results_aggreg_v2.0](https://raw.githubusercontent.com/MoritzLaurer/zeroshot-classifier/e859471dd183ad44b705c047130433301386aab8/v2_synthetic_data/results/zeroshot-v2.0-aggreg.png)
@@ -101,17 +104,19 @@ The different `...zeroshot-v2.0` models were all trained with the same data and
 ## When to use which model
-- deberta-v3 vs. roberta: deberta-v3 performs clearly better than roberta, but it is slower.
 roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention.
 These containers are a good choice for production use-cases. tl;dr: For accuracy, use a deberta-v3 model.
 If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container and [HF Inference Endpoints](https://ui.endpoints.huggingface.co/catalog)).
 - `zeroshot-v1.1` vs. `zeroshot-v2.0` models: My `zeroshot-v1.1` models (see [Zeroshot Classifier Collection](https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f)))
 perform better on these 28 datasets, but they are trained on several datasets with non-commercial licenses.
 For commercial users, I therefore recommend using a v2.0 model and non-commercial users might get better performance with a v1.1 model.
 ## Reproduction
-Reproduction code is available here, in the `v2_synthetic_data` directory: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
@@ -165,4 +170,4 @@ classes_verbalized = ["CDU", "SPD", "Greens"]
 zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-base-zeroshot-v2.0")
 output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
 print(output)
-```

 pipeline_tag: zero-shot-classification
 library_name: transformers
 license: mit
+datasets:
+- nyu-mll/multi_nli
+- fever
 ---
 # Model description:  deberta-v3-base-zeroshot-v2.0
 The model was evaluated on 28 different text classification tasks with the [balanced_accuracy](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.balanced_accuracy_score.html) metric.
 The main reference point is `facebook/bart-large-mnli` which is at the time of writing (27.03.24) the most used commercially-friendly 0-shot classifier.
+The different `zeroshot-v2.0` models were all trained with the same data and the only difference is the underlying foundation model.
 ![results_aggreg_v2.0](https://raw.githubusercontent.com/MoritzLaurer/zeroshot-classifier/e859471dd183ad44b705c047130433301386aab8/v2_synthetic_data/results/zeroshot-v2.0-aggreg.png)
 ## When to use which model
+- deberta-v3-zeroshot vs. roberta-zeroshot: deberta-v3 performs clearly better than roberta, but it is slower.
 roberta is directly compatible with Hugging Face's production inference TEI containers and flash attention.
 These containers are a good choice for production use-cases. tl;dr: For accuracy, use a deberta-v3 model.
 If production inference speed is a concern, you can consider a roberta model (e.g. in a TEI container and [HF Inference Endpoints](https://ui.endpoints.huggingface.co/catalog)).
 - `zeroshot-v1.1` vs. `zeroshot-v2.0` models: My `zeroshot-v1.1` models (see [Zeroshot Classifier Collection](https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f)))
 perform better on these 28 datasets, but they are trained on several datasets with non-commercial licenses.
 For commercial users, I therefore recommend using a v2.0 model and non-commercial users might get better performance with a v1.1 model.
+- The latest updates on new models are always available in the [Zeroshot Classifier Collection](https://huggingface.co/collections/MoritzLaurer/zeroshot-classifiers-6548b4ff407bb19ff5c3ad6f).
 ## Reproduction
+Reproduction code is available in the `v2_synthetic_data` directory here: https://github.com/MoritzLaurer/zeroshot-classifier/tree/main
 zeroshot_classifier = pipeline("zero-shot-classification", model="MoritzLaurer/deberta-v3-base-zeroshot-v2.0")
 output = zeroshot_classifier(text, classes_verbalized, hypothesis_template=hypothesis_template, multi_label=False)
 print(output)
+```
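
The evaluation section in this README reports `balanced_accuracy`, which scikit-learn defines as the unweighted mean of per-class recall. The sketch below re-implements that definition in plain Python to show why it differs from plain accuracy on imbalanced labels. The helper name `balanced_accuracy` and the toy labels are illustrative assumptions, not data or code from the model's evaluation.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Unweighted mean of per-class recall (assumes y_true is non-empty)."""
    correct = defaultdict(int)  # correctly predicted examples per true class
    total = defaultdict(int)    # number of examples per true class
    for truth, pred in zip(y_true, y_pred):
        total[truth] += 1
        if truth == pred:
            correct[truth] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)


# Hypothetical, imbalanced toy labels (not taken from the 28 evaluation tasks)
y_true = ["CDU", "CDU", "CDU", "CDU", "SPD", "Greens"]
y_pred = ["CDU", "CDU", "CDU", "CDU", "CDU", "Greens"]

# Plain accuracy: 5 of 6 predictions are correct
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Balanced accuracy: recall is 4/4 for CDU, 0/1 for SPD, 1/1 for Greens
balanced = balanced_accuracy(y_true, y_pred)

print(f"accuracy={plain_accuracy:.3f} balanced_accuracy={balanced:.3f}")
```

Because every class counts equally, the single missed `SPD` example pulls the balanced score well below plain accuracy. For real evaluations, `sklearn.metrics.balanced_accuracy_score` is the linked reference implementation.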