cmarkea/bloomz-560m-nli

Cyrile committed on Mar 17

Commit e5a72e3 • 1 Parent(s): cbd4009

Update README.md

Files changed (1)

README.md +4 -0

@@ -14,6 +14,10 @@ We introduce the Bloomz-560m-NLI model, fine-tuned on the [Bloomz-560m-chat-dpo]
 ## Zero-shot Classification
 The primary appeal of training such models lies in their zero-shot classification performance: the model can classify any text against any set of labels without task-specific training. What sets the Bloomz-560m-NLI LLMs apart is their ability to model and extract information from significantly more complex and lengthy text structures than models like BERT, RoBERTa, or CamemBERT.
+The zero-shot classification task can be summarized by:
+$$P(\text{hypothesis}=i\in\mathcal{C}\mid\text{premise})=\frac{e^{P(\text{premise}=\text{entailment}\mid\text{hypothesis}=i)}}{\sum_{j\in\mathcal{C}}e^{P(\text{premise}=\text{entailment}\mid\text{hypothesis}=j)}}$$
+Here *i* is a hypothesis built by filling a template (for example, "This text is about {}.") with one of the *#C* candidate labels ("cinema", "politics", etc.), so the set of hypotheses is {"This text is about cinema.", "This text is about politics.", ...}. Each of these hypotheses is scored against the premise, which is the text we aim to classify.
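To make the normalization concrete, here is a minimal sketch in plain Python of how hypotheses can be built from a template and how per-hypothesis entailment scores can be turned into a distribution over labels. The helper names `build_hypotheses` and `softmax` are hypothetical, and the entailment scores are made-up placeholders; in practice they would come from the NLI model, which this sketch does not call.

```python
import math


def build_hypotheses(template, labels):
    """Fill the template once per candidate label."""
    return [template.format(label) for label in labels]


def softmax(scores):
    """Exponentiate each score and normalize so the results sum to 1."""
    shift = max(scores)  # subtracting the max does not change the result, only avoids overflow
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


labels = ["cinema", "politics"]
hypotheses = build_hypotheses("This text is about {}.", labels)

# Placeholder entailment scores, one per hypothesis (assumed values, not model output).
entailment_scores = [0.85, 0.10]

probabilities = softmax(entailment_scores)
best_label = labels[probabilities.index(max(probabilities))]

for hypothesis, probability in zip(hypotheses, probabilities):
    print(f"{hypothesis!r}: {probability:.3f}")
print("Predicted label:", best_label)
```

The predicted label is simply the one whose hypothesis receives the largest normalized score, so the ranking depends only on the entailment scores and not on the normalization itself.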
 ```python
 from transformers import pipeline