Update Readme
README.md CHANGED
@@ -13,10 +13,13 @@ widget:
 ---
 
 
-# A zero-shot classifier based on bertin-roberta-base-spanish
+# A zero-shot classifier based on bertin-roberta-base-spanish
+This model was trained on top of `bertin-roberta-base-spanish` using a **cross-encoder** for the NLI task. A cross-encoder takes a sentence pair as input and outputs a label, so the model learns to predict the labels "contradiction": 0, "entailment": 1, and "neutral": 2.
+
+You can use it with Hugging Face's zero-shot pipeline to make **zero-shot classifications**. Given a sentence and an arbitrary set of labels/topics, it will output the likelihood of the sentence belonging to each topic.
 
 ## Usage (HuggingFace Transformers)
-
+The simplest way to use the model is through the Hugging Face Transformers pipeline. Initialize the pipeline with the task "zero-shot-classification" and select "hackathon-pln-es/bertin-roberta-base-zeroshot-esnli" as the model.
 
 ```python
 from transformers import pipeline
@@ -26,7 +29,7 @@ classifier = pipeline("zero-shot-classification",
 classifier(
     "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo",
     candidate_labels=["cultura", "sociedad", "economia", "salud", "deportes"],
-    hypothesis_template="
+    hypothesis_template="Esta oración es sobre {}."
 )
 ```
 
@@ -34,6 +37,8 @@ The `hypothesis_template` parameter is important and should be in Spanish. **In
 
 ## Training
 
+We used [sentence-transformers](https://www.SBERT.net) to train the model.
+
 **Dataset**
 
 We used a collection of datasets of Natural Language Inference as training data:
@@ -41,15 +46,7 @@ We used a collection of datasets of Natural Language Inference as training data:
 - [SNLI](https://nlp.stanford.edu/projects/snli/), automatically translated
 - [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/), automatically translated
 
-The whole dataset used is available [here](https://huggingface.co/datasets/hackathon-pln-es/nli-es).
-
-## Full Model Architecture
-```
-SentenceTransformer(
-  (0): Transformer({'max_seq_length': 514, 'do_lower_case': False}) with Transformer model: RobertaModel
-  (1): Pooling({'word_embedding_dimension': 768, 'pooling_mode_cls_token': False, 'pooling_mode_mean_tokens': True, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False})
-)
-```
+The whole dataset used is available [here](https://huggingface.co/datasets/hackathon-pln-es/nli-es).
 
 ## Authors
 
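The cross-encoder behavior the updated README describes can also be exercised directly, without the zero-shot pipeline, by scoring a single premise/hypothesis pair. Below is a minimal sketch using sentence-transformers' `CrossEncoder`; loading the published checkpoint this way, and the class order ("contradiction": 0, "entailment": 1, "neutral": 2), are assumptions taken from the README text rather than documented behavior of this model.

```python
# Sketch: scoring one premise/hypothesis pair with a cross-encoder.
# Assumes the checkpoint loads as a sentence-transformers CrossEncoder and
# that the class order matches the README (contradiction, entailment, neutral).
from sentence_transformers import CrossEncoder

model = CrossEncoder("hackathon-pln-es/bertin-roberta-base-zeroshot-esnli")

premise = "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo"
hypothesis = "Esta oración es sobre cultura."  # hypothesis built from the template

# predict() returns one score per class for each input pair.
scores = model.predict([(premise, hypothesis)])[0]
labels = ["contradiction", "entailment", "neutral"]
print(dict(zip(labels, scores)))
```

The zero-shot pipeline in the README does essentially this for every candidate label, filling `hypothesis_template` with each label in turn and ranking the entailment scores.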
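The Training section names sentence-transformers but does not show the training loop. A hedged sketch of what cross-encoder fine-tuning on the translated NLI pairs could look like with that library; the base-model id, the inline examples, and all hyperparameters here are illustrative assumptions, not the authors' actual configuration.

```python
# Illustrative sketch only: fine-tuning a cross-encoder for 3-way NLI with
# sentence-transformers. Data, base-model id, and hyperparameters are assumed.
from torch.utils.data import DataLoader
from sentence_transformers import CrossEncoder, InputExample

label2id = {"contradiction": 0, "entailment": 1, "neutral": 2}

# In practice the pairs would come from the nli-es dataset linked above.
train_samples = [
    InputExample(texts=["Un hombre toca la guitarra.", "Un hombre hace música."],
                 label=label2id["entailment"]),
    InputExample(texts=["Un hombre toca la guitarra.", "Un hombre duerme."],
                 label=label2id["contradiction"]),
]

model = CrossEncoder("bertin-project/bertin-roberta-base-spanish", num_labels=3)
train_dataloader = DataLoader(train_samples, shuffle=True, batch_size=16)

# fit() tokenizes and batches the pairs via its own collate function.
model.fit(train_dataloader=train_dataloader, epochs=1, warmup_steps=100)
model.save("bertin-roberta-base-zeroshot-esnli")
```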