---
pipeline_tag: zero-shot-classification
tags:
- zero-shot-classification
- nli
language:
- es
datasets:
- hackathon-pln-es/nli-es
widget:
- text: "Para detener la pandemia, es importante que todos se presenten a vacunarse."
  candidate_labels: "salud, deporte, entretenimiento"
---

# A zero-shot classifier based on bertin-roberta-base-spanish

This model was trained from the base model `bertin-roberta-base-spanish` as a **cross-encoder** for the natural language inference (NLI) task. A cross-encoder takes a sentence pair as input and outputs a label; here it learns to predict one of three labels: "contradiction": 0, "entailment": 1, "neutral": 2.

You can use it with Hugging Face's zero-shot pipeline to make **zero-shot classifications**. Given a sentence and an arbitrary set of labels/topics, it outputs the likelihood of the sentence belonging to each topic.

## Usage (HuggingFace Transformers)

The simplest way to use the model is through the Hugging Face Transformers pipeline tool. Initialize the pipeline with the task "zero-shot-classification" and select "hackathon-pln-es/bertin-roberta-base-zeroshot-esnli" as the model.

```python
from transformers import pipeline

# Load the zero-shot pipeline with this model.
classifier = pipeline(
    "zero-shot-classification",
    model="hackathon-pln-es/bertin-roberta-base-zeroshot-esnli",
)

# Score one sentence against a set of candidate topics.
classifier(
    "El autor se perfila, a los 50 años de su muerte, como uno de los grandes de su siglo",
    candidate_labels=["cultura", "sociedad", "economia", "salud", "deportes"],
    hypothesis_template="Esta oración es sobre {}.",  # Spanish template
)
```

The `hypothesis_template` parameter is important and should be in Spanish, since each candidate label is inserted into it to form the hypothesis that the model compares with the input sentence. **In the widget on the right, this parameter is set to its default value, "This example is {}.", so different results are expected.**

## Training

We used [sentence-transformers](https://www.SBERT.net) to train the model.
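
To make the label mapping stated in this card concrete, here is a minimal sketch of how premise/hypothesis pairs could be encoded before cross-encoder training. The helper name `encode_pairs` and the three example rows are hypothetical, invented for illustration; they are not taken from the training dataset or from the authors' training script.

```python
# Label mapping as stated in this model card.
LABEL2ID = {"contradiction": 0, "entailment": 1, "neutral": 2}


def encode_pairs(rows):
    """Turn (premise, hypothesis, label_name) rows into ([premise, hypothesis], label_id) samples.

    Hypothetical helper for illustration only.
    """
    samples = []
    for premise, hypothesis, label_name in rows:
        samples.append(([premise, hypothesis], LABEL2ID[label_name]))
    return samples


# Invented rows, for illustration only (not from the dataset).
rows = [
    ("El perro duerme en el sofá.", "Un animal está descansando.", "entailment"),
    ("El perro duerme en el sofá.", "El perro corre por el parque.", "contradiction"),
    ("El perro duerme en el sofá.", "El sofá es nuevo.", "neutral"),
]

samples = encode_pairs(rows)
for texts, label_id in samples:
    print(label_id, texts)
```

With sentence-transformers, each sample would typically be wrapped as `InputExample(texts=texts, label=label_id)` and passed to a `CrossEncoder` created with `num_labels=3`. The actual training script and hyperparameters are not given in this card, so treat this as an assumption about the general setup rather than the authors' exact procedure.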

**Dataset**

We used a collection of Natural Language Inference datasets as training data:

- [ESXNLI](https://raw.githubusercontent.com/artetxem/esxnli/master/esxnli.tsv), Spanish portion only
- [SNLI](https://nlp.stanford.edu/projects/snli/), automatically translated
- [MultiNLI](https://cims.nyu.edu/~sbowman/multinli/), automatically translated

The full dataset is available [here](https://huggingface.co/datasets/hackathon-pln-es/nli-es).

## Authors

- [Anibal Pérez](https://huggingface.co/Anarpego)
- [Emilio Tomás Ariza](https://huggingface.co/medardodt)
- [Lautaro Gesuelli Pinto](https://huggingface.co/Lautaro)
- [Mauricio Mazuecos](https://huggingface.co/mmazuecos)