Updates README.md so that it no longer contains LaTeX content incompatible with Hugging Face's model cards.
README.md CHANGED
### Direct Use

This fine-tuned version of [XLM-RoBERTa-base](https://huggingface.co/xlm-roberta-base) performs Natural Language Inference (NLI), which is a text classification task. It therefore classifies pairs of sentences of the form *(premise, hypothesis)* into one of the following classes: *ENTAILMENT*, *PARAPHRASE*, or *NONE*. Salvatore's definition [1] of *ENTAILMENT* is assumed to be the same as the one underlying [ASSIN](https://huggingface.co/datasets/assin)'s labels.

*PARAPHRASE* and *NONE* are not defined in [1]. Therefore, it is assumed that in [ASSIN](https://huggingface.co/datasets/assin), given a pair of sentences *(premise, hypothesis)*, *hypothesis* is a *PARAPHRASE* of *premise* if *premise* entails *hypothesis* and vice versa. If *(premise, hypothesis)* has neither an *ENTAILMENT* nor a *PARAPHRASE* relationship, the pair is labeled *NONE* in [ASSIN](https://huggingface.co/datasets/assin).
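These assumptions can be made concrete in a few lines. The sketch below is illustrative only; the function name and the boolean inputs are hypothetical and are not part of ASSIN or of this model:

```python
def assin_label(premise_entails_hypothesis: bool, hypothesis_entails_premise: bool) -> str:
    """Derive the three-way label from two directional entailment
    judgments, following the assumptions described above."""
    if premise_entails_hypothesis and hypothesis_entails_premise:
        return "PARAPHRASE"  # entailment holds in both directions
    if premise_entails_hypothesis:
        return "ENTAILMENT"  # premise entails hypothesis, but not vice versa
    return "NONE"            # no entailment relationship from premise to hypothesis
```

For example, `assin_label(True, True)` yields `"PARAPHRASE"`, while `assin_label(True, False)` yields `"ENTAILMENT"`.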

<!-- <div id="assin_function">

**Definition 1.** Given a pair of sentences $(premise, hypothesis)$, let $\hat{f}^{(xlmr\_base)}$ be the fine-tuned model's inference function:

$$
\hat{f}^{(xlmr\_base)}(premise, hypothesis) =
\begin{cases}
ENTAILMENT, & \text{if $premise$ entails $hypothesis$}\\
PARAPHRASE, & \text{if $premise$ entails $hypothesis$ and $hypothesis$ entails $premise$}\\
NONE & \text{otherwise}
\end{cases}
$$
</div>

The $(premise, hypothesis)$ entailment definition used is the same as the one found in Salvatore's paper [1]. -->

<!-- ## Bias, Risks, and Limitations

using the *cross-tests* approach described in [this section](#evaluation), the models' performance was measured using different datasets and metrics.
</ol>

<!-- ##### Column Renaming
The **Hugging Face** ```transformers``` module's ```DataCollator``` used by its ```Trainer``` requires that the ```class label``` column of the collated dataset be called ```label```. [ASSIN](https://huggingface.co/datasets/assin)'s class label column for each hypothesis/premise pair is called ```entailment_judgement```. Therefore, as the first step of the data preprocessing pipeline, the column ```entailment_judgement``` was renamed to ```label``` so that the ```transformers``` ```Trainer``` could be used. -->

#### Hyperparameter Tuning

<!-- The model's training hyperparameters were chosen according to the following definition:

<div id="hyperparameter_tuning">

$$
Hyperparams = \argmax_{hyp}(eval\_acc(\hat{f}^{(xlmr\_base)}_{hyp}, assin\_validation))
$$
</div> -->

The following hyperparameters were tested in order to maximize the evaluation accuracy:

- **Number of Training Epochs:** (1, 2, 3)
- **Per Device Train Batch Size:** (16, 32)
- **Learning Rate:** (1e-6, 2e-6, 3e-6)
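Concretely, this amounts to an exhaustive grid search over the 3 × 2 × 3 = 18 combinations of the values above, keeping whichever configuration maximizes validation accuracy. A minimal sketch, with hypothetical names, where `eval_acc` stands in for actually fine-tuning and evaluating a model on ASSIN's validation split:

```python
from itertools import product

def best_hyperparams(grid, eval_acc):
    """Evaluate every combination in the grid and return the one
    with the highest validation accuracy."""
    names = list(grid)
    best_config, best_acc = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        acc = eval_acc(config)  # fine-tune and evaluate with this config
        if acc > best_acc:
            best_config, best_acc = config, acc
    return best_config

# The grid reported above: 3 * 2 * 3 = 18 configurations.
grid = {
    "num_train_epochs": (1, 2, 3),
    "per_device_train_batch_size": (16, 32),
    "learning_rate": (1e-6, 2e-6, 3e-6),
}
```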

The hyperparameter tuning experiments were run and tracked using the [Weights & Biases API](https://docs.wandb.ai/ref/python/public-api/api) and can be found at this [link](https://wandb.ai/gio_projs/assin_xlm_roberta_v5?workspace=user-giogvn).

The [hyperparameter tuning](#hyperparameter-tuning) performed yielded the following values:

- **Number of Training Epochs:** 3
- **Per Device Train Batch Size:** 16
- **Learning Rate:** 3e-6
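If these values are reproduced with Hugging Face's ```transformers``` ```Trainer```, the corresponding configuration would look roughly like the sketch below; ```output_dir``` is a placeholder, all other arguments are left at their defaults, and this is not necessarily the exact training script used:

```python
from transformers import TrainingArguments

# Best values found by the hyperparameter search above.
training_args = TrainingArguments(
    output_dir="xlm_roberta_base_assin",  # placeholder path
    num_train_epochs=3,
    per_device_train_batch_size=16,
    learning_rate=3e-6,
)
```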

## Evaluation

### ASSIN

Testing this model on [ASSIN](https://huggingface.co/datasets/assin)'s test split is straightforward because the model was trained on [ASSIN](https://huggingface.co/datasets/assin)'s training set and therefore predicts the same set of labels found in its test split.

### ASSIN2

<!-- Given a pair of sentences $(premise, hypothesis)$, $\hat{f}^{(xlmr\_base)}(premise, hypothesis)$ can be equal to $PARAPHRASE, ENTAILMENT$ or $NONE$ as defined in [Definition 1](#assin_function). -->

[ASSIN2](https://huggingface.co/datasets/assin2)'s test split's class label column has only two possible values: $ENTAILMENT$ and $NONE$. Therefore, in order to test this model on [ASSIN2](https://huggingface.co/datasets/assin2)'s test split, some mapping must be done to make [ASSIN2](https://huggingface.co/datasets/assin2)'s class labels compatible with the model's inference function.

More information on how such mapping is performed will be available in the [referred paper](#model-sources).

### Metrics