gorkaartola committed
Commit 1dccb8f
1 Parent(s): 5f789fe

Upload README.md

Files changed (1)
  1. README.md +36 -30
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- title: Zero Shot Classifier Tester
  emoji: 📊
  colorFrom: purple
  colorTo: yellow
@@ -9,40 +9,34 @@ app_file: app.py
  pinned: false
  ---

- # Card for Zero_Shot_Classifier_Tester

  ## Description
- With this app you can test and compare different zero shot approaches to classify sentences through Natural Language Inference (NLI) comparison of the sentences to be classified and diferent queries that represent each class, evaluating the *entailment* or *contradition* between two sentences.

  ## How to Use
- Please define the configuration to be tested selecting one of the avalable options for *test_dataset*, *metric_selector*, *model_selector*, *queries_selector* and *prompt_selector*, and as many choices of *preditions_strategy_selector*. The available input choices for each parameter for the tests are included in the file [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) or its clone if run the app locally, and can be extended with further posibilities by including them in the same file. To include more choices please insert new keys in the corresponding *dict* included in [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) with a *dict value* non already used in the same *dict*, and complying with the rules described in the *Inputs* chapter of this card for each parameter.

  ### Inputs

- - **metric_selector** *(str or os.PathLike)*: *Evaluation module identifier* on the HuggingFace evaluate repo or local path to the metric script. The metric must include, in addition to the conventinal *predictions* and *references* inputs a *kwarg* named *prediction_strategies (list(str))*. The metric must be able to handle a family of prediction strategies which must be included within the options lists for the parameter *prediction_strategy_selector*. Also the metric defines the minimum characteristics of the data comprised in the *test_dataset* parameter and at least adecuated test dataset must included within the options lists for the parameter *test_dataset*. The app currently includes the following metrics:

- - [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples). This metric is specially designed to measure the performance of sentece classification models over multiclass test datasets containing both True Positive samples, meaning that the label associated to the sentence in the sample is correctly assigned, and False Positive samples, meaning that the label associated to the sentence in the sample is incorrectly assigned.
-
- The *prediction_strategies* implemented in this metric are:
- - *argmax*, which takes the highest value of the softmax inference logits to select the prediction.
- - *threshold*, which takes all softmax inference logits above a certain value to select the predictions.
- - *topk*, which takes the highest *k* softmax inference logits to select the predictions.

- The minimum fields required by this metric for the test datasets are the following:
- - *title* containing the first sentence to be compared with diferent queries representing each class.
- - *label_ids* containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
- - *nli_label* which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

- Example:

- |title |label_ids |nli_label |
- |-----------------------------------------------------------------------------------|:---------:|:----------:|
- |'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
- |Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |

- - **test_dataset** *(str or os.PathLike)*: the *name* of a dataset hosted inside a repo on huggingface.co or path to a directory containing the dataset locally. The data used for testing will be only the one included in the *test* split of the dataset, contains the samples of the first sentence to be classified by NLI comparison with different queries and the minimum additional data fields for each sample is defined by the requirements of the selected metric among the ones described above.

- - [gorkaartola/SC-ZS-test_AURORA-Gold-SDG_True-Positives-and-False-Positives](https://huggingface.co/datasets/gorkaartola/SC-ZS-test_AURORA-Gold-SDG_True-Positives-and-False-Positives). This dataset is shaped to be used with the metric [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples) and includes both Gold True Positive samples and Gold False Positive samples of titles of scientific papers which are associated to a particular [Sustainable Development Goal of the UN](https://sdgs.un.org/goals) or SDG (True Positives) or are not associated to a particular SDG (False Positives). These assignations are no excluding, being possible that the titles be also related to other SDGs or not related to other SDGs. The data has been hand annotated by experts in the framework of the [AURORA Project](https://sites.google.com/vu.nl/aurora-sdg-research-dashboard/deliverables?authuser=0#h.5lufepxyapac) and published in the paper [Evaluation on accuracy of mapping science to the United Nations' Sustainable Development Goals (SDGs) of the Aurora SDG queries](https://zenodo.org/record/4917171#.YvaH5DqxVH4). The structure of the data includida in the dataset is the following:

  |SDG |True Positive Samples|False Positive Samples |
  |:-------:|:-------------------:|:---------------------:|
@@ -65,9 +59,7 @@ Please define the configuration to be tested selecting one of the avalable optio
  |17 |29 |29 |
  |Total |1,043 |1,043 |

- - **model_selector** *(str or os.PathLike)*: the *model id* of a pretrained model hosted inside a model repo on huggingface.co or path to a directory containing model weights saved using save_pretrained() to models for [Zero-Shot Classification with transformers](https://huggingface.co/models?pipeline_tag=zero-shot-classification&sort=downloads), fine-tuned models based on them, or in general models trained to perform classification through natural language inference NLI.
-
- - **queries_selector** *(str)*: combination of the the *name* of a dataset hosted inside a repo on huggingface.co followed by "-" and the name of the file that contains the queries to build the second sentence for the NLI comparison together with a *prompt*. The dataset file must include at least the following fields:
  - *query*, where each sample contains a sentence related to a certain class. One class may have multiple query sentences in the dataset; the query selected after inference to measure the *entailment* or *contradiction* of the sentences to be classified with each particular class is the one whose inference softmax logit is the highest among all queries associated with that class.
  - *label_ids*, containing the identification of the class associated to each query.

@@ -89,7 +81,7 @@ Please define the configuration to be tested selecting one of the avalable optio
  |participation of women | 4 |
  |women’s rights | 4 |

- For the representation of each query dataset in the file [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py), each dataset *name* is includad as a new *dict_key* containing as value also a *dict* containing the names of diferent .csv files with the queries data included in the dataset as shown below.

  queries = {
  'gorkaartola/SDG_queries':
@@ -116,7 +108,21 @@ Please define the configuration to be tested selecting one of the avalable optio
  - *None*.
  - *'This is '*.
  - *'The subject is '*, particularly included for the queries of the *SDG_targets.csv* file.
- - *'The Sustainable Development Goal is '* particlarly included for the queries of the *SDG_Numbers.csv* file.

  - **prediction_strategy_selector** *(str)*: identifiers of the strategies implemented in the corresponding *metric* for the selection of the predictions. The strategy choices currently included are:
  + For the [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples) metric:
@@ -125,7 +131,7 @@ Please define the configuration to be tested selecting one of the avalable optio
  + *topk*: with values of 3, 5, 7 and 9. To add a new *topk* strategy, a new key can be introduced in the *prediction_strategy_options dict* of the [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) file with a list as follows: *["topk", desired value]*.

  ### Output
- The output is a .csv file that includes, for every prediction strategy selected, a detailed table of results including, recall, precission, f1-score and accuracy of the predictions for each SDG, and both overall micro and macro averages. If cloned and run in local, this file is saved in the *Reports* folder and a file with the calculated entailment softmax logits for each query is also saved in the *Reports/ZS inference tables* folder. The files included in the Huggingface repo have been uploaded as examples of the results obtained from the app.

  ## Limitations and Bias
  Please refer to the limitations and bias of the models used for the inference.
@@ -164,9 +170,9 @@ BibLaTex
  ## Citation
  BibLaTeX
  ```
- @online{ZS_SDGs,
  author = {Gorka Artola},
- title = {Testing Zero Shot Classification by SDGs},
  year = 2022,
  url = {https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs},
  urldate = {2022-08-11}
 
  ---
+ title: Zero Shot Classifier Tester for True Positive and False Positive Samples
  emoji: 📊
  colorFrom: purple
  colorTo: yellow

  pinned: false
  ---

+ # Card for Zero_Shot_Classifier_Tester_for_TP_FP

  ## Description
+ With this app you can test and compare different zero-shot approaches to classifying sentences through Natural Language Inference (NLI): each sentence to be classified (sentence 1) is compared with sentences (sentence 2) built by combining prompts with the different queries that represent each class, evaluating the *entailment* or *contradiction* between sentence 1 and sentence 2. It is particularly intended for test datasets that contain both True Positive and False Positive samples.

  ## How to Use
+ Please define the configuration to be tested by selecting one of the available options for *model_selector*, *test_dataset*, *queries_selector*, *prompt_selector* and *metric_selector*, and as many choices of the available *prediction_strategy_selector* as desired. The available input choices for each parameter are included in the file [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py), or in its clone if you run the app locally, and can be extended with further possibilities by including them in the same file. To include more choices, insert new keys in the corresponding *dict* of the [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) file, with a *dict value* not already used in the same *dict* and complying with the rules described in the *Inputs* chapter of this card for each parameter, as sketched below.
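A minimal sketch of such an extension (the *prediction_strategy_options* name is quoted later in this card; *prompt_options* and every concrete key and value here are illustrative assumptions, so check the real file):

```python
# options.py -- illustrative sketch of adding new choices.
# 'prediction_strategy_options' is named elsewhere in this card;
# 'prompt_options' and all concrete entries are assumptions.

prompt_options = {
    "None": None,
    "This is ": "This is ",
    "The topic is ": "The topic is ",  # hypothetical new choice
}

# topk entries take the form ["topk", desired value], as described
# in the prediction_strategy_selector section below.
prediction_strategy_options = {
    "argmax": ["argmax"],
    "topk 3": ["topk", 3],
    "topk 11": ["topk", 11],  # hypothetical new choice
}
```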

  ### Inputs

+ - **model_selector** *(str or os.PathLike)*: the *model id* of a pretrained model hosted inside a model repo on huggingface.co, or the path to a directory containing model weights saved using save_pretrained(). Valid choices point to models for [Zero-Shot Classification with transformers](https://huggingface.co/models?pipeline_tag=zero-shot-classification&sort=downloads), fine-tuned models based on them, or in general models trained to perform classification through Natural Language Inference (NLI).

+ - **test_dataset** *(str or os.PathLike)*: the *name* of a dataset hosted inside a repo on huggingface.co, or the path to a directory containing the dataset locally. Only the *test* split of the dataset is used for testing; it contains the samples of the first sentence (sentence 1) to be classified by NLI comparison with different queries, and each sample must include at least the following data fields:
+ - *title*, containing the first sentence to be compared with the different queries representing each class.
+ - *label_ids*, containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
+ - *nli_label*, which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

+ Example:

+ |title |label_ids |nli_label |
+ |-----------------------------------------------------------------------------------|:---------:|:----------:|
+ |'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
+ |Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |
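To check that a test dataset exposes these fields, its *test* split can be inspected with the 🤗 *datasets* library; a minimal sketch, using the dataset option listed below:

```python
# Minimal sketch: load the test split and inspect the required fields.
from datasets import load_dataset

ds = load_dataset(
    "gorkaartola/SC-ZS-test_AURORA-Gold-SDG_True-Positives-and-False-Positives",
    split="test",
)
sample = ds[0]
print(sample["title"], sample["label_ids"], sample["nli_label"])
```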

+ Currently the available dataset options in the app are:

+ - [gorkaartola/SC-ZS-test_AURORA-Gold-SDG_True-Positives-and-False-Positives](https://huggingface.co/datasets/gorkaartola/SC-ZS-test_AURORA-Gold-SDG_True-Positives-and-False-Positives). This dataset is shaped to be used with the metric [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples) and includes both Gold True Positive and Gold False Positive samples of titles of scientific papers that are associated with a particular [Sustainable Development Goal of the UN](https://sdgs.un.org/goals) or SDG (True Positives) or are not associated with a particular SDG (False Positives). These assignments are not exclusive: a title may also be related, or unrelated, to other SDGs. The data has been hand-annotated by experts in the framework of the [AURORA Project](https://sites.google.com/vu.nl/aurora-sdg-research-dashboard/deliverables?authuser=0#h.5lufepxyapac) and published in the paper [Evaluation on accuracy of mapping science to the United Nations' Sustainable Development Goals (SDGs) of the Aurora SDG queries](https://zenodo.org/record/4917171#.YvaH5DqxVH4). The structure of the data included in the dataset is the following:

  |SDG |True Positive Samples|False Positive Samples |
  |:-------:|:-------------------:|:---------------------:|

  |17 |29 |29 |
  |Total |1,043 |1,043 |

+ - **queries_selector** *(str)*: combination of the *name* of a dataset hosted inside a repo on huggingface.co followed by "-" and the name of the .csv file that contains the queries used to build the second sentence (sentence 2) for the NLI comparison together with a *prompt*. The dataset file must include at least the following fields:
  - *query*, where each sample contains a sentence related to a certain class. One class may have multiple query sentences in the dataset; the query selected after inference to measure the *entailment* or *contradiction* of the sentences to be classified with each particular class is the one whose inference softmax logit is the highest among all queries associated with that class.
  - *label_ids*, containing the identification of the class associated to each query.

  |participation of women | 4 |
  |women’s rights | 4 |

+ For the representation of each query dataset in the file [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py), each dataset *name* is included as a new *dict_key* whose value is also a *dict* containing the names of the different .csv files with the queries data included in the dataset, as shown below.

  queries = {
  'gorkaartola/SDG_queries':

  - *None*.
  - *'This is '*.
  - *'The subject is '*, particularly included for the queries of the *SDG_targets.csv* file.
+ - *'The Sustainable Development Goal is '*, particularly included for the queries of the *SDG_Numbers.csv* file.
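Sentence 2 is formed by combining a prompt with each query; a minimal sketch (plain string concatenation is assumed here for illustration, not necessarily the app's exact code):

```python
# Minimal sketch: build sentence 2 candidates from a prompt and queries.
prompt = "The Sustainable Development Goal is "
queries = ["gender equality", "climate action"]  # illustrative queries

hypotheses = [prompt + query for query in queries]
# Each hypothesis is paired with sentence 1 (the title) and scored for
# entailment by the NLI model; the best-scoring query represents its class.
```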
+
+ - **metric_selector** *(str or os.PathLike)*: the *evaluation module identifier* on the HuggingFace evaluate repo, or the local path to a metric script. The metric must accept the following inputs (an invocation sketch follows this list):
+ - *predictions* *(numpy.array(float32)[sentences to classify, number of classes])*: numpy array with the softmax logit values of the entailment dimension of the NLI inference on the sentences to be classified, for each class.
+ - *references* *(numpy.array(int32)[sentences to classify, 2])*: numpy array with the reference *label_ids* and *nli_label* of the sentences to be classified, given in the *test_dataset*.
+ - a *kwarg* named *prediction_strategies* *(list(str))*. The metric must be able to handle a family of prediction strategies, which must be included within the option lists for the parameter *prediction_strategy_selector* in the [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) file.
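A minimal invocation sketch with the *evaluate* library, assuming the input shapes described above (the exact *prediction_strategies* format should be checked against the metric script and *options.py*):

```python
import numpy as np
import evaluate

# Entailment softmax logits for 2 sentences across 3 classes (illustrative).
predictions = np.array([[0.7, 0.2, 0.1],
                        [0.1, 0.3, 0.6]], dtype=np.float32)
# One row per sentence: [label_ids, nli_label], as described above.
references = np.array([[0, 0],
                       [2, 2]], dtype=np.int32)

metric = evaluate.load("gorkaartola/metric_for_tp_fp_samples")
results = metric.compute(predictions=predictions,
                         references=references,
                         prediction_strategies=["argmax"])
```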
+
+ The app currently includes the following metrics:
+
+ - [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples). This metric is specially designed to measure the performance of sentence classification models over multiclass test datasets containing both True Positive samples, meaning that the label associated with the sentence in the sample is correctly assigned, and False Positive samples, meaning that the label associated with the sentence in the sample is incorrectly assigned.
+
+ The *prediction_strategies* implemented in this metric are the following (a numpy sketch follows the list):
+ - *argmax*, which takes the highest value of the softmax inference logits to select the prediction.
+ - *threshold*, which takes all softmax inference logits above a certain value to select the predictions.
+ - *topk*, which takes the highest *k* softmax inference logits to select the predictions.
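A minimal numpy sketch of the three selection rules (illustrative, not the metric's actual implementation):

```python
import numpy as np

logits = np.array([0.05, 0.40, 0.25, 0.30])  # entailment softmax per class

argmax_pred = [int(np.argmax(logits))]                # -> [1]
threshold_pred = np.where(logits > 0.28)[0].tolist()  # -> [1, 3]
topk_pred = np.argsort(logits)[::-1][:2].tolist()     # k=2 -> [1, 3]
```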

  - **prediction_strategy_selector** *(str)*: identifiers of the strategies implemented in the corresponding *metric* for the selection of the predictions. The strategy choices currently included are:
  + For the [gorkaartola/metric_for_tp_fp_samples](https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples) metric:

  + *topk*: with values of 3, 5, 7 and 9. To add a new *topk* strategy, a new key can be introduced in the *prediction_strategy_options dict* of the [options.py](https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs/blob/main/options.py) file with a list as follows: *["topk", desired value]*.

  ### Output
+ The output is a .csv file that includes, for every prediction strategy selected, a detailed table of results with the recall, precision, f1-score and accuracy of the predictions for each class, plus both overall micro and macro averages. If the app is cloned and run locally, this file is saved in the *Reports* folder, and a file with the calculated entailment softmax logits for each query is also saved in the *Reports/ZS inference tables* folder. The files included in the Huggingface repo have been uploaded as examples of the results obtained from the app.

  ## Limitations and Bias
  Please refer to the limitations and bias of the models used for the inference.

  ## Citation
  BibLaTeX
  ```
+ @online{ZS_classifier_tester,
  author = {Gorka Artola},
+ title = {Zero Shot Classifier Tester for True Positive and False Positive Samples},
  year = 2022,
  url = {https://huggingface.co/spaces/gorkaartola/Zero_Shot_Classifier_by_SDGs},
  urldate = {2022-08-11}