gorkaartola committed on
Commit
7e0ba0e
1 Parent(s): 1480616

Upload README.md

Files changed (1)
  1. README.md +22 -21
README.md CHANGED
@@ -21,36 +21,37 @@ This metric is specially designed to measure the performance of sentence classif
  In addition to the classical *predictions* and *references* inputs, this metric includes a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, which refers to a family of prediction strategies that the metric can handle.

  Add *predictions*, *references* and *prediction_strategies* as follows:
-
  metric = evaluate.load(metric_selector)
  metric.add_batch(predictions = predictions, references = references)
  results = metric.compute(prediction_strategies = prediction_strategies)

- The *prediction_strategies* implemented in this metric are:
- - *argmax*, which takes the highest value of the softmax inference logits to select the prediction.
- - *threshold*, which takes all softmax inference logits above a certain value to select the predictions.
- - *topk*, which takes the highest *k* softmax inference logits to select the predictions.
-
- The minimum fields required by this metric for the test datasets are the following (not necessarily with these names):
- - *title* containing the first sentence to be compared with different queries representing each class.
- - *label_ids* containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
- - *nli_label* which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

- Example:
-
- |title |label_ids |nli_label |
- |-----------------------------------------------------------------------------------|:---------:|:----------:|
- |'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
- |Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |

  ### Inputs

  - *predictions*, *(numpy.array(float32)[sentences to classify, number of classes])*: numpy array with the softmax logit values of the entailment dimension of the NLI inference on the sentences to be classified for each class.
  - *references*, *(numpy.array(int32)[sentences to classify, 2])*: numpy array with the reference *label_ids* and *nli_label* of the sentences to be classified, given in the *test_dataset*.
- - *kwarg* named *prediction_strategies = list(list(str, int(optional)))*, each *list(list(str, int(optional)))* describing a desired prediction strategy as follows:
- + *argmax*: *["argmax"]*.
- + *threshold*: *["threshold", desired value]*.
- + *topk*: ["topk", desired value]*.

  ### Output Values
 
@@ -66,4 +67,4 @@ BibLaTeX
  url = {https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples},
  urldate = {2022-08-11}
  }
- ```
 
  In addition to the classical *predictions* and *references* inputs, this metric includes a *kwarg* named *prediction_strategies* *(list(list(str, int(optional))))*, which refers to a family of prediction strategies that the metric can handle.

  Add *predictions*, *references* and *prediction_strategies* as follows:
+ ```
  metric = evaluate.load(metric_selector)
  metric.add_batch(predictions = predictions, references = references)
  results = metric.compute(prediction_strategies = prediction_strategies)
+ ```
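
A minimal, hypothetical sketch of this call sequence (the loading path `"gorkaartola/metric_for_tp_fp_samples"` and the toy arrays are assumptions; the expected shapes and meanings are described under *Inputs* below):

```python
import evaluate
import numpy as np

# Hypothetical selector; replace with whatever you use to load this metric.
metric = evaluate.load("gorkaartola/metric_for_tp_fp_samples")

# Toy data: 2 sentences, 3 classes. Each row holds the softmax entailment
# logits of the NLI inference of one sentence against every class query.
predictions = np.array([[0.1, 0.7, 0.2],
                        [0.6, 0.3, 0.1]], dtype=np.float32)

# One [label_ids, nli_label] pair per sentence (0 = True Positive sample,
# 2 = False Positive sample).
references = np.array([[1, 0],
                       [0, 2]], dtype=np.int32)

metric.add_batch(predictions=predictions, references=references)
results = metric.compute(prediction_strategies=[["argmax_max"], ["topk", 2]])
```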

+ The minimum fields required by this metric for the test datasets are the following (not necessarily with these names):
+ - *title* containing the first sentence to be compared with different queries representing each class.
+ - *label_ids* containing the *id* of the class the sample refers to. Including samples of all the classes is advised.
+ - *nli_label*, which is '0' if the sample represents a True Positive or '2' if the sample represents a False Positive, meaning that the *label_ids* is incorrectly assigned to the *title*. Including both True Positive and False Positive samples for all classes is advised.

+ Example (a sketch of how such rows map to the *references* input follows the table):
+ |title |label_ids |nli_label |
+ |-----------------------------------------------------------------------------------|:---------:|:----------:|
+ |'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012| 8 | 0 |
+ |Tuple-based semantic and structural mapping for a sustainable interoperability | 16 | 2 |
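
For illustration only, a minimal sketch of how rows like the two above could be turned into the *references* array described under *Inputs*; the use of pandas and the variable names are assumptions, not part of the metric:

```python
import numpy as np
import pandas as pd

# Hypothetical test set holding the three minimum fields described above.
test_dataset = pd.DataFrame({
    "title": [
        "'Together we can save the arctic': celebrity advocacy and the Rio Earth Summit 2012",
        "Tuple-based semantic and structural mapping for a sustainable interoperability",
    ],
    "label_ids": [8, 16],
    "nli_label": [0, 2],  # 0 = True Positive sample, 2 = False Positive sample
})

# references: one [label_ids, nli_label] pair per sentence to classify.
references = test_dataset[["label_ids", "nli_label"]].to_numpy(dtype=np.int32)
print(references.shape)  # (2, 2)
```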
 
  ### Inputs

  - *predictions*, *(numpy.array(float32)[sentences to classify, number of classes])*: numpy array with the softmax logit values of the entailment dimension of the NLI inference on the sentences to be classified for each class.
  - *references*, *(numpy.array(int32)[sentences to classify, 2])*: numpy array with the reference *label_ids* and *nli_label* of the sentences to be classified, given in the *test_dataset*.
+ - *kwarg* named *prediction_strategies = list(list(str, int(optional)))*, each *list(str, int(optional))* describing a desired prediction strategy. The *prediction_strategies* implemented in this metric are:
+   - *argmax*, which takes the highest value of the softmax inference logits to select the prediction. Syntax: *["argmax_max"]*
+   - *threshold*, which takes all softmax inference logits above a certain value to select the predictions. Syntax: *["threshold", desired value]*
+   - *topk*, which takes the highest *k* softmax inference logits to select the predictions. Syntax: *["topk", desired value]*
+
+ Example:
+ ```
+ prediction_strategies = [['argmax_max'], ['threshold', 0.5], ['topk', 3]]
+ ```
+
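
To make the three strategies concrete, here is a small illustrative sketch (not the metric's internal code) of how each strategy would typically turn one sentence's softmax entailment scores into a set of predicted classes:

```python
import numpy as np

# Softmax entailment scores of one sentence against 5 class queries (toy values).
scores = np.array([0.05, 0.40, 0.10, 0.30, 0.15], dtype=np.float32)

# argmax: the single highest-scoring class.
argmax_pred = [int(np.argmax(scores))]                  # [1]

# threshold: every class whose score is above the given value (here 0.25).
threshold_pred = np.nonzero(scores > 0.25)[0].tolist()  # [1, 3]

# topk: the k highest-scoring classes (here k = 3).
topk_pred = np.argsort(scores)[::-1][:3].tolist()       # [1, 3, 4]
```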
  ### Output Values
  url = {https://huggingface.co/spaces/gorkaartola/metric_for_tp_fp_samples},
  urldate = {2022-08-11}
  }
+ ```