lsy641 committed on
Commit
4e4554d
•
1 Parent(s): 5cf4ae2

update README

Files changed (1)
  1. README.md +26 -9
README.md CHANGED
@@ -14,15 +14,31 @@ pinned: false
14
 
15
  # Measurement Card for distinct
16
 
17
- ***Module Card Instructions:*** *Fill out the following subsections. Feel free to take a look at existing measurement cards if you'd like examples.*
18
 
19
  ## Measurement Description
20
- *Give a brief overview of this measurement, including what task(s) it is usually used for, if any.*
21
 
22
  ## How to Use
23
- *Give general statement of how to use the measurement*
24
 
25
- *Provide simplest possible example for using the measurement*
26
 
27
  ### Inputs
28
  *List all input arguments in the format below*
@@ -34,9 +50,10 @@ pinned: false
34
 
35
  ### Output Values
36
 
37
- *Explain what this measurement outputs and provide an example of what the measurement output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}*
38
-
39
- *State the range of possible values that the measurement's output can take, as well as what in that range is considered good. For example: "This measurement can take on any value between 0 and 100, inclusive. Higher scores are better."*
 
40
 
41
  #### Values from Popular Papers
42
  The [Expectation-Adjusted-Distinct paper](https://aclanthology.org/2022.acl-short.86) (Liu and Sabour et al. 2022) compares Expectation-Adjusted-Distinct scores of ten different methods with the original Distinct. Expectation-Adjusted-Distinct correlates better with human judgments, raising the correlation from 0.56 to 0.65.
@@ -64,7 +81,7 @@ Example of calculating the original Distinct. This will return Distinct-1, 2, and 3.
64
  ```
65
 
66
  ## Limitations and Bias
67
-
68
 
69
  ## Citation
70
  ```bibtex
@@ -100,4 +117,4 @@ Example of calculating the original Distinct. This will return Distinct-1, 2, and 3.
100
  ```
101
 
102
  ## Further References
103
- *Add any useful further references.*
 
14
 
15
  # Measurement Card for distinct
16
 
17
+ ***Module Card Instructions:***
18
 
19
  ## Measurement Description
20
+ This metric measures the diversity of a group of sentences. It can be used to evaluate the diversity of generated responses on a test set (i.e., corpus-level diversity). The original paper used it only at the corpus level, while some use it to measure the diversity of several responses sampled for one context (i.e., utterance-level diversity). However, we do not recommend calculating Distinct on a small group of sentences, as it is sensitive to sentence length and to the number of sentences.
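
As a rough illustration of the description above, the sketch below computes Distinct-1 as the number of distinct tokens divided by the total number of tokens. The helper name `distinct_1` and the whitespace tokenization are assumptions made for this example, not the module's actual implementation; whitespace splitting was chosen because it reproduces the Distinct-1 value printed in the usage example. The second call hints at why groups of different sizes are hard to compare: repeating the same sentences leaves the set of distinct tokens unchanged while the token count doubles, so the score halves.

```python
def distinct_1(sentences):
    """Ratio of distinct tokens to total tokens (hypothetical helper)."""
    # Assumption: tokens are obtained by splitting on whitespace.
    tokens = [token for sentence in sentences for token in sentence.split()]
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


sentences = [
    "Hi.",
    "I am sorry to hear that",
    "I don't know",
    "Do you know who that person is?",
]

score = distinct_1(sentences)               # 14 distinct tokens out of 17
score_repeated = distinct_1(sentences * 2)  # 14 distinct tokens out of 34

print(score, score_repeated)
```

Because the denominator grows with every added token while the numerator is bounded by the vocabulary, longer or more numerous sentences tend to pull the score down.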
21
 
22
  ## How to Use
 
23
 
24
+ ```python
25
+ >>> import evaluate
28
+ >>> my_new_module = evaluate.load("lsy641/distinct")
29
+ Downloading builder script: 100%|██████████| 8.62k/8.62k [00:00<00:00, 4.19MB/s]
30
+ >>> results = my_new_module.compute(predictions=["Hi.", "I am sorry to hear that", "I don't know", "Do you know who that person is?"], vocab_size=50257)
31
+ >>> print(results)
32
+ {'Expectation-Adjusted-Distinct': 0.8236605104867569, 'Distinct-1': 0.8235294117647058, 'Distinct-2': 0.9411764705882353, 'Distinct-3': 0.9411764705882353}
33
+
34
+
35
+ >>> dataset = ["This is my friend jack", "I'm sorry to hear that", "But you know I am the one who always support you", "Welcome to our family","Hi.", "I am sorry to hear that", "I don't know", "Do you know who that person is?"]
36
+ >>> results = my_new_module.compute(predictions=["But you know I am the one who always support you", "Hi.", "I am sorry to hear that", "I don't know", "I'm sorry to hear that"], dataForVocabCal=dataset)
37
+ >>> print(results)
38
+ {'Expectation-Adjusted-Distinct': 0.9928137111900845, 'Distinct-1': 0.6538461538461539, 'Distinct-2': 0.8076923076923077, 'Distinct-3': 0.8846153846153846}
39
+
40
+
41
+ ```
42
 
43
  ### Inputs
44
  *List all input arguments in the format below*
 
50
 
51
  ### Output Values
52
 
53
+ - Expectation-Adjusted-Distinct: Normally in the range 0-1, but it can exceed 1. See the properties of the formula in the [Expectation-Adjusted-Distinct paper](https://aclanthology.org/2022.acl-short.86) (Liu and Sabour et al. 2022).
54
+ - Distinct-1: Range 0-1
55
+ - Distinct-2: Range 0-1
56
+ - Distinct-3: Range 0-1
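
To make these ranges concrete, here is a minimal sketch of Expectation-Adjusted-Distinct, assuming the formula N / (V * (1 - ((V - 1) / V)^C)) as we understand it from the paper, where N is the number of distinct tokens, C the total number of tokens, and V the vocabulary size. The function name and the whitespace tokenization are assumptions for illustration, not the module's code. Under these assumptions the sketch comes out close to the Expectation-Adjusted-Distinct value printed in the first usage example, and the last call shows a case where the score exceeds 1.

```python
def expectation_adjusted_distinct(sentences, vocab_size):
    """Distinct tokens divided by their expected count (hypothetical helper)."""
    # Assumption: tokens are obtained by splitting on whitespace.
    tokens = [token for sentence in sentences for token in sentence.split()]
    total = len(tokens)
    if total == 0:
        return 0.0
    distinct = len(set(tokens))
    # Expected number of distinct tokens when drawing `total` tokens
    # uniformly at random from a vocabulary of size `vocab_size`.
    expected_distinct = vocab_size * (1 - ((vocab_size - 1) / vocab_size) ** total)
    return distinct / expected_distinct


sentences = [
    "Hi.",
    "I am sorry to hear that",
    "I don't know",
    "Do you know who that person is?",
]
ead = expectation_adjusted_distinct(sentences, vocab_size=50257)

# Five tokens, all different, drawn from a five-word vocabulary: the observed
# count (5) is larger than the expected count, so the score is above 1.
ead_above_one = expectation_adjusted_distinct(["a b c d e"], vocab_size=5)

print(ead, ead_above_one)
```

With a large vocabulary and few tokens the expected count is almost equal to the token count, which is why the score stays close to Distinct-1 in that setting.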
57
 
58
  #### Values from Popular Papers
59
  The [Expectation-Adjusted-Distinct paper](https://aclanthology.org/2022.acl-short.86) (Liu and Sabour et al. 2022) compares Expectation-Adjusted-Distinct scores of ten different methods with the original Distinct. Expectation-Adjusted-Distinct correlates better with human judgments, raising the correlation from 0.56 to 0.65.
 
81
  ```
82
 
83
  ## Limitations and Bias
84
+ TODO
85
 
86
  ## Citation
87
  ```bibtex
 
117
  ```
118
 
119
  ## Further References
120
+ TODO