---
title: distinct
datasets:
- None
tags:
- evaluate
- measurement
description: "TODO: add a description here"
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false
---
# Measurement Card for distinct
## Measurement Description
This measurement calculates the diversity of a group of sentences. It can be used either to evaluate the diversity of the responses generated for a test set (i.e., corpus-level diversity) or to calculate the diversity of a group of responses sampled for one context (i.e., utterance-level diversity). The [original paper](https://aclanthology.org/N16-1014) (Li et al. 2016) used it at the corpus level, while others may use it at the utterance level. However, we do not recommend calculating Distinct on a small group of sentences, because the score is sensitive to sentence length and to the number of sentences.
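As a rough illustration of what Distinct-n measures, here is a minimal sketch that follows the definition in Li et al. (2016): the number of distinct n-grams divided by the total number of generated tokens. The function name `distinct_n`, the whitespace tokenization, and the choice to extract n-grams within each sentence are assumptions made for this sketch; the module's own implementation may differ in these details and so may return different values.

```python
from typing import List


def distinct_n(sentences: List[str], n: int) -> float:
    """Distinct n-grams divided by the total token count (hypothetical helper)."""
    unique_ngrams = set()
    total_tokens = 0
    for sentence in sentences:
        tokens = sentence.split()  # assumption: whitespace tokenization
        total_tokens += len(tokens)
        # Collect every n-gram that fits inside this sentence.
        for i in range(len(tokens) - n + 1):
            unique_ngrams.add(tuple(tokens[i:i + n]))
    # Guard against an empty input.
    return len(unique_ngrams) / total_tokens if total_tokens else 0.0


corpus = ["a b a", "a b"]
print(distinct_n(corpus, 1))  # 2 distinct unigrams over 5 tokens
print(distinct_n(corpus, 2))  # 2 distinct bigrams over 5 tokens
```

Because the denominator grows with every added token while the set of distinct n-grams saturates, the score tends to fall as the corpus gets longer, which is the sensitivity to length mentioned above.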
## How to Use
```python
>>> import evaluate
>>> my_new_module = evaluate.load("lsy641/distinct")
Downloading builder script: 100%|██████████| 8.62k/8.62k [00:00<00:00, 4.19MB/s]
>>> results = my_new_module.compute(predictions=["Hi.", "I am sorry to hear that", "I don't know", "Do you know who that person is?"], vocab_size=50257)
>>> print(results)
{'Expectation-Adjusted-Distinct': 0.8236605104867569, 'Distinct-1': 0.8235294117647058, 'Distinct-2': 0.9411764705882353, 'Distinct-3': 0.9411764705882353}
>>> dataset = ["This is my friend jack", "I'm sorry to hear that", "But you know I am the one who always support you", "Welcome to our family","Hi.", "I am sorry to hear that", "I don't know", "Do you know who that person is?"]
>>> results = my_new_module.compute(predictions=["But you know I am the one who always support you", "Hi.", "I am sorry to hear that", "I don't know", "I'm sorry to hear that"], dataForVocabCal=dataset)
>>> print(results)
{'Expectation-Adjusted-Distinct': 0.9928137111900845, 'Distinct-1': 0.6538461538461539, 'Distinct-2': 0.8076923076923077, 'Distinct-3': 0.8846153846153846}
```
### Inputs
- **predictions** *(list of strings): list of sentences whose diversity is measured. Each prediction should be a string.*
- **mode** *(string): 'Expectation-Adjusted-Distinct' or 'Distinct'. With 'Expectation-Adjusted-Distinct', the scores for both modes are returned. The default value is 'Expectation-Adjusted-Distinct'.*
- **vocab_size** *(int): vocabulary size used to calculate 'Expectation-Adjusted-Distinct'. In that mode, at least one of vocab_size and dataForVocabCal must be provided. The default value is None.*
- **dataForVocabCal** *(list of strings): data from which the vocabulary size for 'Expectation-Adjusted-Distinct' is calculated. Typically, it should be the list of sentences that make up the task dataset. In that mode, at least one of vocab_size and dataForVocabCal must be provided. The default value is None.*
- **tokenizer** *(string or tokenizer class): tokenizer used to split sentences into words. The default value is "white_space". An NLTK tokenizer is also available.*
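
To make the role of `dataForVocabCal` concrete, the following minimal sketch counts the unique whitespace-separated tokens in a list of sentences. This is an assumption about how a vocabulary size could be derived from such data, not a copy of the module's code; the helper name `vocab_size_from_data` is hypothetical, and the count is case-sensitive and keeps punctuation attached to words.

```python
from typing import List


def vocab_size_from_data(sentences: List[str]) -> int:
    """Count unique whitespace tokens (hypothetical helper, case-sensitive)."""
    vocab = set()
    for sentence in sentences:
        vocab.update(sentence.split())  # assumption: whitespace tokenization
    return len(vocab)


dataset = [
    "This is my friend jack",
    "I'm sorry to hear that",
    "But you know I am the one who always support you",
    "Welcome to our family",
    "Hi.",
    "I am sorry to hear that",
    "I don't know",
    "Do you know who that person is?",
]
print(vocab_size_from_data(dataset))
```

A vocabulary estimated from a small dataset like this one is far smaller than a tokenizer vocabulary such as 50257, so the two ways of supplying the vocabulary size can lead to quite different Expectation-Adjusted-Distinct scores.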
### Output Values
- Expectation-Adjusted-Distinct: Normally in the range 0-1, but it can exceed 1. See the properties of the formula in the [Expectation-Adjusted-Distinct paper](https://aclanthology.org/2022.acl-short.86) (Liu and Sabour et al. 2022).
- Distinct-1: Range 0-1
- Distinct-2: Range 0-1
- Distinct-3: Range 0-1
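
The sketch below illustrates why Expectation-Adjusted-Distinct can exceed 1. It uses the formula from Liu and Sabour et al. (2022): the number of distinct tokens divided by the number of distinct tokens expected when the same number of tokens is drawn uniformly from a vocabulary of size V, that is, V * (1 - ((V - 1) / V) ** C) for C tokens. The function name and the example counts (14 distinct tokens out of 17, obtained by hand under whitespace tokenization from the first example in How to Use) are assumptions of this sketch rather than part of the module.

```python
def expectation_adjusted_distinct(num_distinct: int, num_tokens: int, vocab_size: int) -> float:
    """Distinct tokens divided by their expected count (hypothetical helper)."""
    # Expected number of distinct tokens after num_tokens uniform draws
    # from a vocabulary of vocab_size words.
    expected_distinct = vocab_size * (1.0 - ((vocab_size - 1) / vocab_size) ** num_tokens)
    return num_distinct / expected_distinct


# 14 distinct tokens among 17 tokens, vocabulary size 50257.
print(round(expectation_adjusted_distinct(14, 17, 50257), 4))  # about 0.8237

# When every token is distinct, the observed count is above the expectation.
print(expectation_adjusted_distinct(17, 17, 50257) > 1.0)  # True
```

The expected count is always slightly below the number of tokens, so a text with no repeated tokens scores just above 1.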
#### Values from Popular Papers
The [Expectation-Adjusted-Distinct paper](https://aclanthology.org/2022.acl-short.86) (Liu and Sabour et al. 2022) compares the Expectation-Adjusted-Distinct scores of ten different methods with their original Distinct scores. Expectation-Adjusted-Distinct correlates better with human judgment: the correlation rises from 0.56 to 0.65.
### Examples
Example of calculating Expectation-Adjusted-Distinct, given either vocab_size or data for calculating the vocabulary size. Besides Expectation-Adjusted-Distinct, this mode also returns Distinct-1, Distinct-2, and Distinct-3.
```python
>>> import evaluate
>>> my_new_module = evaluate.load("lsy641/distinct")
>>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], vocab_size=50257)
>>> print(results)
>>> dataset = ["This is my friend jack", "I'm sorry to hear that", "But you know I am the one who always support you", "Welcome to our family"]
>>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], dataForVocabCal=dataset)
>>> print(results)
```
Example of calculating the original Distinct. This returns Distinct-1, Distinct-2, and Distinct-3.
```python
>>> import evaluate
>>> my_new_module = evaluate.load("lsy641/distinct")
>>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], mode="Distinct")
>>> print(results)
```
## Limitations and Bias
TODO
## Citation
```bibtex
@inproceedings{liu-etal-2022-rethinking,
title = "Rethinking and Refining the Distinct Metric",
author = "Liu, Siyang and
Sabour, Sahand and
Zheng, Yinhe and
Ke, Pei and
Zhu, Xiaoyan and
Huang, Minlie",
booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
year = "2022",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2022.acl-short.86",
doi = "10.18653/v1/2022.acl-short.86",
}
```
```bibtex
@inproceedings{li-etal-2016-diversity,
title = "A Diversity-Promoting Objective Function for Neural Conversation Models",
author = "Li, Jiwei and
Galley, Michel and
Brockett, Chris and
Gao, Jianfeng and
Dolan, Bill",
booktitle = "Proceedings of the 2016 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies",
year = "2016",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/N16-1014",
doi = "10.18653/v1/N16-1014",
}
```
## Further References
TODO