Spaces: lsy641/distinct
lsy641 committed on Jul 7, 2023

Commit 3df4666 • 1 Parent(s): 263d47a

distinct

Files changed (2)

README.md

@@ -17,6 +17,16 @@ pinned: false
 ***Module Card Instructions:***
 ## Measurement Description
+Distinct metric is to calculate the diversity of language. We provide two versions of distinct score. Expectation-Adjusted-Distinct (EAD) is the default one, which removes the biases of the original distinct score on lengthier sentences (see Figure below). Distinct is the original version.
+<p align="center">
+    <img src="https://huggingface.co/spaces/lsy641/distinct/resolve/main/distinct_compare_pic.jpg" alt="drawing" width="350" style="float: center;"/>
+</p>
+For the use of Expectation-Adjusted-Distinct, vocab_size is required.
+Please follow ACL paper https://aclanthology.org/2022.acl-short.86 for motivation and follow the rules of thumb provided by https://github.com/lsy641/Expectation-Adjusted-Distinct/blob/main/EAD.ipynb to determine the vocab_size.
 This metric is used to calculate the diversity of a group of sentences. It can be used to either evaluate the diversity of generated responses of the testset (i.e., corpus-level diversity), or calculate diversity of a group of sampled responses given one context (i.e., utterence-level diversity). The  [original paper](https://aclanthology.org/N16-1014) (Li et al. 2022) used it as corpus-level while some may use it as utterance-level. However, we don't recommend to calculate Distinct on a small group as it is sensitive to the sentence length and number.
 ## How to Use
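
Reviewer note: the added README text contrasts the original Distinct score with Expectation-Adjusted-Distinct (EAD) and says that `vocab_size` is required for the latter. The sketch below illustrates why. It is not the code in `distinct.py`; the function names are hypothetical, whitespace tokenization is an assumption, only unigrams are handled, and the EAD formula (distinct tokens divided by the expected number of distinct tokens when drawing uniformly from a vocabulary of size `vocab_size`) is our reading of the linked ACL 2022 paper and should be checked against it.

```python
from typing import Iterable, List


def _tokens(sentences: Iterable[str]) -> List[str]:
    # Assumption: plain whitespace tokenization; the real metric may differ.
    return [tok for s in sentences for tok in s.split()]


def distinct_1(sentences: Iterable[str]) -> float:
    """Original Distinct-1: unique tokens divided by total tokens."""
    toks = _tokens(sentences)
    if not toks:
        return 0.0
    return len(set(toks)) / len(toks)


def ead_1(sentences: Iterable[str], vocab_size: int) -> float:
    """Hypothetical EAD sketch: unique tokens divided by the number of
    distinct tokens expected from uniform sampling over the vocabulary."""
    if vocab_size < 1:
        raise ValueError("vocab_size must be a positive integer")
    toks = _tokens(sentences)
    if not toks:
        return 0.0
    total = len(toks)
    # Expected distinct tokens after `total` uniform draws from `vocab_size`.
    expected = vocab_size * (1.0 - ((vocab_size - 1) / vocab_size) ** total)
    return len(set(toks)) / expected


if __name__ == "__main__":
    responses = ["a b a", "c a"]
    print(distinct_1(responses))
    print(ead_1(responses, vocab_size=10))
```

Under this reading, the denominator of the original score grows linearly with the number of tokens, while the EAD denominator saturates at `vocab_size`, which is the length bias the README refers to.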

distinct.py

@@ -54,16 +54,6 @@ _DESCRIPTION = """\
 Distinct metric is to calculate corpus-level diversity of language. We provide two versions of distinct score. Expectation-Adjusted-Distinct (EAD) is the default one, which removes
 the biases of the original distinct score on lengthier sentences (see Figure below). Distinct is the original version.
-For the use of Expectation-Adjusted-Distinct, vocab_size is required.
-Please follow ACL paper https://aclanthology.org/2022.acl-short.86 for motivation and follow the rules of thumb provided by https://github.com/lsy641/Expectation-Adjusted-Distinct/blob/main/EAD.ipynb to determine the vocab_size
-<p align="center">
-    <img src="https://huggingface.co/spaces/lsy641/distinct/resolve/main/distinct_compare_pic.jpg" alt="drawing" width="350" style="float: center;"/>
-</p>
 """
