metadata
title: distinct
datasets:
  - None
tags:
  - evaluate
  - measurement
description: 'TODO: add a description here'
sdk: gradio
sdk_version: 3.19.1
app_file: app.py
pinned: false

Measurement Card for distinct


Measurement Description

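Distinct (Li et al. 2016) measures the lexical diversity of generated text as the ratio of distinct n-grams to the total number of n-grams. Expectation-Adjusted-Distinct (Liu et al. 2022) refines it by dividing the number of distinct tokens by the number expected for a text of the same length, which removes the original metric's bias against longer outputs. Both are typically used to evaluate dialogue systems and other open-ended text generation tasks.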
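As a rough illustration of what the original Distinct-n score measures, the sketch below computes the ratio of unique n-grams to total n-grams over a list of sentences. It is not this module's code: the helper names ngrams and distinct_n are hypothetical, tokenization is a plain whitespace split, and n-grams are assumed not to cross sentence boundaries.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(sentences, n):
    """Corpus-level Distinct-n: unique n-grams divided by total n-grams.

    Assumptions (not taken from the module): whitespace tokenization,
    and n-grams are collected per sentence, never across sentences.
    """
    all_ngrams = []
    for sentence in sentences:
        tokens = sentence.split()
        all_ngrams.extend(ngrams(tokens, n))
    if not all_ngrams:
        # No n-grams at all (empty input or sentences shorter than n).
        return 0.0
    return len(set(all_ngrams)) / len(all_ngrams)


if __name__ == "__main__":
    sample = ["the cat sat", "the cat ran"]
    for n in (1, 2, 3):
        print(f"Distinct-{n}: {distinct_n(sample, n):.3f}")
```

Repeated words and phrases lower the score, so a system that keeps producing the same generic reply scores close to 0, while fully varied output scores 1.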

How to Use

Load the measurement with evaluate.load("lsy641/distinct"), then call compute() with the generated sentences passed as predictions. Complete calls are shown in the Examples section.

Inputs

List all input arguments in the format below

  • predictions (list of strings): list of sentences whose diversity is measured. Each prediction should be a string.
  • mode (string): 'Expectation-Adjusted-Distinct' or 'Distinct', selecting how diversity is calculated. If the value is 'Expectation-Adjusted-Distinct', the scores of both modes are returned. Default value is 'Expectation-Adjusted-Distinct'.
  • vocab_size (int): vocabulary size used to calculate 'Expectation-Adjusted-Distinct'. In that mode, at least one of vocab_size and dataForVocabCal must be provided. Default value is None.
  • dataForVocabCal (list of strings): data from which the vocabulary size for 'Expectation-Adjusted-Distinct' is calculated. Typically, it should be a list of the sentences that make up the task dataset. In that mode, at least one of vocab_size and dataForVocabCal must be provided. Default value is None.
  • tokenizer (string or tokenizer class): tokenizer for splitting sentences into words. Default value is "white_space". An NLTK tokenizer can also be used.
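To show how vocab_size and dataForVocabCal relate, here is a minimal sketch that derives a vocabulary size from task data and then computes a unigram Expectation-Adjusted-Distinct score. It assumes the formula from Liu et al. (2022), EAD = N / (V * (1 - ((V - 1) / V)^C)), where N is the number of distinct tokens, C the total number of tokens, and V the vocabulary size. The function names are hypothetical and the module's own implementation may differ, for example in tokenization or in how n-grams above 1 are handled.

```python
def vocab_size_from_data(sentences):
    """Count unique whitespace-separated tokens in the task data.

    Assumption: this mirrors the role of dataForVocabCal; the module may
    tokenize differently.
    """
    return len({token for sentence in sentences for token in sentence.split()})


def expectation_adjusted_distinct(sentences, vocab_size):
    """Unigram EAD: distinct tokens divided by their expected number.

    The expected number of distinct tokens when drawing C tokens uniformly
    from a vocabulary of size V is V * (1 - ((V - 1) / V) ** C).
    """
    tokens = [token for sentence in sentences for token in sentence.split()]
    total = len(tokens)
    if total == 0 or vocab_size <= 0:
        return 0.0
    expected_distinct = vocab_size * (1 - ((vocab_size - 1) / vocab_size) ** total)
    return len(set(tokens)) / expected_distinct


if __name__ == "__main__":
    dataset = [
        "This is my friend jack",
        "I'm sorry to hear that",
        "But you know I am the one who always support you",
        "Welcome to our family",
    ]
    predictions = ["Hi.", "I'm sorry to hear that", "I don't know"]

    vocab_size = vocab_size_from_data(dataset)
    print("vocab size:", vocab_size)
    print("EAD-1:", expectation_adjusted_distinct(predictions, vocab_size))
```

Because the denominator is an expected count rather than the raw token count, the score is not bounded above by 1 under this formula, and a small vocabulary estimated from little data inflates it. Passing a vocab_size that reflects the real task vocabulary keeps scores comparable across systems.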

Output Values

Explain what this measurement outputs and provide an example of what the measurement output looks like. Modules should return a dictionary with one or multiple key-value pairs, e.g. {"bleu" : 6.02}

State the range of possible values that the measurement's output can take, as well as what in that range is considered good. For example: "This measurement can take on any value between 0 and 100, inclusive. Higher scores are better."

Values from Popular Papers

The Expectation-Adjusted-Distinct paper (Liu and Sabour et al. 2022) compares the Expectation-Adjusted-Distinct scores of ten different methods with their original Distinct scores. It reports that the adjusted scores correlate better with human judgments, with the correlation rising from 0.56 to 0.65.

Examples

Example of calculating Expectation-Adjusted-Distinct by giving either vocab_size or data from which vocab_size is calculated. This also returns Distinct-1, 2, and 3.

    >>> import evaluate
    >>> my_new_module = evaluate.load("lsy641/distinct")
    >>> # Option 1: pass the vocabulary size directly
    >>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], vocab_size=50257)
    >>> print(results)

    >>> # Option 2: pass task data from which the vocabulary size is calculated
    >>> dataset = ["This is my friend jack", "I'm sorry to hear that", "But you know I am the one who always support you", "Welcome to our family"]
    >>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], dataForVocabCal=dataset)
    >>> print(results)

Example of calculating the original Distinct. This returns Distinct-1, 2, and 3.

    >>> import evaluate
    >>> my_new_module = evaluate.load("lsy641/distinct")
    >>> results = my_new_module.compute(predictions=["Hi.", "I'm sorry to hear that", "I don't know"], mode="Distinct")
    >>> print(results)

Limitations and Bias

Citation

@inproceedings{liu-etal-2022-rethinking,
    title = "Rethinking and Refining the Distinct Metric",
    author = "Liu, Siyang  and
      Sabour, Sahand  and
      Zheng, Yinhe  and
      Ke, Pei  and
      Zhu, Xiaoyan  and
      Huang, Minlie",
    booktitle = "Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)",
    year = "2022",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2022.acl-short.86",
    doi = "10.18653/v1/2022.acl-short.86",
}

@inproceedings{li-etal-2016-diversity,
    title = "A Diversity-Promoting Objective Function for Neural Conversation Models",
    author = "Li, Jiwei  and
      Galley, Michel  and
      Brockett, Chris  and
      Gao, Jianfeng  and
      Dolan, Bill",
    booktitle = "Proceedings of the 2016 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies",
    year = "2016",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N16-1014",
    doi = "10.18653/v1/N16-1014",
}

Further References

Add any useful further references.