README.md
CHANGED
@@ -5,7 +5,8 @@ datasets:
 tags:
 - evaluate
 - metric
-description: "
+description: "This repo contains the code of an automatic evaluation metric described in the paper
+  Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation"
 sdk: gradio
 sdk_version: 3.0.2
 app_file: app.py
@@ -14,37 +15,37 @@ pinned: false
 
 # Metric Card for CTC_Eval
 
-***Module Card Instructions:*** *Fill out the following subsections. Feel free to take a look at existing metric cards if you'd like examples.*
-
 ## Metric Description
-*
+* Previous work on NLG evaluation has typically focused on a single task and developed individual evaluation metrics based on specific intuitions.
+* In this work, we propose a unifying perspective based on the nature of information change in NLG tasks, including compression (e.g., summarization), transduction (e.g., text rewriting), and creation (e.g., dialog).
+* A common concept underlying the three broad categories is information alignment, which we define as the extent to which the information in one generation component is grounded in another.
+* We adopt contextualized language models to measure information alignment.
 
 ## How to Use
-
+Example:
+```python
 
-
+>>> ctc_score = evaluate.load("yzha/ctc_eval")
+>>> results = ctc_score.compute(references=['hello world'], predictions=['hi world'])
+>>> print(results)
+{'ctc_score': 0.5211202502250671}
+```
 
 ### Inputs
-
-
+- **input_field**
+  - `references`: the source document(s) containing all the information
+  - `predictions`: the text generated by the NLG model
 
 ### Output Values
 
-
-
-*State the range of possible values that the metric's output can take, as well as what in that range is considered good. For example: "This metric can take on any value between 0 and 100, inclusive. Higher scores are better."*
-
-#### Values from Popular Papers
-*Give examples, preferably with links to leaderboards or publications, to papers that have reported this metric, along with the values they have reported.*
-
-### Examples
-*Give code examples of the metric being used. Try to include examples that clear up any potential ambiguity left from the metric description above. If possible, provide a range of examples that show both typical and atypical results, as well as examples where a variety of input parameters are passed.*
-
-## Limitations and Bias
-*Note any known limitations or biases that the metric has, with links and references if possible.*
+The CTC score. Higher values indicate that the information in `predictions` is better aligned with (grounded in) `references`.
 
 ## Citation
-
+@inproceedings{deng2021compression,
+  title={Compression, Transduction, and Creation: A Unified Framework for Evaluating Natural Language Generation},
+  author={Deng, Mingkai and Tan, Bowen and Liu, Zhengzhong and Xing, Eric and Hu, Zhiting},
+  booktitle={Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing},
+  pages={7580--7605},
+  year={2021}
+}
 
-## Further References
-*Add any useful further references.*
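The bullets added under "Metric Description" hinge on measuring information alignment with a contextualized language model. The sketch below is not CTC's actual alignment model (the paper trains dedicated models for this); it is only a minimal illustration of the idea under stated assumptions: embed both texts with an off-the-shelf `bert-base-uncased`, greedily match each generated token to its most similar reference token by cosine similarity, and average the best-match scores.

```python
# Minimal sketch of embedding-based information alignment, NOT the exact
# CTC model: each token of the generated text is greedily matched to its
# most similar reference token, and the best-match similarities are averaged.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(text: str) -> torch.Tensor:
    """Return L2-normalized contextualized embeddings, one row per token."""
    inputs = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]  # (seq_len, hidden_dim)
    return torch.nn.functional.normalize(hidden, dim=-1)

def alignment_score(prediction: str, reference: str) -> float:
    """Mean over prediction tokens of their best cosine match in the reference."""
    pred, ref = embed(prediction), embed(reference)
    sim = pred @ ref.T  # pairwise cosine similarities (pred_len, ref_len)
    return sim.max(dim=1).values.mean().item()

# A grounded rewrite should align better than an unrelated sentence.
print(alignment_score("hi world", "hello world"))
print(alignment_score("the stock market crashed", "hello world"))
```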
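The "How to Use" example in the diff scores a single reference/prediction pair. Assuming the module follows the standard `evaluate` convention of parallel lists for `predictions` and `references` (the card does not document the return format beyond the `ctc_score` key shown above), batch scoring would look like the sketch below; the example strings are illustrative only.

```python
# Batch-scoring sketch; assumes the standard `evaluate` interface with
# parallel lists. The module id "yzha/ctc_eval" is taken from the card above.
import evaluate

ctc_score = evaluate.load("yzha/ctc_eval")

references = [
    "The central bank left interest rates unchanged on Tuesday.",
    "The quick brown fox jumps over the lazy dog.",
]
predictions = [
    "Interest rates were kept unchanged by the central bank.",  # grounded
    "The central bank raised interest rates sharply.",          # not grounded
]

results = ctc_score.compute(references=references, predictions=predictions)
print(results)  # higher ctc_score = prediction better grounded in its reference
```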