Pre-trained evaluator in EMNLP 2022 paper
Towards a Unified Multi-Dimensional Evaluator for Text Generation
Introduction
Multi-dimensional evaluation is the dominant paradigm for human evaluation in Natural Language Generation (NLG), i.e., evaluating the generated text from multiple explainable dimensions, such as coherence and fluency.
However, automatic evaluation in NLG is still dominated by similarity-based metrics (e.g., ROUGE, BLEU), which are not sufficient to capture the differences between advanced generation models.
Therefore, we propose UniEval to bridge this gap so that a more comprehensive and fine-grained evaluation of NLG systems can be achieved.
Pre-trained Evaluator
unieval-dialog is the pre-trained evaluator for the dialogue response generation task. It evaluates model output along five dimensions:
- naturalness
- coherence
- engagingness
- groundedness
- understandability
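As a rough illustration of how scores on these five dimensions might be handled downstream, here is a minimal sketch. It assumes the Boolean-QA framing described in the UniEval paper, where each dimension is scored from the evaluator's probabilities of answering "Yes" or "No" to a dimension-specific question. The helper names `yes_no_score` and `overall_score` are hypothetical and are not part of the released code, and the unweighted mean is only one possible way to aggregate.

```python
from statistics import mean
from typing import Mapping

# Dimension names taken from the list above.
DIMENSIONS = (
    "naturalness",
    "coherence",
    "engagingness",
    "groundedness",
    "understandability",
)


def yes_no_score(p_yes: float, p_no: float) -> float:
    """Normalize the 'Yes' and 'No' probabilities into a score in [0, 1].

    Assumption: the evaluator answers a yes/no question per dimension,
    and the score is the share of probability mass placed on 'Yes'.
    """
    total = p_yes + p_no
    if total <= 0:
        raise ValueError("p_yes + p_no must be positive")
    return p_yes / total


def overall_score(scores: Mapping[str, float]) -> float:
    """Unweighted mean over the five dimensions (an assumed aggregation)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise KeyError(f"missing dimensions: {missing}")
    return mean(scores[d] for d in DIMENSIONS)
```

The probabilities themselves would come from running the evaluator on a dialogue response; this sketch only covers the arithmetic applied afterwards.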
Usage
Please refer to our GitHub repository.