Metric:
bleurt
Update on GitHub
Description
BLEURT a learnt evaluation metric for Natural Language Generation. It is built using multiple phases of transfer learning starting from a pretrained BERT model (Devlin et al. 2018) and then employing another pre-training phrase using synthetic data. Finally it is trained on WMT human annotations. You may run BLEURT out-of-the-box or fine-tune it for your specific application (the latter is expected to perform better). See the [README.md] file at https://github.com/google-research/bleurt for more information.
How to load this metric directly with the
datasets
library:
from datasets import load_metric metric = load_metric("bleurt")
Citation
@inproceedings{bleurt, title={BLEURT: Learning Robust Metrics for Text Generation}, author={Thibault Sellam and Dipanjan Das and Ankur P. Parikh}, booktitle={ACL}, year={2020}, url={https://arxiv.org/abs/2004.04696} }