Metric: bleurt 📉

How to load this metric directly with the 🤗/nlp library:

			
from nlp import load_metric
metric = load_metric("bleurt")

Description

BLEURT is a learnt evaluation metric for Natural Language Generation. It is built using multiple phases of transfer learning: it starts from a pretrained BERT model (Devlin et al. 2018), goes through another pre-training phase on synthetic data, and is finally trained on WMT human annotations. You may run BLEURT out of the box or fine-tune it for your specific application (the latter is expected to perform better). See the README.md file at https://github.com/google-research/bleurt for more information.

Citation

@inproceedings{bleurt,
  title={BLEURT: Learning Robust Metrics for Text Generation},
  author={Thibault Sellam and Dipanjan Das and Ankur P. Parikh},
  booktitle={ACL},
  year={2020},
  url={https://arxiv.org/abs/2004.04696}
}