Dataset: rouge 📉

How to load this metric directly with the 🤗/nlp library:

    from nlp import load_metric
    metric = load_metric("rouge")


ROUGE, or Recall-Oriented Understudy for Gisting Evaluation, is a set of metrics and a software package for evaluating automatic summarization and machine translation systems in natural language processing. The metrics compare an automatically produced summary or translation against one or more human-produced reference summaries or translations. This metric is a wrapper around the Google Research reimplementation of ROUGE.
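To make the comparison against a reference concrete, here is a minimal sketch of an n-gram overlap score in the spirit of ROUGE-N. It is a simplified illustration, not the wrapped Google Research implementation: the function name `rouge_n`, the lowercase whitespace tokenization, and the single-reference setup are all assumptions made for this example, and the real package applies its own tokenization and options.

```python
from collections import Counter


def rouge_n(candidate: str, reference: str, n: int = 1) -> dict:
    """Simplified ROUGE-N style score for one candidate and one reference.

    Hypothetical helper for illustration only: tokens are lowercase,
    whitespace-separated words, with no stemming or punctuation handling.
    """

    def ngrams(text: str) -> Counter:
        tokens = text.lower().split()
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

    cand, ref = ngrams(candidate), ngrams(reference)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())

    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


unigram_scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=1)
bigram_scores = rouge_n("the cat sat on the mat", "the cat lay on the mat", n=2)
print(unigram_scores)
print(bigram_scores)
```

In this example five of the six reference unigrams also appear in the candidate, so unigram recall is 5/6, while only three of the five reference bigrams match, so bigram recall is 3/5. Recall is the quantity the "Recall-Oriented" part of the name refers to; precision and F1 are reported alongside it here for completeness.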


    title = "{ROUGE}: A Package for Automatic Evaluation of Summaries",
    author = "Lin, Chin-Yew",
    booktitle = "Text Summarization Branches Out",
    month = jul,
    year = "2004",
    address = "Barcelona, Spain",
    publisher = "Association for Computational Linguistics",
    url = "",
    pages = "74--81",