Metric: coval
Update on GitHub


CoVal is a coreference evaluation tool for the CoNLL and ARRAU datasets which implements of the common evaluation metrics including MUC [Vilain et al, 1995], B-cubed [Bagga and Baldwin, 1998], CEAFe [Luo et al., 2005], LEA [Moosavi and Strube, 2016] and the averaged CoNLL score (the average of the F1 values of MUC, B-cubed and CEAFe) [Denis and Baldridge, 2009a; Pradhan et al., 2011]. This wrapper of CoVal currently only work with CoNLL line format: The CoNLL format has one word per line with all the annotation for this word in column separated by spaces: Column Type Description 1 Document ID This is a variation on the document filename 2 Part number Some files are divided into multiple parts numbered as 000, 001, 002, ... etc. 3 Word number 4 Word itself This is the token as segmented/tokenized in the Treebank. Initially the *_skel file contain the placeholder [WORD] which gets replaced by the actual token from the Treebank which is part of the OntoNotes release. 5 Part-of-Speech 6 Parse bit This is the bracketed structure broken before the first open parenthesis in the parse, and the word/part-of-speech leaf replaced with a *. The full parse can be created by substituting the asterix with the "([pos] [word])" string (or leaf) and concatenating the items in the rows of that column. 7 Predicate lemma The predicate lemma is mentioned for the rows for which we have semantic role information. All other rows are marked with a "-" 8 Predicate Frameset ID This is the PropBank frameset ID of the predicate in Column 7. 9 Word sense This is the word sense of the word in Column 3. 10 Speaker/Author This is the speaker or author name where available. Mostly in Broadcast Conversation and Web Log data. 11 Named Entities These columns identifies the spans representing various named entities. 12:N Predicate Arguments There is one column each of predicate argument structure information for the predicate mentioned in Column 7. N Coreference Coreference chain information encoded in a parenthesis structure. More informations on the format can be found here (section "*_conll File Format"): Details on the evaluation on CoNLL can be found here: CoVal code was written by @ns-moosavi. Some parts are borrowed from The test suite is taken from Mention evaluation and the test suite are added by @andreasvc. Parsing CoNLL files is developed by Leo Born.

How to load this metric directly with the datasets library:

from datasets import load_metric
metric = load_metric("coval")


    author = { Nafise Sadat Moosavi, Leo Born, Massimo Poesio and Michael Strube},
    title = {Using Automatically Extracted Minimum Spans to Disentangle Coreference Evaluation from Boundary Detection},
    year = {2019},
    booktitle = {Proceedings of the 57th Annual Meeting of
        the Association for Computational Linguistics (Volume 1: Long Papers)},
    publisher = {Association for Computational Linguistics},
    address = {Florence, Italy},

author = {Vilain, Marc and Burger, John and Aberdeen, John and Connolly, Dennis and Hirschman, Lynette},
title = {A Model-Theoretic Coreference Scoring Scheme},
year = {1995},
isbn = {1558604022},
publisher = {Association for Computational Linguistics},
address = {USA},
url = {},
doi = {10.3115/1072399.1072405},
booktitle = {Proceedings of the 6th Conference on Message Understanding},
pages = {45–52},
numpages = {8},
location = {Columbia, Maryland},
series = {MUC6 ’95}

    author = {Amit Bagga and Breck Baldwin},
    title = {Algorithms for Scoring Coreference Chains},
    booktitle = {In The First International Conference on Language Resources and Evaluation Workshop on Linguistics Coreference},
    year = {1998},
    pages = {563--566}

    author = {Xiaoqiang Luo},
    title = {On coreference resolution performance metrics},
    booktitle = {In Proc. of HLT/EMNLP},
    year = {2005},
    pages = {25--32},
    publisher = {URL}

    title = "Which Coreference Evaluation Metric Do You Trust? A Proposal for a Link-based Entity Aware Metric",
    author = "Moosavi, Nafise Sadat  and
      Strube, Michael",
    booktitle = "Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)",
    month = aug,
    year = "2016",
    address = "Berlin, Germany",
    publisher = "Association for Computational Linguistics",
    url = "",
    doi = "10.18653/v1/P16-1060",
    pages = "632--642",