I just noticed that multiple pull requests have been created for the same dataset and metrics.
Here is another example: https://huggingface.co/google/bigbird-pegasus-large-arxiv/discussions
Thanks for reporting this issue, @grapplerulrich!
As far as I can tell, most of these duplicates occur when one or more users submit concurrent evaluation jobs. That is, the same evaluation configuration is submitted again before an existing job has completed. As you point out, this mostly affects summarization evaluations, since these take quite a long time to complete.
We'll investigate whether it's possible to track concurrent jobs in the UI, and notify users accordingly.
In the meantime, I've manually merged / closed the appropriate PRs on the repos you listed 🤗
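
For illustration only, here is a rough sketch of how a check for concurrent jobs could work: treat the evaluation configuration as a key and refuse a new submission while an identical one is still in flight. This is not how the evaluation service is actually implemented; `EvalConfig`, `JobTracker`, and all field names are hypothetical and chosen just for this example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class EvalConfig:
    # Hypothetical fields; the real evaluation configuration may differ.
    model_id: str
    dataset: str
    config: str
    split: str
    metrics: tuple[str, ...] = ()


@dataclass
class JobTracker:
    # Maps each in-flight configuration to the user who submitted it.
    in_flight: dict[EvalConfig, str] = field(default_factory=dict)

    def submit(self, cfg: EvalConfig, user: str) -> bool:
        """Register a job; return False if an identical one is still running."""
        if cfg in self.in_flight:
            return False
        self.in_flight[cfg] = user
        return True

    def complete(self, cfg: EvalConfig) -> None:
        """Mark a job as finished so the same configuration can be resubmitted."""
        self.in_flight.pop(cfg, None)
```

With something like this, a second submission of the same configuration would be rejected (and the user could be notified) until `complete` is called for the first one, so only one PR per configuration would be opened at a time.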