Need for benchmark dataset for each language

#1
by AhmadHakami - opened

I would be grateful if you could add a benchmark dataset for each language. This would make it much easier to evaluate results.

I appreciate your suggestion to include benchmark datasets for each language. Currently, the leaderboard primarily focuses on English speech recognition. However, I want to assure you that there are plans to expand its scope to encompass multilingual evaluation in future versions.

The leaderboard states:

The leaderboard currently focuses on English speech recognition, and will be expanded to multilingual evaluation in later versions.

As the leaderboard evolves, incorporating benchmark datasets for various languages will indeed enhance its utility in evaluating results across a broader linguistic spectrum. Your input is valuable, and we look forward to future iterations that will provide a more comprehensive assessment of ASR models across multiple languages.

Hugging Face for Audio org

Hi @AhmadHakami - This is a brilliant suggestion! We are working towards it (although quite slowly). We want to make this happen soon.
A leaderboard's biggest hurdle is finding a bouquet of evaluation datasets, especially ones that a given model was not trained on.

Do you have any suggestions on possible datasets that are generalised across languages? 🤗

We previously had a leaderboard for other languages. Where is it now? @reach-vb

Hugging Face for Audio org

Hi @RASMUS - Sorry for the delayed response here. The previous leaderboard evaluated only on the Common Voice (CV) split, which is kind of pointless here since the majority of multilingual models are trained on the same CV splits. Common Voice also shuffles the dataset from one release to the next, so there is quite significant test data leakage.
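
To make the leakage concern concrete, here is a minimal sketch of how one might check for overlap between the training split of one release and the test split of another. It assumes each clip carries a stable identifier that can be compared across releases, which this thread does not confirm; the function name `leakage_report` and the clip identifiers below are hypothetical and for illustration only.

```python
def leakage_report(train_ids, test_ids):
    """Return the clip IDs present in both splits and the leaked share of the test split."""
    train, test = set(train_ids), set(test_ids)
    leaked = train & test  # clips a model may have seen during training
    fraction = len(leaked) / len(test) if test else 0.0
    return sorted(leaked), fraction


# Hypothetical identifiers, not taken from any real release.
old_release_train = ["clip_001", "clip_002", "clip_003", "clip_004"]
new_release_test = ["clip_003", "clip_004", "clip_005", "clip_006"]

leaked, fraction = leakage_report(old_release_train, new_release_test)
print(f"{len(leaked)} of {len(new_release_test)} test clips overlap with the older train split ({fraction:.0%})")
```

A non-zero overlap would mean that scores on that test split overstate how well a model generalises, which is why a held-out multilingual benchmark needs data the evaluated models have not been trained on.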

Still waiting for a multilingual benchmark...
