Possibly include multi-lingual benchmarks like C-Eval and XCopa

#29
by yaofu - opened

This echoes the discussion in
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/24
about adding multilingual evaluation.

I would like to recommend C-Eval, a good Chinese knowledge evaluation suite similar to MMLU
https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard/discussions/24

as well as XCOPA, a good multilingual commonsense reasoning benchmark used in the PaLM 2 evaluation
https://github.com/cambridgeltl/xcopa

Would be awesome if open-llm-leaderboard could include these datasets!

@clefourrier Will you add this, or can this be closed?

Hugging Face H4 org

Hi!
We won't add new multilingual evals to the Open LLM Leaderboard, but anyone wanting to start a multilingual leaderboard can ping me in this thread if needed. I'll close in the meantime.

clefourrier changed discussion status to closed
