TITLE = '<h1 align="center" id="space-title">Open Multilingual LLM Evaluation Leaderboard</h1>'
INTRO_TEXT = f"""
## About
This leaderboard shows the performance of pretrained models in 31 languages (Arabic, Armenian, Basque, Bengali, Catalan, Chinese, Croatian, Danish, Dutch, French, German, Gujarati, Hindi, Hungarian, Indonesian, Italian, Kannada, Malayalam, Marathi, Nepali, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Tamil, Telugu, Ukrainian, and Vietnamese) on four benchmarks:
- <a href="https://arxiv.org/abs/1803.05457" target="_blank"> AI2 Reasoning Challenge </a> (25-shot)
- <a href="https://arxiv.org/abs/1905.07830" target="_blank"> HellaSwag </a> (10-shot)
- <a href="https://arxiv.org/abs/2009.03300" target="_blank"> MMLU </a> (5-shot)
- <a href="https://arxiv.org/abs/2109.07958" target="_blank"> TruthfulQA </a> (0-shot)

The evaluation data for each benchmark was translated into these languages using ChatGPT.
"""
HOW_TO = f"""
## How to list your model's performance on this leaderboard
Send an email with the subject [Open mLLM Leaderboard] to vietl@uoregon.edu, including your model's Hugging Face name.
We will run your model on the four benchmarks and add the results to the leaderboard.
"""
CREDIT = f"""
## Credit
This leaderboard was built using the following resources:
- Datasets (AI2_ARC, HellaSwag, MMLU, TruthfulQA)
- Funding and GPU access (Adobe Research)
- Evaluation code (EleutherAI's lm-evaluation-harness repo)
- Leaderboard code (HuggingFaceH4's open_llm_leaderboard repo)
"""
CITATION = f"""
## Citation
```
@misc{{lai2023openllmbenchmark,
  author = {{Viet Lai and Nghia Trung Ngo and Amir Pouran Ben Veyseh and Franck Dernoncourt and Thien Huu Nguyen}},
  title = {{Open Multilingual LLM Evaluation Leaderboard}},
  year = {{2023}}
}}
```
"""