Upload content.py
content.py CHANGED (+7 -32)
@@ -1,57 +1,32 @@
-TITLE = '<h1 align="center" id="space-title">Open Multilingual LLM Evaluation Leaderboard</h1>'
+TITLE = '<h1 align="center" id="space-title">Open Multilingual Basque LLM Evaluation Leaderboard</h1><img src="basque.JPG">'
 
 INTRO_TEXT = f"""
 ## About
-
 This leaderboard tracks progress and ranks performance of large language models (LLMs) developed for different languages,
 emphasizing non-English languages to democratize the benefits of LLMs to broader society.
-Our current leaderboard provides evaluation data for
-Arabic, Armenian, Basque, Bengali, Catalan, Chinese, Croatian, Danish, Dutch,
-French, German, Gujarati, Hindi, Hungarian, Indonesian, Italian, Kannada, Malayalam,
-Marathi, Nepali, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish,
-Tamil, Telugu, Ukrainian, and Vietnamese, that will be expanded along the way.
-Both multilingual and language-specific LLMs are welcome in this leaderboard.
-We currently evaluate models over four benchmarks:
-
-- <a href="https://arxiv.org/abs/1803.05457" target="_blank"> AI2 Reasoning Challenge </a> (25-shot)
-- <a href="https://arxiv.org/abs/1905.07830" target="_blank"> HellaSwag </a> (0-shot)
-- <a href="https://arxiv.org/abs/2009.03300" target="_blank"> MMLU </a> (25-shot)
-- <a href="https://arxiv.org/abs/2109.07958" target="_blank"> TruthfulQA </a> (0-shot)
-
-The evaluation data was translated into these languages using ChatGPT (gpt-35-turbo).
-
+Our current leaderboard provides evaluation data for Basque.
 """
 
 HOW_TO = f"""
 ## How to list your model performance on this leaderboard:
-
-Run the evaluation of your model using this repo: <a href="https://github.com/nlp-uoregon/mlmm-evaluation" target="_blank">https://github.com/nlp-uoregon/mlmm-evaluation</a>.
-
+Run the evaluation of your model using this repo: <a href="https://github.com/webdevserv/mlmm_basque_evaluation" target="_blank">mlmm_basque_evaluation</a>.
 Then push the evaluation log and make a pull request.
 """
 
 CREDIT = f"""
 ## Credit
-
 To make this website, we use the following resources:
-
-- Datasets (AI2_ARC, HellaSwag, MMLU, TruthfulQA)
-- Funding and GPU access (Adobe Research)
-- Evaluation code (EleutherAI's lm_evaluation_harness repo)
 - Leaderboard code (Huggingface4's open_llm_leaderboard repo)
-
 """
 
 
 CITATION = f"""
 ## Citation
-
 ```
-
 @misc{{lai2023openllmbenchmark,
-    author = {{Viet Lai and Nghia Trung Ngo and Amir Pouran Ben Veyseh and Franck Dernoncourt and Thien Huu Nguyen}},
-    title={{Open Multilingual LLM Evaluation Leaderboard}},
-    year={{2023}}
+    author = {{Idoia Lertxundi, with thanks to Viet Lai and Nghia Trung Ngo and Amir Pouran Ben Veyseh and Franck Dernoncourt and Thien Huu Nguyen}},
+    title={{Open Basque LLM Evaluation Leaderboard}},
+    year={{2024}}
 }}
 ```
-"""
+"""
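
For context on how a file like this is consumed: a leaderboard Space's app.py typically imports these constants and renders them as Gradio components. app.py is not part of this commit, so the wiring below is a minimal sketch under that assumption, not the Space's actual code.

```python
# Sketch of how content.py's constants are commonly rendered in a
# leaderboard Space. The layout here is assumed; only the constant
# names come from the diff above.
import gradio as gr

from content import TITLE, INTRO_TEXT, HOW_TO, CREDIT, CITATION

with gr.Blocks() as demo:
    gr.HTML(TITLE)           # raw HTML heading, plus the basque.JPG image tag
    gr.Markdown(INTRO_TEXT)  # "## About" section
    gr.Markdown(HOW_TO)      # submission instructions
    gr.Markdown(CREDIT)
    gr.Markdown(CITATION)

demo.launch()
```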
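The HOW_TO text asks submitters to push an evaluation log and make a pull request. One way to do that against a Hugging Face repo is huggingface_hub's `upload_file` with `create_pr=True`; this is a hedged sketch, since the commit does not show where logs are expected to live, and the repo id and file paths below are placeholders.

```python
# Hypothetical submission step for the workflow described in HOW_TO:
# upload an evaluation log and open a pull request in one call.
# repo_id and both paths are placeholders, not values from this commit.
from huggingface_hub import HfApi

api = HfApi()  # requires `huggingface-cli login` or an HF_TOKEN env var
api.upload_file(
    path_or_fileobj="results/my-model/eval_log.json",  # local eval output (placeholder)
    path_in_repo="evals/my-model/eval_log.json",       # destination path (placeholder)
    repo_id="your-username/your-leaderboard-space",    # placeholder Space id
    repo_type="space",
    create_pr=True,  # open a PR instead of committing directly to main
)
```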
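One detail worth noting in the CITATION block: because the string is an f-string, every literal brace in the BibTeX entry must be doubled, which is why the diff shows `@misc{{...}}` rather than `@misc{...}`. A minimal demonstration:

```python
# Inside an f-string, "{{" renders as a literal "{" and "}}" as "}".
entry = f"""@misc{{lai2023openllmbenchmark,
    year={{2024}}
}}"""
print(entry)
# Output:
# @misc{lai2023openllmbenchmark,
#     year={2024}
# }
```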