|
TITLE = '<h1 align="center" id="space-title">African Languages LLM Eval Leaderboard</h1>' |
|
|
|
INTRO_TEXT = """
|
## About |
|
This leaderboard tracks progress and ranks the performance of large language models (LLMs) on African languages.
|
This project uses EleutherAI's [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) for evaluation, focusing on African-language tasks. Some of the tasks on this leaderboard are already built into the harness (e.g., the IrokoBench tasks and Belebele).
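
A quick way to check which of these tasks ship with your installed version of the harness is sketched below (a minimal sketch, assuming the `TaskManager` API of lm-eval 0.4+; the task-name prefixes are illustrative):

```python
# Minimal sketch: list harness tasks whose names suggest African-language
# benchmarks. Assumes lm-eval 0.4+'s TaskManager; prefixes are illustrative.
from lm_eval.tasks import TaskManager

task_manager = TaskManager()
african_tasks = [
    name
    for name in task_manager.all_tasks
    if name.startswith(("afrimmlu", "afrixnli", "belebele"))
]
print(african_tasks)
```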
|
|
|
We currently evaluate models on the following benchmarks:

- <a href="https://arxiv.org/abs/2406.03368" target="_blank">IrokoBench</a>
  - AfriMMLU (Direct and Translate, 0-shot)
  - AfriXNLI (Direct and Translate, 0-shot)
- <a href="https://aclanthology.org/2024.acl-long.44" target="_blank">Belebele</a>
- <a href="https://arxiv.org/abs/2305.06897" target="_blank">AfriQA</a>
|
""" |
|
|
|
HOW_TO = """
|
## How to list your model's performance on this leaderboard
|
Run the evaluation of your model using this repository: <a href="https://github.com/The-African-Research-Collective/jubilant-sniffle" target="_blank">https://github.com/The-African-Research-Collective/jubilant-sniffle</a>.

Then push the evaluation log and open a pull request.
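
For reference, a run might look roughly like the sketch below, which uses the harness's Python API (assuming lm-eval 0.4+'s `simple_evaluate`; the model id and task name are placeholders, and the repository above defines the exact tasks and log format):

```python
# Minimal sketch of an evaluation run; placeholders, not the exact
# leaderboard configuration. Assumes lm-eval 0.4+'s simple_evaluate.
import json

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=your-org/your-model",  # placeholder model id
    tasks=["afrimmlu_direct_yor"],                # placeholder task name
    num_fewshot=0,                                # leaderboard tasks are 0-shot
)

# Save the evaluation log to attach to your pull request.
with open("results.json", "w") as f:
    json.dump(results["results"], f, indent=2, default=str)
```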
|
""" |
|
|
|
CREDIT = """
|
## Credit |
|
This leaderboard builds on the following resources:

- Datasets (IrokoBench, AfriQA, Belebele)
- Evaluation code (EleutherAI's lm-evaluation-harness)
- Leaderboard code (UONLP's Open Multilingual LLM Evaluation Leaderboard)
|
""" |
|
|
|
|
|
CITATION = """
## Citation

```
@misc{tarescollmbenchmark,
  author = {tbd...},
  title  = {tbd...},
  year   = {2024}
}
```
|
""" |