artena committed
Commit e50dc1b
1 Parent(s): 6456225

Upload utils.py

Files changed (1):
  1. utils.py +4 -10
utils.py CHANGED
@@ -115,7 +115,7 @@ table > tbody td:first-child {
 """
 
 LLM_BENCHMARKS_ABOUT_TEXT = f"""## Open Lithuanian LLM Leaderboard (v1.0.1)
-> The Open Lithuanian LLM Evaluation Leaderboard, developed by **Part DP AI** in collaboration with **Vilnius University NLP Lab**, provides a comprehensive benchmarking system specifically designed for Lithuanian LLMs. This leaderboard, based on the open-source [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), offers a unique platform for evaluating the performance of large language models (LLMs) on tasks that demand linguistic proficiency and technical skill in Lithuanian.
+> The Open Lithuanian LLM Evaluation Leaderboard, developed by **Neurotechnology**, provides a comprehensive benchmarking system specifically designed for Lithuanian LLMs. This leaderboard, based on the open-source [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), offers a unique platform for evaluating the performance of large language models (LLMs) on tasks that demand linguistic proficiency and technical skill in Lithuanian.
 > **Note:** This leaderboard is continuously updating its data and models, reflecting the latest developments in Lithuanian LLMs. It is currently in version 1.0.0, serving as the initial benchmark for Lithuanian LLM evaluation, with plans for future enhancements.
 ## 1. Key Features
 > 1. **Open Evaluation Access**
@@ -133,10 +133,10 @@ LLM_BENCHMARKS_ABOUT_TEXT = f"""## Open Lithuanian LLM Leaderboard (v1.0.1)
 >
 > Each dataset is available in Lithuanian, providing a robust testing ground for models in a non-English setting. The datasets collectively contain over **40k samples** across various categories such as **Common Knowledge**, **Reasoning**, **Summarization**, **Math**, and **Specialized Examinations**, offering comprehensive coverage of diverse linguistic and technical challenges.
 >
-> 3. **Open-Source Dataset Sample**
-> A sample of the evaluation dataset is hosted on [Hugging Face Datasets](https://huggingface.co/datasets/PartAI/llm-leaderboard-datasets-sample), offering the AI community a glimpse of the benchmark content and format. This sample allows developers to pre-assess their models against representative data before a full leaderboard evaluation.
+> 3. **Open-Source Dataset Collection**
+> A dataset collection for the evaluation is hosted on [Hugging Face Datasets](https://huggingface.co/collections/neurotechnology/lithuanian-evaluation-datasets-66c6da9991dced94646bfb30), offering the AI community a glimpse of the benchmark content and format. This collection allows developers to pre-assess their models against representative data before a full leaderboard evaluation.
 >
-> 4. **Collaborative Development**
+> 4. **Development**
 >
 > This leaderboard is developed by [**Neurotechnology**](https://huggingface.co/neurotechnology) and authored by [**Artūras Nakvosas**](https://huggingface.co/artena), leveraging cutting-edge industrial expertise to create a high-quality, open benchmarking tool. The project underscores a commitment to advancing Lithuanian LLMs through innovative solutions and fostering the growth of the local AI ecosystem.
 >
@@ -169,12 +169,6 @@ LLM_BENCHMARKS_SUBMIT_TEXT = """### Submitting a Model for Evaluation
 """
 
 
-PART_LOGO = """
-<img src="https://avatars.githubusercontent.com/u/39557177?v=4" style="width:30%;display:block;margin-left:auto;margin-right:auto">
-<h1 style="font-size: 28px; margin-bottom: 2px;">Part DP AI</h1>
-"""
-
-
 def load_jsonl(input_file):
     data = []
     with open(input_file, 'r') as f:
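The trailing hunk context cuts `load_jsonl` off after its first three lines. A minimal sketch of a completion consistent with that signature, assuming standard JSON Lines parsing (everything past the shown lines is an assumption, not the repository's actual code):

```python
import json

def load_jsonl(input_file):
    """Read a JSON Lines file: one JSON object per line (assumed behavior)."""
    data = []
    with open(input_file, 'r') as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines between records
                data.append(json.loads(line))
    return data
```

Loading line by line with `json.loads` keeps memory proportional to one record at a time during parsing, which is the usual reason leaderboard tooling stores per-sample results as JSONL rather than one large JSON array.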