artena committed
Commit e50dc1b
1 Parent(s): 6456225

Upload utils.py

Files changed (1):
  1. utils.py +4 -10
utils.py CHANGED
@@ -115,7 +115,7 @@ table > tbody td:first-child {
 """
 
 LLM_BENCHMARKS_ABOUT_TEXT = f"""## Open Lithuanian LLM Leaderboard (v1.0.1)
-> The Open Lithuanian LLM Evaluation Leaderboard, developed by **Part DP AI** in collaboration with **Vilnius University NLP Lab**, provides a comprehensive benchmarking system specifically designed for Lithuanian LLMs. This leaderboard, based on the open-source [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), offers a unique platform for evaluating the performance of large language models (LLMs) on tasks that demand linguistic proficiency and technical skill in Lithuanian.
+> The Open Lithuanian LLM Evaluation Leaderboard, developed by **Neurotechnology**, provides a comprehensive benchmarking system specifically designed for Lithuanian LLMs. This leaderboard, based on the open-source [LM Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness), offers a unique platform for evaluating the performance of large language models (LLMs) on tasks that demand linguistic proficiency and technical skill in Lithuanian.
 > **Note:** This leaderboard is continuously updating its data and models, reflecting the latest developments in Lithuanian LLMs. It is currently in version 1.0.0, serving as the initial benchmark for Lithuanian LLM evaluation, with plans for future enhancements.
 ## 1. Key Features
 > 1. **Open Evaluation Access**
@@ -133,10 +133,10 @@ LLM_BENCHMARKS_ABOUT_TEXT = f"""## Open Lithuanian LLM Leaderboard (v1.0.1)
 >
 > Each dataset is available in Lithuanian, providing a robust testing ground for models in a non-English setting. The datasets collectively contain over **40k samples** across various categories such as **Common Knowledge**, **Reasoning**, **Summarization**, **Math**, and **Specialized Examinations**, offering comprehensive coverage of diverse linguistic and technical challenges.
 >
-> 3. **Open-Source Dataset Sample**
-> A sample of the evaluation dataset is hosted on [Hugging Face Datasets](https://huggingface.co/datasets/PartAI/llm-leaderboard-datasets-sample), offering the AI community a glimpse of the benchmark content and format. This sample allows developers to pre-assess their models against representative data before a full leaderboard evaluation.
+> 3. **Open-Source Dataset Collection**
+> A dataset collection for the evaluation is hosted on [Hugging Face Datasets](https://huggingface.co/collections/neurotechnology/lithuanian-evaluation-datasets-66c6da9991dced94646bfb30), offering the AI community a glimpse of the benchmark content and format. This collection allows developers to pre-assess their models against representative data before a full leaderboard evaluation.
 >
-> 4. **Collaborative Development**
+> 4. **Development**
 >
 > This leaderboard is developed by [**Neurotechnology**](https://huggingface.co/neurotechnology) and authored by [**Artūras Nakvosas**](https://huggingface.co/artena), leveraging cutting-edge industrial expertise to create a high-quality, open benchmarking tool. The project underscores a commitment to advancing Lithuanian LLMs through innovative solutions and fostering the growth of the local AI ecosystem.
 >
@@ -169,12 +169,6 @@ LLM_BENCHMARKS_SUBMIT_TEXT = """### Submitting a Model for Evaluation
 """
 
 
-PART_LOGO = """
-<img src="https://avatars.githubusercontent.com/u/39557177?v=4" style="width:30%;display:block;margin-left:auto;margin-right:auto">
-<h1 style="font-size: 28px; margin-bottom: 2px;">Part DP AI</h1>
-"""
-
-
 def load_jsonl(input_file):
     data = []
     with open(input_file, 'r') as f:
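The trailing hunk context cuts `load_jsonl` off after its first three lines. A minimal sketch of a completion consistent with that signature, assuming standard JSON Lines parsing (everything past the shown lines is an assumption, not the repository's actual code):

```python
import json

def load_jsonl(input_file):
    """Read a JSON Lines file: one JSON object per line (assumed behavior)."""
    data = []
    with open(input_file, 'r') as f:
        for line in f:
            line = line.strip()
            if line:  # tolerate blank lines between records
                data.append(json.loads(line))
    return data
```

Loading line by line with `json.loads` keeps memory proportional to one record at a time during parsing, which is the usual reason leaderboard tooling stores per-sample results as JSONL rather than one large JSON array.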