Commit e137c10 · 1 Parent(s): 446174f
2024-03-08 15:42:11 Publish script update
app_constants.py CHANGED +1 -1
@@ -5,7 +5,7 @@ This project compares different large language models and their providers for re
 While other benchmarks compare LLMs on different human intelligence tasks this benchmark focus on features related to business and engineering aspects such as response times, pricing and data streaming capabilities.
 
 To preform evaluation we chose a task of newspaper articles summarization from [GEM/xlsum](https://huggingface.co/datasets/GEM/xlsum) dataset as it represents a very standard type of task where model has to understand unstructured natural language text, process it and output text in a specified format.
-For this version we chose English and Japanese languages, with Japanese representing languages using logographic alphabets. This enable us also validate the effectiveness of the LLM for different language groups.
+For this version we chose English, Ukrainian and Japanese languages, with Japanese representing languages using logographic alphabets. This enable us also validate the effectiveness of the LLM for different language groups.
 
 Each of the models was asked to summarize the text using the following prompt:
 