Spaces: bardsai/performance-llm-board

piotr-szleg-bards-ai committed on Mar 8, 2024

Commit e137c10

1 Parent(s): 446174f

2024-03-08 15:42:11 Publish script update

Files changed (1)

app_constants.py CHANGED

@@ -5,7 +5,7 @@ This project compares different large language models and their providers for re
 While other benchmarks compare LLMs on different human intelligence tasks this benchmark focus on features related to business and engineering aspects such as response times, pricing and data streaming capabilities.
 To preform evaluation we chose a task of newspaper articles summarization from [GEM/xlsum](https://huggingface.co/datasets/GEM/xlsum) dataset as it represents a very standard type of task where model has to understand unstructured natural language text, process it and output text in a specified format.
-For this version we chose English and Japanese languages, with Japanese representing languages using logographic alphabets. This enable us also validate the effectiveness of the LLM for different language groups.
+For this version we chose English, Ukrainian and Japanese languages, with Japanese representing languages using logographic alphabets. This enable us also validate the effectiveness of the LLM for different language groups.
 Each of the models was asked to summarize the text using the following prompt: