lightblue/suzume-llama-3-8B-multilingual

ptrdvn committed on Apr 25, 2024

Commit 06e7cf7 · verified · 1 Parent(s): 65972c2

Update README.md

Files changed (1): README.md (+2 -0)

@@ -62,11 +62,13 @@ We achieve the following MT-Bench scores across 6 languages:
 | **Russian** 🇷🇺  | NaN                                     | 8.19                                         | 8.28                              | 7.94              |
 | **Chinese** 🇨🇳  | NaN                                     | 7.11                                         | 6.97                              | 7.55              |
 | **English** 🇺🇸  | 7.98                                    | 7.73                                         | 7.92                              | 8.26              |
+(Note the Russian scores exclude code, reasoning and math problems due to not having any translated reference answers for these questions.)
 We observe minimal degredation of Llama 3's English ability while achieving best-in-class multilingual abilities compared to the top rated 7B model ([Nexusflow/Starling-LM-7B-beta](https://huggingface.co/Nexusflow/Starling-LM-7B-beta)) on the [Chatbot Arena Leaderboard](https://chat.lmsys.org/?leaderboard).
 [Here is our evaluation script.](https://drive.google.com/file/d/15HPn7452t8LbTD9HKSl7ngYYWnsoOG08/view?usp=sharing)
 # Training data
 We train on three sources of data to create this model: