Spaces: ludwigstumpp/llm-leaderboard

Ludwig Stumpp committed on May 18, 2023

Commit b199af5 • 1 Parent(s): b75e1d2

Add alpaca 7b model

Files changed (1)

README.md +1 -0

@@ -12,6 +12,7 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
 | Model Name                                                                                                  | Publisher           | Open? | Chatbot Arena Elo                                | HellaSwag (few-shot)                                                 | HellaSwag (zero-shot)                                              | HellaSwag (one-shot)                                            | HumanEval-Python (pass@1)                                                       | LAMBADA (zero-shot)                           | LAMBADA (one-shot)                                              | MMLU (zero-shot)                                                                         | MMLU (few-shot)                                                      | TriviaQA (zero-shot)                          | TriviaQA (one-shot)                                             | WinoGrande (zero-shot)                                             | WinoGrande (one-shot)                                           | WinoGrande (few-shot)                                           |
 | ----------------------------------------------------------------------------------------------------------- | ------------------- | ----- | ------------------------------------------------ | -------------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | ------------------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | --------------------------------------------------------------- |
+| [alpaca-7b](https://crfm.stanford.edu/2023/03/13/alpaca.html)                                               | Stanford            | no    |                                                  |                                                                      | [0.739](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) |                                                                 |                                                                                 |                                               |                                                                 |                                                                                          |                                                                      |                                               |                                                                 | [0.661](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) |                                                                 |                                                                 |
 | [alpaca-13b](https://crfm.stanford.edu/2023/03/13/alpaca.html)                                              | Stanford            | no    | [1008](https://lmsys.org/blog/2023-05-03-arena/) |                                                                      |                                                                    |                                                                 |                                                                                 |                                               |                                                                 |                                                                                          |                                                                      |                                               |                                                                 |                                                                    |                                                                 |                                                                 |
 | [bloom-176b](https://huggingface.co/bigscience/bloom)                                                       | BigScience          | yes   |                                                  | [0.744](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) |                                                                    |                                                                 | [0.155](https://huggingface.co/bigscience/bloom#results)                        |                                               |                                                                 | [0.299](https://crfm.stanford.edu/helm/latest/?group=core_scenarios)                     |                                                                      |                                               |                                                                 |                                                                    |                                                                 |                                                                 |
 | [cerebras-gpt-7b](https://huggingface.co/cerebras/Cerebras-GPT-6.7B)                                        | Cerebras            | yes   |                                                  |                                                                      | [0.636](https://www.mosaicml.com/blog/mpt-7b)                      |                                                                 |                                                                                 | [0.636](https://www.mosaicml.com/blog/mpt-7b) |                                                                 | [0.259](https://www.mosaicml.com/blog/mpt-7b)                                            |                                                                      | [0.141](https://www.mosaicml.com/blog/mpt-7b) |                                                                 |                                                                    |                                                                 |                                                                 |