Ludwig Stumpp committed
Commit: b199af5
1 parent: b75e1d2
Add alpaca 7b model
README.md
CHANGED
@@ -12,6 +12,7 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
 
 | Model Name | Publisher | Open? | Chatbot Arena Elo | HellaSwag (few-shot) | HellaSwag (zero-shot) | HellaSwag (one-shot) | HumanEval-Python (pass@1) | LAMBADA (zero-shot) | LAMBADA (one-shot) | MMLU (zero-shot) | MMLU (few-shot) | TriviaQA (zero-shot) | TriviaQA (one-shot) | WinoGrande (zero-shot) | WinoGrande (one-shot) | WinoGrande (few-shot) |
 | ----------------------------------------------------------------------------------------------------------- | ------------------- | ----- | ------------------------------------------------ | -------------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | ------------------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | --------------------------------------------------------------- |
+| [alpaca-7b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | Stanford | no | | | [0.739](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | | | | | | | | [0.661](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | |
 | [alpaca-13b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | Stanford | no | [1008](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
 | [bloom-176b](https://huggingface.co/bigscience/bloom) | BigScience | yes | | [0.744](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | | | [0.155](https://huggingface.co/bigscience/bloom#results) | | | [0.299](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | | | | | | |
 | [cerebras-gpt-7b](https://huggingface.co/cerebras/Cerebras-GPT-6.7B) | Cerebras | yes | | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | | [0.259](https://www.mosaicml.com/blog/mpt-7b) | | [0.141](https://www.mosaicml.com/blog/mpt-7b) | | | | |