Ludwig Stumpp committed
Commit b199af5
1 Parent(s): b75e1d2

Add alpaca 7b model

Files changed (1)
  1. README.md +1 -0
README.md CHANGED
@@ -12,6 +12,7 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
 
 | Model Name | Publisher | Open? | Chatbot Arena Elo | HellaSwag (few-shot) | HellaSwag (zero-shot) | HellaSwag (one-shot) | HumanEval-Python (pass@1) | LAMBADA (zero-shot) | LAMBADA (one-shot) | MMLU (zero-shot) | MMLU (few-shot) | TriviaQA (zero-shot) | TriviaQA (one-shot) | WinoGrande (zero-shot) | WinoGrande (one-shot) | WinoGrande (few-shot) |
 | ----------------------------------------------------------------------------------------------------------- | ------------------- | ----- | ------------------------------------------------ | -------------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | ------------------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------------------- | --------------------------------------------- | --------------------------------------------------------------- | ------------------------------------------------------------------ | --------------------------------------------------------------- | --------------------------------------------------------------- |
+| [alpaca-7b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | Stanford | no | | | [0.739](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | | | | | | | | [0.661](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | |
 | [alpaca-13b](https://crfm.stanford.edu/2023/03/13/alpaca.html) | Stanford | no | [1008](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
 | [bloom-176b](https://huggingface.co/bigscience/bloom) | BigScience | yes | | [0.744](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | | | [0.155](https://huggingface.co/bigscience/bloom#results) | | | [0.299](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | | | | | | |
 | [cerebras-gpt-7b](https://huggingface.co/cerebras/Cerebras-GPT-6.7B) | Cerebras | yes | | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | | | [0.636](https://www.mosaicml.com/blog/mpt-7b) | | [0.259](https://www.mosaicml.com/blog/mpt-7b) | | [0.141](https://www.mosaicml.com/blog/mpt-7b) | | | | |