Ludwig Stumpp committed on
Commit
265c39e
1 Parent(s): a011af1

Add koala results on HellaSwag and WinoGrande zero-shot

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -35,7 +35,7 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
  | [gpt-4](https://arxiv.org/abs/2303.08774v3) | OpenAI | no | | [0.953](https://arxiv.org/abs/2303.08774v3) | | | [0.670](https://arxiv.org/abs/2303.08774v3) | | | | [0.864](https://arxiv.org/abs/2303.08774v3) | | | | | [0.875](https://arxiv.org/abs/2303.08774v3) |
  | [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) | EleutherAI | yes | | [0.718](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.719](https://www.mosaicml.com/blog/mpt-7b) | | | [0.719](https://www.mosaicml.com/blog/mpt-7b) | | [0.269](https://www.mosaicml.com/blog/mpt-7b) | [0.276](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.347](https://www.mosaicml.com/blog/mpt-7b) | | | | |
  | [gpt-j-6b](https://huggingface.co/EleutherAI/gpt-j-6b) | EleutherAI | yes | | [0.663](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.683](https://www.mosaicml.com/blog/mpt-7b) | | | [0.683](https://www.mosaicml.com/blog/mpt-7b) | | [0.261](https://www.mosaicml.com/blog/mpt-7b) | [0.249](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.234](https://www.mosaicml.com/blog/mpt-7b) | | | | |
- | [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | Berkeley BAIR | no | [1082](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
+ | [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | Berkeley BAIR | no | [1082](https://lmsys.org/blog/2023-05-03-arena/) | | [0.726](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | | | | | | | | [0.688](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | |
  | [llama-7b](https://arxiv.org/abs/2302.13971) | Meta AI | no | | | [0.738](https://www.mosaicml.com/blog/mpt-7b) | | [0.105](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | [0.738](https://www.mosaicml.com/blog/mpt-7b) | | [0.302](https://www.mosaicml.com/blog/mpt-7b) | | [0.443](https://www.mosaicml.com/blog/mpt-7b) | | [0.701](https://arxiv.org/abs/2302.13971v1) | | |
  | [llama-13b](https://arxiv.org/abs/2302.13971) | Meta AI | no | [932](https://lmsys.org/blog/2023-05-03-arena/) | | [0.792](https://arxiv.org/abs/2302.13971) | | [0.158](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | [0.730](https://arxiv.org/abs/2302.13971v1) | | |
  | [llama-33b](https://arxiv.org/abs/2302.13971) | Meta AI | no | | | [0.828](https://arxiv.org/abs/2302.13971) | | [0.217](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | [0.760](https://arxiv.org/abs/2302.13971v1) | | |