Commits · ludwigstumpp/llm-leaderboard

Add text-davinci-003 results on HellaSwag and WinoGrande zero-shot

15b03fa

Ludwig Stumpp committed on May 18, 2023

Add koala results on HellaSwag and WinoGrande zero-shot

265c39e

Ludwig Stumpp committed on May 18, 2023

Add stablelm results on HellaSwag and WinoGrande zero-shot

a011af1

Ludwig Stumpp committed on May 18, 2023

Add oasst/pythia-12b HellaSwag and WinoGrande zero-shot results

12a4ec3

Ludwig Stumpp committed on May 18, 2023

Add Pythia models WinoGrande (zero shot)

a10f910

Ludwig Stumpp committed on May 18, 2023

Add alpaca 7b model

b199af5

Ludwig Stumpp committed on May 18, 2023

Add dolly-v2-12b results

b75e1d2

Ludwig Stumpp committed on May 18, 2023

Remove prompted StarCoder

205deb7

Ludwig Stumpp committed on May 16, 2023

Notes on definition of "open" model

2322286

Ludwig Stumpp committed on May 16, 2023

Remove codeT results for code-davinci-002 as not comparable to other HumanEval results, due to additional explicit testing of outputs

72edf21

Ludwig Stumpp committed on May 16, 2023

Add link to hf space

b7e4ee9

Ludwig Stumpp committed on May 11, 2023

Replace commercial column with open

1c52cdd

Ludwig Stumpp committed on May 11, 2023

Add WinoGrande zero-shot and results

f452fea

Ludwig Stumpp committed on May 11, 2023

Add WinoGrande few shot results for gpt4 and 3.5

eedd6a6

Ludwig Stumpp committed on May 11, 2023

Shown values in categorical filter now sorted

9770a07

Ludwig Stumpp committed on May 10, 2023

For now set PaLM2 commercial use to no until clear

f3fd684

Ludwig Stumpp committed on May 10, 2023

Starting to add PaLM2 benchmark results

fe8088e

Ludwig Stumpp committed on May 10, 2023

Add download button for table + update todos

ea40e33

Ludwig Stumpp committed on May 10, 2023

Add column for publisher

9d7638e

Ludwig Stumpp committed on May 10, 2023

Add further results from HELM

8f06941

Ludwig Stumpp committed on May 10, 2023

Add / modify gpt models according to HELM benchmark

4373b29

Ludwig Stumpp committed on May 10, 2023

Clarifying gpt model names

669c882

Ludwig Stumpp committed on May 10, 2023

Text work

84a7c6d

Ludwig Stumpp committed on May 8, 2023

Fix GPT-3 commercial use

4d54a13

Ludwig Stumpp committed on May 8, 2023

Add special thanks

d9a0906

Ludwig Stumpp committed on May 8, 2023

Add replit code

8c37256

Ludwig Stumpp committed on May 8, 2023

Add HellaSwag few shot

9e47a75

Ludwig Stumpp committed on May 8, 2023

Add llama results on hellaswag zero shot

6147ea1

Ludwig Stumpp committed on May 8, 2023

Add HellaSwag Benchmark

e1aeb72

Ludwig Stumpp committed on May 8, 2023

Align human eval format

5e1e4f6

Ludwig Stumpp committed on May 8, 2023

Add BLOOM model

360209c

Ludwig Stumpp committed on May 8, 2023

Add MMLU few shot

9c17477

Ludwig Stumpp committed on May 8, 2023

Add galactica model

21aaac9

Ludwig Stumpp committed on May 8, 2023

Rearrange and link to open-llms repo

a60d3ed

Ludwig Stumpp committed on May 8, 2023