llm-leaderboard / README.md

Commit History

Add HellaSwag Benchmark
e1aeb72

Ludwig Stumpp committed on

Align human eval format
5e1e4f6

Ludwig Stumpp committed on

Add BLOOM model
360209c

Ludwig Stumpp committed on

Add MMLU few shot
9c17477

Ludwig Stumpp committed on

Add galactica model
21aaac9

Ludwig Stumpp committed on

Rearrange and link to open-llms repo
a60d3ed

Ludwig Stumpp committed on

Align writing
a3504d1

Ludwig Stumpp committed on

Adding missing links to eval scores for MMLU task
f7cfe3e

Ludwig Stumpp committed on

Add column for commercial use + logic in streamlit app + disclaimer
5323497

Ludwig Stumpp committed on

Adding MMLU dataset and removing source table
c0dd25e

Ludwig Stumpp committed on

Add additional LAMBADA entries
53be3b4

Ludwig Stumpp committed on

Add missing model sizes
3a7dc42

Ludwig Stumpp committed on

Add codex model
f3f17e5

Ludwig Stumpp committed on

Add HumanEval and Starcoder
49b476f

Ludwig Stumpp committed on

Text work
617d84c

Ludwig Stumpp committed on

Remove links in table headers
f3a8621

Ludwig Stumpp committed on

Add links
1d376a9

Ludwig Stumpp committed on

Text work
2591e9a

Ludwig Stumpp committed on

Switch back to markdown as easier diffable
908b597

Ludwig Stumpp committed on

Move data files and specify as constant
48cd666

Ludwig Stumpp committed on

Add entry for sources
3175564

Ludwig Stumpp committed on

Add screenshot of streamlit app
d9de755

Ludwig Stumpp committed on

Update Readme
f1e50f4

Ludwig Stumpp committed on

Move from markdown table to csv table as easier to maintain for larger tables
24a15c0

Ludwig Stumpp committed on

Update
37bf1e8

Ludwig Stumpp committed on

First entries and streamlit app
697be1a

Ludwig Stumpp committed on

Add demo leaderboard for testing
bc01ae8

Ludwig Stumpp committed on

Initial commit
90c74c1

Ludwig Stumpp committed on