Model Leaderboards - a davidberenstein1957 Collection

Model Leaderboards

updated Jan 22

5.65k

MTEB Leaderboard

🥇

Embedding Leaderboard

366

Reward Bench Leaderboard

📐

Explore and analyze RewardBench leaderboard data

13.1k

Open LLM Leaderboard

🏆

Track, rank and evaluate open LLMs and chatbots

4.38k

Chatbot Arena Leaderboard

🏆

Display chatbot performance leaderboard

1.29k

Big Code Models Leaderboard

📈

Submit code models for evaluation on benchmarks

222

AI2 WildBench Leaderboard (V2)

🦁

Display and explore model leaderboards and chat history

753

Open VLM Leaderboard

🌎

VLMEvalKit Evaluation Results Collection

206

BigCodeBench Leaderboard

🥇

Explore and analyze code evaluation data

491

LLM-Perf Leaderboard

🏆

Explore LLM performance across hardware

106

MTEB Arena

⚔

Display an embedding model evaluation interface