An automatic evaluation tool for LLMs.
LMArena
community
Organization Card
LMArena is an open platform for crowdsourced AI benchmarking, originally created by researchers from UC Berkeley SkyLab.
We have officially graduated from LMSYS.org!
Chat for free with the best AI models at lmarena.ai, and see the rankings at lmarena.ai/leaderboard.
Collections
2
spaces
7
Running • 4.31k • Chatbot Arena Leaderboard 🏆 • Display chatbot leaderboard and statistics
Running • Arena Hard Viewer ⚡ • Browse and evaluate model judgments from benchmarks
Running • 27 • Llama-4-Maverick-03-26-Experimental Battles 🔥 • Browse and compare model conversation outcomes
Running • 9 • Category Arena Example 📚 • Browse chatbot responses to compare models
Running • 6 • Preference Proxy Evaluations 🦀 • Preference Proxy Evaluations
Running • 189 • Chatbot Arena 💬 • Initiate conversations with multiple chatbots
models
20
lmarena-ai/p2l-7b-grk-01112025 • 20 • 3
lmarena-ai/p2l-7b-grk-02222025 • 316 • 6
lmarena-ai/p2l-0.5b-bt-01132025 • 11
lmarena-ai/p2l-1.5b-bt-01132025 • 8
lmarena-ai/p2l-3b-bt-01132025 • 8
lmarena-ai/p2l-7b-bt-01132025 • 139 • 2
lmarena-ai/p2l-135m-bt-01132025 • 14
lmarena-ai/p2l-360m-bt-01132025 • 8
lmarena-ai/p2l-135m-rk-01132025 • 5
lmarena-ai/p2l-360m-rk-01132025 • 10
datasets
20
lmarena-ai/arena-hard-auto • 123
lmarena-ai/search-arena-v1-7k • Viewer • 7k • 576 • 12
lmarena-ai/webdev-arena-preference-10k • Viewer • 10.5k • 185 • 5
lmarena-ai/repochat-arena-preference-4k • Viewer • 3.84k • 71 • 3
lmarena-ai/arena-human-preference-100k • Viewer • 106k • 740 • 38
lmarena-ai/VisionArena-Chat • Viewer • 199k • 2.45k • 2
lmarena-ai/VisionArena-Battle • Viewer • 29.8k • 149 • 5
lmarena-ai/categories-benchmark-eval • Preview • 17 • 3
lmarena-ai/vision-arena-bench-v0.1 • Viewer • 500 • 1.64k • 1
lmarena-ai/Llama-3-70b-battles • Viewer • 1.6k • 63 • 3