Open LLM Leaderboard
Track, rank and evaluate open LLMs and chatbots
Select and filter benchmarks for text embedding tasks
Submit code models for evaluation on benchmarks
Display UGI leaderboard data in an interactive grid
Request evaluation results for a speech model
VLMEvalKit Evaluation Results Collection
Explore and filter language model benchmark results
Explore hardware performance for language models
Generate images from text descriptions
Browse and submit LLM evaluations
View LLM Performance Leaderboard
Submit and evaluate models on a leaderboard
Run a Streamlit web app
Explore and analyze code evaluation data
Upload and evaluate video models
Track, rank and evaluate open LLMs in Portuguese
Request model evaluation on COCO val 2017 dataset
View and submit LLM evaluations
Track, rank and evaluate open Arabic LLMs and chatbots
Display OCRBench leaderboard for model evaluations