Evaluation datasets

Models

None public yet

Datasets (62)

lighteval/aimo_progress_prize_1

Viewer • Updated Apr 10 • 2

lighteval/mt-bench

Viewer • Updated Mar 19 • 15 • 1

lighteval/bbh

Updated Jan 31 • 8.08k • 1

lighteval/big_bench_hard

Viewer • Updated Oct 17, 2023 • 486 • 2

lighteval/MATH

Viewer • Updated Oct 17, 2023 • 18k • 23

lighteval/natural_questions_clean

Viewer • Updated Oct 17, 2023 • 173

lighteval/agi_eval_en

Updated Oct 17, 2023 • 6 • 1

lighteval/siqa

Viewer • Updated Oct 7, 2023 • 44k • 3

lighteval/trivia_qa

Viewer • Updated Oct 7, 2023 • 3

lighteval/mutual_harness

Viewer • Updated Aug 9, 2023 • 514 • 1