RAFT: A Real-World Few-Shot Text Classification Benchmark Paper • 2109.14076 • Published Sep 28, 2021
GEMv2: Multilingual NLG Benchmarking in a Single Line of Code Paper • 2206.11249 • Published Jun 22, 2022
AfroDigits: A Community-Driven Spoken Digit Dataset for African Languages Paper • 2303.12582 • Published Mar 22, 2023
Evaluate & Evaluation on the Hub: Better Best Practices for Data and Model Measurements Paper • 2210.01970 • Published Sep 30, 2022