
Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 53.39 |
| ARC (25-shot)       | 60.49 |
| HellaSwag (10-shot) | 84.03 |
| MMLU (5-shot)       | 57.83 |
| TruthfulQA (0-shot) | 54.52 |
| Winogrande (5-shot) | 75.77 |
| GSM8K (5-shot)      | 2.96  |
| DROP (3-shot)       | 38.12 |
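
The Avg. figure is consistent with an unweighted mean of the seven benchmark scores listed. The source does not state how the average is defined, so the following is only a minimal sketch under that assumption; the names `scores` and `average` are illustrative and not part of any leaderboard tooling.

```python
# Benchmark scores copied from the results above.
scores = {
    "ARC (25-shot)": 60.49,
    "HellaSwag (10-shot)": 84.03,
    "MMLU (5-shot)": 57.83,
    "TruthfulQA (0-shot)": 54.52,
    "Winogrande (5-shot)": 75.77,
    "GSM8K (5-shot)": 2.96,
    "DROP (3-shot)": 38.12,
}

# Assumption: "Avg." is the plain arithmetic mean over all seven benchmarks.
average = sum(scores.values()) / len(scores)

print(f"{average:.2f}")
```

Because the mean is unweighted, the very low GSM8K score pulls the average well below the scores on most of the other benchmarks.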