Adding Evaluation Results (#1)
ab0984a

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

Metric                Value
Avg.                  53.39
ARC (25-shot)         60.49
HellaSwag (10-shot)   84.03
MMLU (5-shot)         57.83
TruthfulQA (0-shot)   54.52
Winogrande (5-shot)   75.77
GSM8K (5-shot)         2.96
DROP (3-shot)         38.12
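
The Avg. row appears to be the unweighted mean of the seven benchmark scores listed above. The short sketch below checks that arithmetic; the variable names are illustrative, and the equal weighting is an assumption inferred from the numbers rather than something stated in this section.

```python
# Benchmark scores copied from the table above.
scores = {
    "ARC (25-shot)": 60.49,
    "HellaSwag (10-shot)": 84.03,
    "MMLU (5-shot)": 57.83,
    "TruthfulQA (0-shot)": 54.52,
    "Winogrande (5-shot)": 75.77,
    "GSM8K (5-shot)": 2.96,
    "DROP (3-shot)": 38.12,
}

# Assumption: every benchmark carries equal weight in the average.
average = sum(scores.values()) / len(scores)

print(f"Avg. {average:.2f}")  # prints "Avg. 53.39"
```

Because the mean is unweighted, the low GSM8K score (2.96) pulls the average down noticeably compared with the other six benchmarks.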