
Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 42.74 |
| ARC (25-shot)       | 54.35 |
| HellaSwag (10-shot) | 78.06 |
| MMLU (5-shot)       | 45.35 |
| TruthfulQA (0-shot) | 37.11 |
| Winogrande (5-shot) | 73.4  |
| GSM8K (5-shot)      | 4.62  |
| DROP (3-shot)       | 6.28  |
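
The Avg. row appears to be the unweighted mean of the seven benchmark scores listed with it. That reading is an assumption, though it agrees with the reported figures. A minimal Python sketch to check the arithmetic, where the `scores` dictionary is just an illustrative container for the values above:

```python
# Benchmark scores copied from the results table above.
scores = {
    "ARC (25-shot)": 54.35,
    "HellaSwag (10-shot)": 78.06,
    "MMLU (5-shot)": 45.35,
    "TruthfulQA (0-shot)": 37.11,
    "Winogrande (5-shot)": 73.4,
    "GSM8K (5-shot)": 4.62,
    "DROP (3-shot)": 6.28,
}

# Assumption: "Avg." is the plain (unweighted) mean of all seven scores.
average = sum(scores.values()) / len(scores)

print(f"Avg. = {average:.2f}")
```

Because the mean is unweighted, the low GSM8K and DROP scores pull the average well below the scores on the other five benchmarks.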