Adding Evaluation Results (#1)
c730545

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 51.13 |
| ARC (25-shot) | 60.58 |
| HellaSwag (10-shot) | 82.56 |
| MMLU (5-shot) | 58.25 |
| TruthfulQA (0-shot) | 54.77 |
| Winogrande (5-shot) | 74.9 |
| GSM8K (5-shot) | 0.91 |
| DROP (3-shot) | 25.96 |
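
The reported Avg. is consistent with an unweighted mean of the seven benchmark scores listed above. The sketch below checks that arithmetic. It assumes equal weighting, which the source does not state explicitly, and the names `scores` and `mean_score` are illustrative only.

```python
# Benchmark scores copied from the results listed above.
scores = {
    "ARC (25-shot)": 60.58,
    "HellaSwag (10-shot)": 82.56,
    "MMLU (5-shot)": 58.25,
    "TruthfulQA (0-shot)": 54.77,
    "Winogrande (5-shot)": 74.9,
    "GSM8K (5-shot)": 0.91,
    "DROP (3-shot)": 25.96,
}


def mean_score(values):
    """Return the unweighted arithmetic mean (assumed aggregation)."""
    return sum(values) / len(values)


# Round to two decimals to match the precision of the reported value.
average = round(mean_score(list(scores.values())), 2)
print(f"Avg. {average}")  # prints "Avg. 51.13"
```

The near-zero GSM8K score lowers the mean noticeably: averaging the other six benchmarks alone would give a higher figure, so the headline Avg. is best read alongside the per-benchmark values.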