Stheno-1.3-L2-13B / README.md
metadata
license: llama2
language:
  - en

A gradient merge of Stheno-P1 and Stheno-P2, made with BlockMerge_Gradient. The merge script was modified by @Vali so that the tensor calculations use SLERP instead of the script's original calculation.

So far it's pretty good in personal tests.
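
For readers unfamiliar with the technique, here is a minimal sketch of SLERP (spherical linear interpolation) applied to two flattened weight vectors, plus a blend ratio that changes from layer to layer. This is not the modified BlockMerge_Gradient script itself: the function names `slerp` and `gradient_ratios`, the linear schedule, and its `start`/`end` values are assumptions made purely for illustration.

```python
import math


def slerp(t, v0, v1, eps=1e-8):
    """Spherically interpolate between two equal-length lists of floats.

    t=0 returns v0, t=1 returns v1. Falls back to plain linear
    interpolation when the angle between the vectors is undefined
    or too small to divide by safely.
    """
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    if norm0 < eps or norm1 < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    # Cosine of the angle between the vectors, clamped for acos.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    omega = math.acos(dot)
    sin_omega = math.sin(omega)
    if abs(sin_omega) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]

    w0 = math.sin((1 - t) * omega) / sin_omega
    w1 = math.sin(t * omega) / sin_omega
    return [w0 * a + w1 * b for a, b in zip(v0, v1)]


def gradient_ratios(num_layers, start=0.0, end=1.0):
    """Hypothetical linear schedule: one blend ratio per layer."""
    if num_layers == 1:
        return [start]
    step = (end - start) / (num_layers - 1)
    return [start + i * step for i in range(num_layers)]


# Toy example: three "layers", each a 2-element weight vector.
model_a = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
model_b = [[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
ratios = gradient_ratios(len(model_a))
merged = [slerp(t, a, b) for t, a, b in zip(ratios, model_a, model_b)]
```

In this toy setup the first layer is taken entirely from `model_a`, the last entirely from `model_b`, and the middle layer sits halfway along the arc between them. A real merge would apply the same idea to each pair of matching tensors in the two checkpoints.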

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 49.32 |
| ARC (25-shot) | 56.83 |
| HellaSwag (10-shot) | 81.7 |
| MMLU (5-shot) | 52.79 |
| TruthfulQA (0-shot) | 50.23 |
| Winogrande (5-shot) | 71.11 |
| GSM8K (5-shot) | 0.23 |
| DROP (3-shot) | 32.34 |
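
As a quick sanity check, the Avg. figure appears to be the unweighted mean of the seven benchmark scores listed. The sketch below recomputes it from the values in the table; the `scores` dictionary and variable names are illustrative only.

```python
# Benchmark scores copied from the results table.
scores = {
    "ARC (25-shot)": 56.83,
    "HellaSwag (10-shot)": 81.7,
    "MMLU (5-shot)": 52.79,
    "TruthfulQA (0-shot)": 50.23,
    "Winogrande (5-shot)": 71.11,
    "GSM8K (5-shot)": 0.23,
    "DROP (3-shot)": 32.34,
}

# Unweighted mean across all seven benchmarks.
average = sum(scores.values()) / len(scores)
print(round(average, 2))  # 49.32
```

The near-zero GSM8K score pulls the mean down noticeably, so the per-benchmark rows say more about the model than the average does.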