---
license: other
---

Airoboros 33b GPT4 1.2 merged with kaiokendev's 33b SuperHOT 8k LoRA, without quantization (full FP16 model).
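
As a rough illustration of how such a merge is typically produced, the sketch below folds a LoRA adapter into a base model loaded in FP16 using `transformers` and `peft`. The repository IDs and output path are assumptions for illustration only, not the exact sources used for this model.

```python
# Illustrative sketch of merging a LoRA into an FP16 base model (repo IDs are placeholders).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "jondurbin/airoboros-33b-gpt4-1.2"         # assumed base model repo ID
lora_id = "kaiokendev/superhot-30b-8k-no-rlhf-test"  # assumed SuperHOT 8k LoRA repo ID

# Load the base model at full FP16 precision (no quantization).
base = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)

# Attach the LoRA adapter and fold its weights into the base model.
merged = PeftModel.from_pretrained(base, lora_id).merge_and_unload()

# Save the merged full-precision checkpoint alongside the tokenizer.
merged.save_pretrained("airoboros-33b-superhot-8k-fp16")
AutoTokenizer.from_pretrained(base_id).save_pretrained("airoboros-33b-superhot-8k-fp16")
```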

## Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 25.35 |
| ARC (25-shot) | 24.66 |
| HellaSwag (10-shot) | 31.23 |
| MMLU (5-shot) | 23.13 |
| TruthfulQA (0-shot) | 47.44 |
| Winogrande (5-shot) | 50.43 |
| GSM8K (5-shot) | 0.0 |
| DROP (3-shot) | 0.59 |