---
datasets:
  - Fredithefish/openassistant-guanaco-unfiltered
language:
  - en
library_name: transformers
inference: false
---

I have no idea what I'm doing

Anyway, I fine-tuned the Llama 2 7B base model (HF format) on the Guanaco Unfiltered dataset

It's probably horrible

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric | Value |
|---|---|
| Avg. | 44.06 |
| ARC (25-shot) | 52.22 |
| HellaSwag (10-shot) | 79.08 |
| MMLU (5-shot) | 46.63 |
| TruthfulQA (0-shot) | 42.97 |
| Winogrande (5-shot) | 74.51 |
| GSM8K (5-shot) | 7.28 |
| DROP (3-shot) | 5.75 |
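
The Avg. row looks like the plain arithmetic mean of the seven benchmark scores. Here is a minimal sketch that checks this, assuming equal weighting and rounding to two decimals. The names `scores` and `leaderboard_average` are made up for this example and are not part of any leaderboard tooling.

```python
# Scores copied from the results above.
# Assumption: the average weights every benchmark equally.
scores = {
    "ARC (25-shot)": 52.22,
    "HellaSwag (10-shot)": 79.08,
    "MMLU (5-shot)": 46.63,
    "TruthfulQA (0-shot)": 42.97,
    "Winogrande (5-shot)": 74.51,
    "GSM8K (5-shot)": 7.28,
    "DROP (3-shot)": 5.75,
}


def leaderboard_average(values):
    """Return the arithmetic mean of the values, rounded to two decimals."""
    values = list(values)
    return round(sum(values) / len(values), 2)


average = leaderboard_average(scores.values())
print(average)  # 44.06
```

The result matches the reported 44.06, so the low GSM8K and DROP scores pull the average well below the other five benchmarks.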