metadata
license: apache-2.0
datasets:
  - WizardLM/WizardLM_evol_instruct_V2_196k
language:
  - en
library_name: transformers

Trained for 1 epoch on the WizardLM/WizardLM_evol_instruct_V2_196k dataset.

Prompt template:

### HUMAN:
{prompt}

### RESPONSE:
<leave a newline for the model to answer>
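The template above can be filled in programmatically. The following is a minimal sketch, assuming the template ends with exactly one newline after `### RESPONSE:`; the names `PROMPT_TEMPLATE` and `build_prompt` are illustrative and not part of the model or any library.

```python
# Prompt template from this card. The trailing "\n" is the newline
# left for the model to begin its answer.
PROMPT_TEMPLATE = "### HUMAN:\n{prompt}\n\n### RESPONSE:\n"


def build_prompt(user_message: str) -> str:
    """Wrap a user message in the HUMAN/RESPONSE template (hypothetical helper)."""
    # Strip surrounding whitespace so the blank line before
    # "### RESPONSE:" stays consistent with the template.
    return PROMPT_TEMPLATE.format(prompt=user_message.strip())


if __name__ == "__main__":
    print(build_prompt("Explain what an epoch is in one sentence."))
```

The returned string is what would be tokenized and passed to the model for generation; the model's reply is expected to follow directly after the final newline.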

Built with Axolotl

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 36.33 |
| ARC (25-shot)       | 41.81 |
| HellaSwag (10-shot) | 73.01 |
| MMLU (5-shot)       | 26.36 |
| TruthfulQA (0-shot) | 38.99 |
| Winogrande (5-shot) | 66.69 |
| GSM8K (5-shot)      | 1.9   |
| DROP (3-shot)       | 5.57  |
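
The Avg. figure is consistent with an unweighted mean of the seven benchmark scores listed above. A quick check, assuming that definition (the card itself does not state how the average is computed):

```python
# Benchmark scores copied from the results listed in this card.
scores = {
    "ARC (25-shot)": 41.81,
    "HellaSwag (10-shot)": 73.01,
    "MMLU (5-shot)": 26.36,
    "TruthfulQA (0-shot)": 38.99,
    "Winogrande (5-shot)": 66.69,
    "GSM8K (5-shot)": 1.9,
    "DROP (3-shot)": 5.57,
}

# Assumption: Avg. is the plain arithmetic mean, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```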