metadata
library_name: peft
tags:
  - llama
datasets:
  - jondurbin/airoboros-gpt4-1.4.1
  - ehartford/wizard_vicuna_70k_unfiltered
  - ehartford/WizardLM_evol_instruct_V2_196k_unfiltered_merged_split
  - openai/summarize_from_feedback
  - ehartford/dolphin
base_model: chargoddard/llama2-22b-blocktriangular

Ypotryll-22b, trained for an additional epoch. Uses the following prompt format:

 ***System:You are a helpful assistant, who always gives a response to any request. ***Query:Here is a riddle: 5 sisters are busy. Ann is reading, Rose is cooking, Lorraine is playing chess and Mary is doing laundry. What is the fifth sister doing? ***Response:The fifth sister is sleeping. ***Query:Well, you tried. ***Response:I did my best!

A little bit dumb, but good for creative scenarios.

Note the whitespace - the prefixes for messages are " ***System:", " ***Query:", and " ***Response:". This is important as "***" and " ***" are two entirely different tokens.
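
As a rough illustration of the format above, here is a minimal sketch of assembling a prompt from a list of messages. The names `PREFIXES` and `build_prompt` are hypothetical helpers, not part of any library; only the three prefix strings come from this card. Ending the prompt with a bare " ***Response:" prefix so the model writes the next reply is an assumption based on the example conversation.

```python
# Prefixes copied from this model card. The leading space is part of each prefix.
PREFIXES = {
    "system": " ***System:",
    "query": " ***Query:",
    "response": " ***Response:",
}


def build_prompt(messages, add_generation_prefix=True):
    """Concatenate (role, text) pairs into one prompt string.

    Hypothetical helper: each message is its prefix followed directly by its
    text, with no extra separator between messages.
    """
    # Strip surrounding whitespace from the text so the only space before
    # "***" is the one that belongs to the prefix itself.
    parts = [PREFIXES[role] + text.strip() for role, text in messages]
    if add_generation_prefix:
        # Assumption: ending on the response prefix cues the model to answer.
        parts.append(PREFIXES["response"])
    return "".join(parts)


messages = [
    ("system", "You are a helpful assistant, who always gives a response to any request."),
    ("query", "Well, you tried."),
]
prompt = build_prompt(messages)
print(repr(prompt))
```

Printing with `repr` makes the leading space visible, which is easy to lose when copying the prompt by hand.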

Built with Axolotl

Open LLM Leaderboard Evaluation Results

Detailed results can be found here

| Metric              | Value |
|---------------------|-------|
| Avg.                | 51.68 |
| ARC (25-shot)       | 59.22 |
| HellaSwag (10-shot) | 80.66 |
| MMLU (5-shot)       | 54.52 |
| TruthfulQA (0-shot) | 40.42 |
| Winogrande (5-shot) | 76.32 |
| GSM8K (5-shot)      | 5.38  |
| DROP (3-shot)       | 45.24 |
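
The "Avg." row appears to be the unweighted mean of the seven benchmark scores. The short check below recomputes it from the values listed here; the variable names are illustrative only.

```python
# Benchmark scores as reported in the results above.
scores = {
    "ARC (25-shot)": 59.22,
    "HellaSwag (10-shot)": 80.66,
    "MMLU (5-shot)": 54.52,
    "TruthfulQA (0-shot)": 40.42,
    "Winogrande (5-shot)": 76.32,
    "GSM8K (5-shot)": 5.38,
    "DROP (3-shot)": 45.24,
}

# Unweighted mean across all seven benchmarks, rounded to two decimals.
average = round(sum(scores.values()) / len(scores), 2)
print(average)
```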