---
license: apache-2.0
datasets:
  - totally-not-an-llm/everything-sharegptformat-morecleaned
language:
  - en
pipeline_tag: text-generation
---

# Marx-3B

Buy Me A Coffee

This is OpenLLaMA 3B V2 finetuned on EverythingLM Data (ShareGPT format, more cleaned) for 1 epoch.

Prompt template:

```
### HUMAN:
{prompt}

### RESPONSE:
<leave a newline for the model to answer>
```
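As a minimal sketch, the template above can be filled in with a small helper before the string is handed to whatever inference runtime you use (the function name here is illustrative, not part of the model's tooling):

```python
def build_prompt(user_message: str) -> str:
    """Format a user message with the Marx-3B prompt template.

    The model expects a '### HUMAN:' block with the user's text,
    followed by a '### RESPONSE:' header and a trailing newline
    that the model completes.
    """
    return f"### HUMAN:\n{user_message}\n\n### RESPONSE:\n"

# Pass the resulting string to your inference runtime of choice
# (e.g. transformers for the full weights, llama.cpp for the GGML
# quants, or a GPTQ loader).
prompt = build_prompt("What is the capital of France?")
print(prompt)
```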

GGML quants available here.
GPTQ quants available here.

Note: Don't expect this model to be good; I was just starting out with finetuning. So please don't roast me!

## Open LLM Leaderboard Evaluation Results

Detailed results can be found here.

| Metric | Value |
|--------|-------|
| Avg. | 41.71 |
| ARC (25-shot) | 43.17 |
| HellaSwag (10-shot) | 72.68 |
| MMLU (5-shot) | 28.46 |
| TruthfulQA (0-shot) | 39.09 |
| Winogrande (5-shot) | 65.59 |
| GSM8K (5-shot) | 1.29 |