
WestSeverus - 7B - DPO - v2


☘️ Model Description

WestSeverus-7B-DPO-v2 is a WestLake-family model trained on top of WestSeverus-7B.

The model was trained on several DPO datasets and performs well on basic math problems.

WestSeverus-7B-DPO-v2 can be used for mathematics, chemistry, physics, and even coding, for further research and reference.

📖 Table of Contents

  1. Nous Benchmark Results

    • AGIEval
    • GPT4All
    • TruthfulQA Scores
    • BigBench
  2. Open LLM Leaderboard

    • ARC
    • HellaSwag
    • MMLU
    • TruthfulQA
    • Winogrande
    • GSM8K
  3. EvalPlus Leaderboard

    • HumanEval
    • HumanEval_Plus
    • MBPP
    • MBPP_Plus
  4. Prompt Format

  5. Quantized Models

  6. Gratitude

🪄 Nous Benchmark Results

WestSeverus-7B-DPO-v2 is currently at the top of YALL - Yet Another LLM Leaderboard, created by CultriX, with the highest TruthfulQA and BigBench scores among the models listed below.

| Model | Average | AGIEval | GPT4All | TruthfulQA | BigBench |
|---|---|---|---|---|---|
| WestSeverus-7B-DPO-v2 | 60.98 | 45.29 | 77.2 | 72.72 | 48.71 |
| CultriX/Wernicke-7B-v1 | 60.73 | 45.59 | 77.36 | 71.46 | 48.49 |
| mlabonne/NeuralBeagle14-7B | 60.25 | 46.06 | 76.77 | 70.32 | 47.86 |
| CultriX/MistralTrix-v1 | 60.05 | 44.98 | 76.62 | 71.44 | 47.17 |
| senseable/WestLake-7B-v2 | 59.42 | 44.27 | 77.86 | 67.46 | 48.09 |
| mlabonne/Daredevil-7B | 58.22 | 44.85 | 76.07 | 64.89 | 47.07 |
| microsoft/phi-2 | 44.61 | 27.96 | 70.84 | 44.46 | 35.17 |
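
The Average column appears to be the unweighted mean of the four benchmark scores. This is an assumption inferred from the numbers, not something the leaderboard states here. A minimal Python sketch that checks it against the WestSeverus-7B-DPO-v2 row (the helper name `nous_average` is hypothetical):

```python
def nous_average(agieval, gpt4all, truthfulqa, bigbench):
    """Unweighted mean of the four Nous benchmark scores, rounded to 2 decimals.

    Assumption: the leaderboard's Average is a plain arithmetic mean.
    """
    return round((agieval + gpt4all + truthfulqa + bigbench) / 4, 2)


# Scores taken from the WestSeverus-7B-DPO-v2 row of the table above.
score = nous_average(45.29, 77.2, 72.72, 48.71)
print(score)
```

The result can be compared with the Average value reported in the first row.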

🏆 Open LLM Leaderboard

WestSeverus-7B-DPO-v2 is one of the top 7B models on the Open LLM Leaderboard, with particularly strong TruthfulQA and GSM8K scores.

| Metric | Value |
|---|---|
| Avg. | 75.29 |
| AI2 Reasoning Challenge (25-shot) | 71.42 |
| HellaSwag (10-shot) | 88.27 |
| MMLU (5-shot) | 64.79 |
| TruthfulQA (0-shot) | 72.37 |
| Winogrande (5-shot) | 83.27 |
| GSM8K (5-shot) | 71.65 |

Detailed results can be found here

⚑ EvalPlus Leaderboard

| Model | HumanEval | HumanEval_Plus | MBPP | MBPP_Plus |
|---|---|---|---|---|
| phi-2-2.7B | 48.2 | 43.3 | 61.9 | 51.4 |
| WestSeverus-7B-DPO-v2 | 43.3 | 34.1 | TBD | TBD |
| SOLAR-10.7B-Instruct-v1.0 | 42.1 | 34.3 | 42.9 | 34.6 |
| CodeLlama-7B | 37.8 | 34.1 | 57.6 | 45.4 |


⚗️ Prompt Format

WestSeverus-7B-DPO-v2 was trained using the ChatML prompt template with a system prompt. An example follows:

<|im_start|>system
{system_message}<|im_end|>
<|im_start|>user
{prompt}<|im_end|>
<|im_start|>assistant
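
The template above can be filled in with plain string formatting. The following is a minimal Python sketch, not an official helper: the function name `build_chatml_prompt` and the example messages are hypothetical, and the trailing newline after `assistant` is an assumption based on common ChatML usage.

```python
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(system_message: str, prompt: str) -> str:
    """Fill the ChatML template with a system message and a user prompt.

    The returned string ends with the opening of the assistant turn,
    so the model's completion continues from there.
    """
    return (
        f"{IM_START}system\n{system_message}{IM_END}\n"
        f"{IM_START}user\n{prompt}{IM_END}\n"
        f"{IM_START}assistant\n"  # assumption: newline after the role name
    )


# Hypothetical example messages, for illustration only.
text = build_chatml_prompt("You are a helpful assistant.", "What is 12 * 7?")
print(text)
```

If the tokenizer distributed with the model defines a chat template, using that template is likely a safer choice than hand-built strings, since it stays in sync with the special tokens the model expects.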

🛠️ Quantized Models

A quantized (GGUF) version of the model is available:

MaziyarPanahi/WestSeverus-7B-DPO-v2-GGUF

🙏 Gratitude

  • Thanks to @senseable for senseable/WestLake-7B-v2.
  • Thanks to @jondurbin for the jondurbin/truthy-dpo-v0.1 dataset.
  • Thanks to @Charles Goddard for MergeKit.
  • Thanks to @TheBloke, @s3nh, @MaziyarPanahi for the quantized models.
  • Thanks to @mlabonne, @CultriX for YALL - Yet Another LLM Leaderboard.
  • Thank you to everyone else in the open-source AI community who has used this model for further research and improvement.