
## Model Description

This is an experiment comparing two methods for merging the same pair of models: DARE TIES versus SLERP 🦙

We are mainly interested in comparing against Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp.

The two models involved in the merge are as follows:

  1. teknium/OpenHermes-2.5-Mistral-7B
  2. Intel/neural-chat-7b-v3-3
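
For intuition, here is a toy NumPy sketch of the two merge methods on flat parameter vectors. This is a simplified illustration of the ideas, not mergekit's actual implementation, and the `dare_ties` and `slerp` functions are our own: DARE randomly drops a `(1 - density)` fraction of each task vector and rescales the survivors, TIES keeps only deltas that agree with the per-parameter majority sign, and SLERP interpolates along the arc between two weight vectors.

```python
import numpy as np

def dare_ties(base, finetuned, weights, density, seed=0):
    """Toy DARE TIES merge of flat parameter vectors (illustrative only)."""
    rng = np.random.default_rng(seed)
    deltas = []
    for w, ft in zip(weights, finetuned):
        delta = ft - base                          # task vector vs. the base model
        keep = rng.random(delta.shape) < density   # DARE: drop (1 - density) of entries...
        deltas.append(w * np.where(keep, delta / density, 0.0))  # ...rescale survivors
    elected = np.sign(sum(deltas))                 # TIES: per-parameter majority sign
    merged = sum(np.where(np.sign(d) == elected, d, 0.0) for d in deltas)
    return base + merged

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two flat parameter vectors."""
    u0, u1 = v0 / np.linalg.norm(v0), v1 / np.linalg.norm(v1)
    omega = np.arccos(np.clip(np.dot(u0, u1), -1.0, 1.0))  # angle between the models
    if np.sin(omega) < eps:                        # nearly parallel: plain lerp is fine
        return (1 - t) * v0 + t * v1
    return (np.sin((1 - t) * omega) * v0 + np.sin(t * omega) * v1) / np.sin(omega)
```

With the settings from the config below, the DARE TIES call would look like `dare_ties(base, [hermes, neural_chat], weights=[0.5, 0.5], density=0.5)`.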

The YAML config file for the merge is:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
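
To reproduce the merge, this config can be run with mergekit. Below is a minimal sketch using mergekit's Python API, assuming the config above is saved as `config.yml`; the output path is a hypothetical choice.

```python
import torch
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML config shown above (assumed saved as ./config.yml).
with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

# Run the merge and write the result to ./merged (hypothetical output path).
run_merge(
    merge_config,
    out_path="./merged",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # use a GPU if one is available
        copy_tokenizer=True,             # copy the tokenizer into the output
    ),
)
```

The `mergekit-yaml` CLI accepts the same config file as an alternative to the Python API.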

## Open LLM Leaderboard

Note that with more tuning of the merge parameters (e.g., weight and density), DARE TIES might achieve better results.

| Benchmark  | DARE TIES | SLERP |
|------------|-----------|-------|
| Average    | 70.69     | 71.38 |
| ARC        | 67.49     | 68.09 |
| HellaSwag  | 85.78     | 86.2  |
| MMLU       | 64.1      | 64.26 |
| TruthfulQA | 60.52     | 62.78 |
| Winogrande | 79.01     | 79.16 |
| GSM8K      | 67.25     | 67.78 |
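
Since the merge is produced in bfloat16 (see `dtype: bfloat16` in the config), the model can be loaded directly in that precision with transformers. A minimal usage sketch; the repo id below is a placeholder for this model's actual Hugging Face repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "your-org/dare-ties-merge"  # placeholder: substitute this model's repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the merge dtype
    device_map="auto",           # requires the accelerate package
)

inputs = tokenizer("What does DARE TIES merging do?", return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```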