Model Description

This is an experiment comparing two ways of merging the same pair of models: DARE TIES versus SLERP 🦙

We are mainly interested in comparing against Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp.

The two models involved in the merge are:

  1. teknium/OpenHermes-2.5-Mistral-7B
  2. Intel/neural-chat-7b-v3-3

The YAML config file for the merge is:

```yaml
models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
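Conceptually, DARE TIES takes each fine-tuned model's delta from the base, randomly drops a fraction of the delta parameters (controlled by `density`) and rescales the survivors, then resolves sign conflicts TIES-style before adding the merged delta back to the base. A minimal NumPy sketch of those steps over flat parameter vectors (illustrative only; the function name and simplifications are ours, and the real merge is performed by mergekit over full model tensors):

```python
import numpy as np

def dare_ties_merge(base, finetuned, weights, density, seed=0):
    """Toy DARE TIES merge over 1-D parameter vectors.

    base:      base-model parameters
    finetuned: list of fine-tuned parameter vectors
    weights:   per-model merge weights (0.5 each in the config above)
    density:   fraction of delta parameters to keep
    """
    rng = np.random.default_rng(seed)
    deltas = []
    for ft, w in zip(finetuned, weights):
        delta = ft - base
        # DARE: randomly drop (1 - density) of the delta, rescale the rest
        mask = rng.random(delta.shape) < density
        deltas.append(w * delta * mask / density)
    stacked = np.stack(deltas)
    # TIES-style sign election: keep only contributions that agree
    # with the majority (summed) sign of each parameter
    elected = np.sign(stacked.sum(axis=0))
    agree = np.sign(stacked) == elected
    merged_delta = np.where(agree, stacked, 0.0).sum(axis=0)
    return base + merged_delta
```

With `density=1.0` and a single model at weight 1.0, this reduces to plain delta addition, i.e. it recovers the fine-tuned model.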

Open LLM Leaderboard

Note that with more tuning DARE TIES might achieve better results.

|            | DARE TIES | SLERP |
|------------|-----------|-------|
| Average    | 70.69     | 71.38 |
| ARC        | 67.49     | 68.09 |
| HellaSwag  | 85.78     | 86.2  |
| MMLU       | 64.1      | 64.26 |
| TruthfulQA | 60.52     | 62.78 |
| Winogrande | 79.01     | 79.16 |
| GSM8K      | 67.25     | 67.78 |
Model tree for EmbeddedLLM/Mistral-7B-Merge-02-v0