
## Model Description

This is an experiment to test merging 14 models using DARE TIES 🦙

We first merged 14 models using DARE TIES to produce EmbeddedLLM/Mistral-7B-Merge-14-v0.3, and then merged that intermediate model with Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp using Gradient SLERP. The resulting model performs quite well but may require further instruction fine-tuning.
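For illustration only, a first-stage DARE TIES merge in mergekit might look like the sketch below. The donor models, densities, and weights are placeholders, not the actual configuration used to build EmbeddedLLM/Mistral-7B-Merge-14-v0.3.

```yaml
# Hypothetical first-stage DARE TIES merge (illustrative only; the real
# 14-model list and parameters are not reproduced in this card).
models:
  - model: mistralai/Mistral-7B-v0.1
    # base model: no parameters needed
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      density: 0.5   # keep ~50% of the delta weights (DARE drop rate 0.5)
      weight: 0.2
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      density: 0.5
      weight: 0.2
  # ... remaining donor models ...
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```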

## Open LLM Leaderboard

| Metric | Value |
|---|---|
| Average | 71.19 |
| ARC | 66.81 |
| HellaSwag | 86.15 |
| MMLU | 65.10 |
| TruthfulQA | 58.25 |
| Winogrande | 80.03 |
| GSM8K | 70.81 |

## Chat Template

The model works with either the ChatML or the Llama-2 chat template.
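For reference, ChatML lays out a conversation as follows (this is the standard ChatML format, not something specific to this model; format prompts however your inference stack expects):

```
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
Hello!<|im_end|>
<|im_start|>assistant
```

The Llama-2 template instead wraps each user turn in `[INST] ... [/INST]` tags, with an optional `<<SYS>> ... <</SYS>>` system block inside the first instruction.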

## Merge Configuration

The merge configuration for this model is given below:

```yaml
slices:
  - sources:
      - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp
        layer_range: [0, 32]
      - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.3
        layer_range: [0, 32]

merge_method: slerp
base_model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-openchat-3.5-1210-Slerp

parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5 # fallback for rest of tensors
tokenizer_source: base
embed_slerp: true

dtype: bfloat16
```
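The merge can be reproduced with mergekit by saving the configuration above to a file and running something like `mergekit-yaml config.yml ./output-model --cuda`; exact flags depend on your mergekit version and hardware.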