---
tags:
- merge
- mergekit
- jondurbin/bagel-dpo-34b-v0.2
- abacusai/MetaMath-Bagel-DPO-34B
base_model:
- jondurbin/bagel-dpo-34b-v0.2
- abacusai/MetaMath-Bagel-DPO-34B
license: apache-2.0
language:
- en
library_name: transformers
pipeline_tag: text-generation
model-index:
- name: Pearl-34B-ties
  results:
  - task:
      type: text-generation
    metrics:
    - name: Average
      type: Average
      value: 75.48
    - name: ARC
      type: ARC
      value: 70.99
    - name: GSM8K
      type: GSM8K
      value: 67.48
    - name: Winogrande
      type: Winogrande
      value: 82.64
    - name: TruthfulQA
      type: TruthfulQA
      value: 70.32
    - name: HellaSwag
      type: HellaSwag
      value: 84.83
    - name: MMLU
      type: MMLU
      value: 76.63
    source:
      name: Open LLM Leaderboard
      url: https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
---
<center><img src='https://i.imgur.com/0xFTuAX.png' width='450px'></center>

# Pearl-34B-ties, an extraordinary 34B model

**03-22-2024 - As of this date, louisbrulenaudet/Pearl-34B-ties is the "Best 🤝 base merges and moerges model of around 30B" on the Open LLM Leaderboard.**

Pearl-34B-ties is a TIES merge of the following models, using [abacusai/Smaug-34B-v0.1](https://huggingface.co/abacusai/Smaug-34B-v0.1) as the base model:
* [jondurbin/bagel-dpo-34b-v0.2](https://huggingface.co/jondurbin/bagel-dpo-34b-v0.2)
* [abacusai/MetaMath-Bagel-DPO-34B](https://huggingface.co/abacusai/MetaMath-Bagel-DPO-34B)

## Evaluation

The evaluation was performed using the HuggingFace Open LLM Leaderboard.

| Model                                            | Average | ARC   | HellaSwag | MMLU  | TruthfulQA | Winogrande | GSM8K | #Params (B) |
|--------------------------------------------------|---------|-------|-----------|-------|------------|------------|-------|--------------|
| **louisbrulenaudet/Pearl-34B-ties**             | **75.48** | 70.99 | 84.83 | **76.63** | 70.32 | 82.64 | 67.48 | 34.39 |
| **louisbrulenaudet/Pearl-7B-0211-ties**         | **75.11** | **71.42** | **88.86** | 63.91 | **71.46** | **84.37** | 70.66 | 7.24 |
| NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO     | 73.35   | 71.08 | 87.29 | 72.17 | 54.83 | 83.11 | 71.65 | 46.7 |
| argilla/notus-8x7b-experiment                   | 73.18   | 70.99 | 87.73 | 71.33 | 65.79 | 81.61 | 61.64 | 46.7 |
| **louisbrulenaudet/Pearl-7B-slerp**             | 72.75   | 68.00 | 87.16 | 64.04 | 62.35 | 81.29 | **73.62** | 7.24 |
| mistralai/Mixtral-8x7B-Instruct-v0.1            | 72.7    | 70.14 | 87.55 | 71.4  | 64.98 | 81.06 | 61.11 | 46.7 |
| microsoft/Orca-2-13b                            | 61.98   | 60.92 | 79.85 | 60.3  | 56.42 | 76.56 | 37.83 | 13 |
| microsoft/phi-2                                 | 61.33   | 61.09 | 75.11 | 58.11 | 44.47 | 74.35 | 54.81 | 2.78 |
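
For a rough local reproduction of these scores, the same benchmarks can be run with EleutherAI's lm-evaluation-harness. The following is a sketch only: the leaderboard pins specific harness versions and per-task few-shot settings, so local numbers may differ.

```python
!pip install -qU lm-eval

# Leaderboard-style task suite; few-shot counts are left at the harness
# defaults here, whereas the leaderboard fixes specific values per task.
!lm_eval --model hf --model_args pretrained=louisbrulenaudet/Pearl-34B-ties,dtype=bfloat16 --tasks arc_challenge,hellaswag,mmlu,truthfulqa_mc2,winogrande,gsm8k --batch_size auto
```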

### TIES-Merging

TIES-Merging is a method for efficiently merging multiple task-specific models into a single multitask model. It addresses two primary sources of interference that arise when merging model weights: redundant parameter changes and sign conflicts between models.

The first challenge is redundancy in model parameters. TIES-Merging handles it by focusing on the changes made during fine-tuning, selectively retaining the top-k% most significant changes per task and resetting the rest to zero.

The second challenge is conflicts arising from disagreements between parameter signs across different models. TIES-Merging resolves these by electing a unified sign vector that represents the dominant direction of change across all models.

The TIES-Merging process therefore consists of three steps (a minimal sketch in PyTorch follows the list):

- Trim: Reduces redundancy in task-specific models by retaining a fraction of the most significant parameters (density parameter) and resetting the remaining parameters to zero.
- Elect Sign: Resolves sign conflicts across different models by creating a unified sign vector based on the most dominant direction (positive or negative) in terms of cumulative magnitude.
- Disjoint Merge: Averages parameter values aligned with the unified sign vector, excluding zero values.
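
To make these steps concrete, here is a minimal, self-contained PyTorch sketch of TIES-Merging over state dicts. It is a didactic simplification, not mergekit's implementation; the `density` values in the configuration below play the role of the trim fraction here.

```python
import torch

def ties_merge(base, finetuned, density=0.5):
    """Minimal TIES merge of fine-tuned state dicts into a base state dict."""
    merged = {}
    for name, base_param in base.items():
        # Task vectors: each model's change relative to the base weights.
        deltas = torch.stack([ft[name] - base_param for ft in finetuned])

        # 1. Trim: keep the top `density` fraction of each task vector by
        #    magnitude, resetting the remaining entries to zero.
        flat = deltas.reshape(deltas.shape[0], -1).abs()
        k = max(1, int(density * flat.shape[1]))
        cutoff = flat.topk(k, dim=1).values[:, -1]
        cutoff = cutoff.view(-1, *([1] * (deltas.dim() - 1)))
        trimmed = torch.where(deltas.abs() >= cutoff, deltas, torch.zeros_like(deltas))

        # 2. Elect sign: per parameter, the direction (positive or negative)
        #    with the largest cumulative magnitude across models wins.
        elected = torch.sign(trimmed.sum(dim=0))

        # 3. Disjoint merge: average only the trimmed values whose sign agrees
        #    with the elected direction, ignoring entries zeroed by trimming.
        agree = (torch.sign(trimmed) == elected) & (trimmed != 0)
        total = (trimmed * agree).sum(dim=0)
        count = agree.sum(dim=0).clamp(min=1)
        merged[name] = base_param + total / count
    return merged
```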

## Configuration

```yaml
models:
  - model: abacusai/Smaug-34B-v0.1
  - model: jondurbin/bagel-dpo-34b-v0.2
    parameters:
      density: 0.45
      weight: 0.5
  - model: abacusai/MetaMath-Bagel-DPO-34B
    parameters:
      density: 0.48
      weight: 0.5
merge_method: ties
base_model: abacusai/Smaug-34B-v0.1
parameters:
  normalize: true
  int8_mask: true
dtype: bfloat16
```
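
A configuration like this one can typically be executed with mergekit's command-line entry point. A minimal sketch, assuming the YAML above is saved as `config.yaml` and that you are working in a notebook environment:

```python
!pip install -qU mergekit

# mergekit-yaml reads the merge recipe and writes the merged model to the
# output directory; --cuda performs the tensor arithmetic on a GPU.
!mergekit-yaml config.yaml ./Pearl-34B-ties --cuda
```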

## Usage

```python
!pip install -qU transformers accelerate

import torch
import transformers
from transformers import AutoTokenizer

model = "louisbrulenaudet/Pearl-34B-ties"
messages = [{"role": "user", "content": "What is a large language model?"}]

# Build the prompt with the model's chat template.
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

# At float16, the ~34B parameters need roughly 70 GB of accelerator memory;
# device_map="auto" shards the model across the available devices.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

outputs = pipeline(
    prompt,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    top_k=50,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```

## Citing & Authors

If you use this model in your research, please cite it using the following BibTeX entry.

```BibTeX
@misc{louisbrulenaudet2023,
  author =       {Louis Brulé Naudet},
  title =        {Pearl-34B-ties, an extraordinary 34B model},
  year =         {2023},
  howpublished = {\url{https://huggingface.co/louisbrulenaudet/Pearl-34B-ties}},
}
```

## Feedback

If you have any feedback, please reach out at [louisbrulenaudet@icloud.com](mailto:louisbrulenaudet@icloud.com).