---
language:
- en
license: apache-2.0
library_name: transformers
tags:
- mergekit
- merge
base_model:
- unsloth/Mistral-Small-24B-Base-2501
- unsloth/Mistral-Small-24B-Instruct-2501
- trashpanda-org/MS-24B-Instruct-Mullein-v0
- trashpanda-org/Llama3-24B-Mullein-v1
- ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
- TheDrummer/Cydonia-24B-v2
- estrogen/MS2501-24b-Ink-apollo-ep2
- huihui-ai/Mistral-Small-24B-Instruct-2501-abliterated
- ToastyPigeon/ms3-roselily-rp-v2
- PocketDoc/Dans-DangerousWinds-V1.1.1-24b
- ReadyArt/Forgotten-Safeword-24B-V2.2
---
***
### Overview
One of the merging steps for [Tantum](https://huggingface.co/Nohobby/MS3-Tantum-24B-v0.1). Might be better than the end result.
## Model files may not be downloadable
You can get full-weight files from here: https://huggingface.co/mergekit-community/MS-RP-whole
This happened because I was using the mergekit-gui space for merging and got lazy about manually dragging the intermediate steps to my org, so I just set it to upload to mergekit-community. When I learned that this thing was usable on its own, I decided to add some info to the model card and duplicated the repo here before linking it in the Tantum readme file.
yeah
**Settings:**
Samplers: [Weird preset](https://files.catbox.moe/ccwmca.json) | [Forgotten-Safeword preset](https://huggingface.co/sleepdeprived3/Mistral-V7-Tekken-Extra-Dry)
Prompt format: Mistral-V7-Tekken (?)
I use [this](https://files.catbox.moe/daluze.json) lorebook for all chats instead of a system prompt for Mistral models.
### Quants
[Static](https://huggingface.co/mradermacher/MS-RP-whole-GGUF) | [Imatrix](https://huggingface.co/mradermacher/MS-RP-whole-i1-GGUF)
***
## Merge Details
### Merging steps
## MS3-test-Merge-1
```yaml
models:
  - model: unsloth/Mistral-Small-24B-Base-2501
  - model: unsloth/Mistral-Small-24B-Instruct-2501+ToastyPigeon/new-ms-rp-test-ws
    parameters:
      select_topk:
        - value: [0.05, 0.03, 0.02, 0.02, 0.01]
  - model: unsloth/Mistral-Small-24B-Instruct-2501+estrogen/MS2501-24b-Ink-ep2-adpt
    parameters:
      select_topk: 0.1
  - model: trashpanda-org/MS-24B-Instruct-Mullein-v0
    parameters:
      select_topk: 0.4
base_model: unsloth/Mistral-Small-24B-Base-2501
merge_method: sce
parameters:
  int8_mask: true
  rescale: true
  normalize: true
dtype: bfloat16
tokenizer_source: base
```
```yaml
dtype: bfloat16
tokenizer_source: base
merge_method: della_linear
parameters:
  density: 0.55
base_model: Step1
models:
  - model: unsloth/Mistral-Small-24B-Instruct-2501
    parameters:
      weight:
        - filter: v_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: o_proj
          value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
        - filter: up_proj
          value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
        - filter: gate_proj
          value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
        - filter: down_proj
          value: [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0]
        - value: 0
  - model: Step1
    parameters:
      weight:
        - filter: v_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: o_proj
          value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
        - filter: up_proj
          value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
        - filter: gate_proj
          value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
        - filter: down_proj
          value: [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1]
        - value: 1
```
Some early MS3 merge. Not really worth using on its own. Just added it for fun.
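In the second config above, the per-filter `weight` lists for the Instruct model and `Step1` look like exact complements: wherever one is 1 the other is 0. A small sanity check of that, with the lists copied from the config (the dict names and the `is_complementary` helper are just for illustration, and `della_linear` also prunes by `density`, so the weights are not the whole story):

```python
# Gradient anchors copied from the second MS3-test-Merge-1 config.
instruct = {
    "v_proj":    [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    "o_proj":    [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1],
    "up_proj":   [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
    "gate_proj": [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0],
    "down_proj": [1, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
}
step1 = {
    "v_proj":    [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    "o_proj":    [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0],
    "up_proj":   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    "gate_proj": [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1],
    "down_proj": [0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1],
}
# Fallback weights for tensors that match none of the filters.
DEFAULT_INSTRUCT, DEFAULT_STEP1 = 0, 1


def is_complementary(a, b):
    """True if the two anchor lists have equal length and sum to 1 at every position."""
    return len(a) == len(b) and all(x + y == 1 for x, y in zip(a, b))


all_complementary = all(is_complementary(instruct[name], step1[name]) for name in instruct)
defaults_complementary = DEFAULT_INSTRUCT + DEFAULT_STEP1 == 1
print(all_complementary, defaults_complementary)
```

Since the two lists sum to 1 at every anchor, any linear interpolation between anchors also sums to 1, so each tensor ends up as a blend of the two models with total weight 1.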
## RP-half1
```yaml
models:
  - model: ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
    parameters:
      weight: 0.2
      density: 0.7
  - model: trashpanda-org/Llama3-24B-Mullein-v1
    parameters:
      weight: 0.2
      density: 0.7
  - model: TheDrummer/Cydonia-24B-v2
    parameters:
      weight: 0.2
      density: 0.7
merge_method: della_linear
base_model: Nohobby/MS3-test-Merge-1
parameters:
  epsilon: 0.2
  lambda: 1.1
dtype: bfloat16
tokenizer:
  source: base
```
## RP-half2
```yaml
base_model: Nohobby/MS3-test-Merge-1
parameters:
  epsilon: 0.05
  lambda: 0.9
  int8_mask: true
  rescale: true
  normalize: false
dtype: bfloat16
tokenizer:
  source: base
merge_method: della
models:
  - model: estrogen/MS2501-24b-Ink-apollo-ep2
    parameters:
      weight: [0.1, -0.01, 0.1, -0.02, 0.1]
      density: [0.6, 0.4, 0.5, 0.4, 0.6]
  - model: huihui-ai/Mistral-Small-24B-Instruct-2501-abliterated
    parameters:
      weight: [0.02, -0.01, 0.02, -0.02, 0.01]
      density: [0.45, 0.55, 0.45, 0.55, 0.45]
  - model: ToastyPigeon/ms3-roselily-rp-v2
    parameters:
      weight: [0.01, -0.02, 0.02, -0.025, 0.01]
      density: [0.45, 0.65, 0.45, 0.65, 0.45]
  - model: PocketDoc/Dans-DangerousWinds-V1.1.1-24b
    parameters:
      weight: [0.1, -0.01, 0.1, -0.02, 0.1]
      density: [0.6, 0.4, 0.5, 0.4, 0.6]
```
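The five-element `weight` and `density` lists in this config are gradients: mergekit spreads them across the layer stack rather than applying one value everywhere. A minimal sketch of that idea using plain linear interpolation (the `gradient_value` helper is hypothetical, `NUM_LAYERS = 40` is an assumption about Mistral Small 24B, and mergekit's own interpolation may differ in detail):

```python
import math


def gradient_value(anchors, t):
    """Linearly interpolate a list of anchor values at relative depth t in [0, 1]."""
    if len(anchors) == 1:
        return float(anchors[0])
    scaled = t * (len(anchors) - 1)
    # Clamp so that t == 1.0 lands on the last segment instead of past the end.
    lo = min(int(math.floor(scaled)), len(anchors) - 2)
    frac = scaled - lo
    return (1 - frac) * anchors[lo] + frac * anchors[lo + 1]


NUM_LAYERS = 40  # assumption, not taken from the config

# Weight gradient for estrogen/MS2501-24b-Ink-apollo-ep2, copied from the config.
ink_weight = [0.1, -0.01, 0.1, -0.02, 0.1]

per_layer = [gradient_value(ink_weight, i / (NUM_LAYERS - 1)) for i in range(NUM_LAYERS)]

for layer in (0, 10, 20, 30, 39):
    print(f"layer {layer:2d}: weight {per_layer[layer]:+.4f}")
```

Under this reading, the negative anchors mean that some bands of layers get a slightly negative contribution from a model, while the first, middle, and last layers get the positive values.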
## RP-broth/MS-RP-whole
```yaml
base_model: ReadyArt/Forgotten-Safeword-24B-V2.2
merge_method: model_stock
dtype: bfloat16
models:
  - model: mergekit-community/MS3-RP-half1
  - model: mergekit-community/MS3-RP-RP-half2
```
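A rough sketch of the idea behind `model_stock`, assuming the usual formulation: take the average of the fine-tuned models and pull it toward the base by a ratio `t` that depends on how well the fine-tunes' offsets from the base agree (their cosine similarity). The two-element vectors and the `model_stock_merge` helper are hypothetical toy stand-ins; mergekit works per tensor on real weights and its implementation may differ.

```python
import math


def cosine(u, v):
    """Cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def model_stock_merge(base, finetuned):
    """Blend the fine-tuned average with the base, weighted by offset agreement.

    Assumes at least two fine-tuned vectors, each different from the base.
    """
    n = len(finetuned)
    offsets = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    cos_theta = sum(cosine(offsets[i], offsets[j]) for i, j in pairs) / len(pairs)
    # Interpolation ratio: 0 keeps the base, 1 keeps the plain average.
    t = n * cos_theta / (1 + (n - 1) * cos_theta)
    average = [sum(column) / n for column in zip(*finetuned)]
    return [t * a + (1 - t) * b for a, b in zip(average, base)]


# Toy stand-ins for Forgotten-Safeword (base), RP-half1, and RP-half2.
base = [0.0, 0.0]
half1 = [1.0, 0.0]
half2 = [1.0, 1.0]

merged = model_stock_merge(base, [half1, half2])
print(merged)
```

If the two halves moved away from the base in the same direction, the result is close to their average; if they moved in unrelated directions, the result stays close to the base.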