---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
# final_merge
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
## Merge Details
### Merge Method
This model was merged using the [DARE](https://arxiv.org/abs/2311.03099)-[TIES](https://arxiv.org/abs/2306.01708) merge method, with ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065 as the base model.
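For intuition: DARE randomly drops elements of each model's task vector (its weights minus the base model's weights) and rescales the survivors so the expected magnitude is preserved, and TIES then resolves sign conflicts when the sparsified vectors are combined. Below is a minimal sketch of the DARE drop-and-rescale step only; the function name is illustrative and is not mergekit's internal API.

```python
import torch

def dare_sparsify(delta: torch.Tensor, density: float) -> torch.Tensor:
    # Keep each element with probability `density`, zero out the rest,
    # then rescale survivors by 1/density so the expected value is preserved.
    mask = torch.bernoulli(torch.full_like(delta, density))
    return delta * mask / density

# Toy task vector for a single weight matrix.
delta = torch.randn(4, 4)
sparse_delta = dare_sparsify(delta, density=0.8607)  # a density used in the config below
```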
### Models Merged
The following models were included in the merge:
* ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
* ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
### Evolve Configuration
```yaml
genome:
  models:
    - tokyotech-llm/Swallow-MS-7b-v0.1
    - Nexusflow/Starling-LM-7B-beta
    - mistralai/Mistral-7B-Instruct-v0.2
  merge_method: dare_ties
  base_model: tokyotech-llm/Swallow-MS-7b-v0.1
  tokenizer_source: base
  layer_granularity: 4 # sane default
  normalize: true
  allow_negative_weights: true # useful with task_arithmetic
tasks:
  - name: elyzatasks100
    weight: 1.0
```
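With `layer_granularity: 4` on 32-layer Mistral-7B-class models, the evolutionary search optimizes a density and a weight per (slice, model) pair. The back-of-the-envelope arithmetic below is illustrative only (the exact parameterization belongs to mergekit-evolve), but it matches the 8 slice groups of the merged configuration in the next section.

```python
# Rough genome-size arithmetic for this search. Assumptions: all three
# models are 32-layer Mistral-7B variants, and dare_ties tunes one
# density and one weight per (slice, model) pair.
num_layers = 32
layer_granularity = 4
num_models = 3
params_per_pair = 2  # density, weight

num_slices = num_layers // layer_granularity        # 8 slice groups
genome_size = num_slices * num_models * params_per_pair
print(num_slices, genome_size)                      # 8 48
```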
### Configuration
The following YAML configuration was used to produce this model:
```yaml
base_model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20736632024943663
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2876973518761861
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.39790911189850287
- sources:
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.3259754595200053
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.36312222325553534
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.8606873476749896
      weight: 0.13151678264284256
- sources:
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.16690975724594306
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8737746997323794
      weight: 0.5267457266976868
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.37203078821341173
- sources:
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9041657657943898
      weight: 0.411866096762198
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8768235480939731
      weight: 0.24309153870225503
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.40805997159088514
- sources:
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20153807161142293
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2651496946837373
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.881089793974409
      weight: 0.018551645245409754
- sources:
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.05396099731564888
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2544355076223701
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.17428773365086464
- sources:
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9948454730348346
      weight: 0.13561950438761128
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.9012771361348846
      weight: 0.21474768477949524
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.5686565104560466
      weight: 0.5862075607169237
- sources:
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.7293804704051091
      weight: 0.5832263789977623
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.25251733788362796
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.7295319486730514
tokenizer_source: base
```
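To reproduce the merge from the YAML above, mergekit can be driven from its Python API. The sketch below assumes the configuration is saved as config.yml, a recent mergekit version, and that the ../evol_merge_storage/input_models/... paths (local to the original evolutionary run) resolve to real checkpoints on your machine.

```python
import torch
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Assumes the YAML from this card is saved as config.yml and that the
# input-model paths it references exist locally.
with open("config.yml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./final_merge",  # output directory for the merged model
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
    ),
)
```

Equivalently, `mergekit-yaml config.yml ./final_merge` runs the same merge from the command line.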