---
base_model: []
library_name: transformers
tags:
- mergekit
- merge
---
# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065 as the base.

### Models Merged

The following models were included in the merge:
* ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
* ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087

### Evolve Configuration

The following configuration was used for the evolutionary merge search:

```yaml
genome:
  models:
    - tokyotech-llm/Swallow-MS-7b-v0.1
    - Nexusflow/Starling-LM-7B-beta
    - mistralai/Mistral-7B-Instruct-v0.2
  merge_method: dare_ties
  base_model: tokyotech-llm/Swallow-MS-7b-v0.1
  tokenizer_source: base
  layer_granularity: 4 # sane default
  normalize: true
  allow_negative_weights: true # useful with task_arithmetic
tasks:
  - name: elyzatasks100
    weight: 1.0
```

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20736632024943663
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2876973518761861
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.39790911189850287
- sources:
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.3259754595200053
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.36312222325553534
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.8606873476749896
      weight: 0.13151678264284256
- sources:
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.16690975724594306
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8737746997323794
      weight: 0.5267457266976868
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.37203078821341173
- sources:
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9041657657943898
      weight: 0.411866096762198
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8768235480939731
      weight: 0.24309153870225503
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.40805997159088514
- sources:
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20153807161142293
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2651496946837373
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.881089793974409
      weight: 0.018551645245409754
- sources:
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.05396099731564888
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2544355076223701
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.17428773365086464
- sources:
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9948454730348346
      weight: 0.13561950438761128
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.9012771361348846
      weight: 0.21474768477949524
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.5686565104560466
      weight: 0.5862075607169237
- sources:
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.7293804704051091
      weight: 0.5832263789977623
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.25251733788362796
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.7295319486730514
tokenizer_source: base
```
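
For intuition about the merge method, the sketch below shows roughly what a DARE-TIES step does to a single parameter tensor, following the two papers linked above. This is an illustration under stated assumptions, not mergekit's implementation: the real code applies this tensor-by-tensor with the per-slice `density`/`weight` values listed above, and its handling of `normalize` may differ in detail.

```python
# Hedged, illustrative sketch of one DARE-TIES merge step on NumPy arrays.
# Not mergekit's actual code; treat the normalization step as an assumption.
import numpy as np

def dare_ties(base, donors, densities, weights, normalize=True, seed=0):
    """base: a parameter tensor of the base model;
    donors, densities, weights: parallel lists, one entry per donor model."""
    rng = np.random.default_rng(seed)
    deltas = []
    for donor, density, weight in zip(donors, densities, weights):
        delta = donor - base                          # task vector vs. the base
        keep = rng.random(delta.shape) < density      # DARE: drop with prob 1 - density
        delta = np.where(keep, delta / density, 0.0)  # rescale survivors by 1/density
        deltas.append(weight * delta)
    stacked = np.stack(deltas)
    sign = np.sign(stacked.sum(axis=0))               # TIES: elect the dominant sign
    agree = np.sign(stacked) == sign                  # keep only sign-agreeing deltas
    merged = np.where(agree, stacked, 0.0).sum(axis=0)
    if normalize:  # mirrors `normalize: 1.0`: rescale by the surviving weight mass
        w = np.asarray(weights, dtype=float).reshape((-1,) + (1,) * base.ndim)
        merged = merged / np.clip(np.where(agree, w, 0.0).sum(axis=0), 1e-8, None)
    return base + merged
```

To reproduce the actual merge, save the Configuration block above to a file and run mergekit's `mergekit-yaml` entry point on it (e.g. `mergekit-yaml config.yml ./final_merge`; paths here are illustrative).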
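
Since the card declares `library_name: transformers`, the merged model should load like any other Mistral-architecture causal LM. A minimal usage sketch follows; the repository id is a placeholder (this card does not state where the merged weights are published), and the bfloat16 dtype mirrors the merge configuration.

```python
# Minimal usage sketch. "your-namespace/final_merge" is a hypothetical id --
# substitute the real repository id or a local path to the merge output.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-namespace/final_merge"  # placeholder, not a real repo id

# tokenizer_source is `base`, so this resolves to Swallow-MS-7b-v0.1's tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches `dtype: bfloat16` in the merge config
    device_map="auto",
)

# elyzatasks100 (the evolve target) is a Japanese benchmark, hence a Japanese prompt
prompt = "日本で一番高い山は何ですか？"  # "What is the tallest mountain in Japan?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```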