---
base_model:
  - tokyotech-llm/Swallow-MS-7b-v0.1
  - Nexusflow/Starling-LM-7B-beta
  - mistralai/Mistral-7B-Instruct-v0.2
library_name: transformers
tags:
  - mergekit
  - merge
---

# final_merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/arcee-ai/mergekit).
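Once the merged weights are published to the Hugging Face Hub, they load like any other `transformers` causal LM. A minimal sketch, assuming a hypothetical repo id `HachiML/final_merge` (substitute the actual repository name):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "HachiML/final_merge"  # hypothetical -- replace with the real repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id, torch_dtype=torch.bfloat16)

inputs = tokenizer("日本の首都は", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```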

## Merge Details

### Merge Method

This model was merged using the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, with ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065 as the base.
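In brief: each donor model contributes a task vector (its parameter delta from the base); DARE randomly drops a fraction of each delta and rescales the survivors to preserve the expected magnitude, and TIES elects a per-parameter sign and discards contributions that disagree with it. A minimal, illustrative Python sketch of the idea (not mergekit's actual implementation; `density` and `weight` correspond to the per-slice parameters in the configuration below):

```python
import torch

def dare_ties(base, deltas, densities, weights):
    """Illustrative DARE-TIES merge of task vectors into a base tensor."""
    contributions = []
    for delta, density, weight in zip(deltas, densities, weights):
        # DARE: keep each parameter with probability `density`, then
        # rescale by 1/density so the expected magnitude is unchanged.
        mask = torch.bernoulli(torch.full_like(delta, density))
        contributions.append(weight * delta * mask / density)
    stacked = torch.stack(contributions)
    # TIES: elect a sign per parameter from the summed contributions and
    # drop any contribution whose sign disagrees with the elected one.
    elected = torch.sign(stacked.sum(dim=0))
    agree = torch.sign(stacked) == elected
    return base + (stacked * agree).sum(dim=0)

# Toy usage with one 4-element tensor per donor model.
base = torch.zeros(4)
deltas = [torch.tensor([0.4, -0.2, 0.1, 0.0]),
          torch.tensor([0.3, 0.5, -0.1, 0.2])]
print(dare_ties(base, deltas, densities=[1.0, 0.86], weights=[0.29, 0.40]))
```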

### Models Merged

The following models were included in the merge:

* ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
* ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087

### Evolve Configuration

The following mergekit-evolve configuration defined the genome search space and the evaluation task:

```yaml
genome:
    models:
      - tokyotech-llm/Swallow-MS-7b-v0.1
      - Nexusflow/Starling-LM-7B-beta
      - mistralai/Mistral-7B-Instruct-v0.2
    merge_method: dare_ties
    base_model: tokyotech-llm/Swallow-MS-7b-v0.1
    tokenizer_source: base
    layer_granularity: 4 # sane default
    normalize: true
    allow_negative_weights: true # useful with task_arithmetic
tasks:
  - name: elyzatasks100
    weight: 1.0
```
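With `layer_granularity: 4` on a 32-layer Mistral-class architecture, the genome splits into eight groups of four layers, each with its own weight and density per donor model, which is why the final configuration below contains eight slices. mergekit-evolve searches this space with CMA-ES, scoring each candidate merge on the elyzatasks100 task. A rough sketch of the outer loop, assuming the `cma` package and a hypothetical `evaluate_genome` function that decodes a flat vector into a merge config, runs the merge, and returns the benchmark score (illustrative only, not mergekit-evolve's actual code):

```python
import cma
import numpy as np

NUM_MODELS = 3   # Swallow-MS, Starling-LM, Mistral-Instruct
NUM_GROUPS = 8   # 32 layers / layer_granularity of 4
PARAMS_PER = 2   # one weight and one density per (model, group)

def evaluate_genome(x: np.ndarray) -> float:
    """Hypothetical: decode x into a slice-wise merge config, run the
    merge, and return the elyzatasks100 score of the resulting model."""
    raise NotImplementedError

x0 = np.full(NUM_MODELS * NUM_GROUPS * PARAMS_PER, 0.5)
es = cma.CMAEvolutionStrategy(x0, 0.2)  # initial mean and step size
while not es.stop():
    candidates = es.ask()
    # CMA-ES minimizes, so negate the benchmark score.
    es.tell(candidates, [-evaluate_genome(np.asarray(c)) for c in candidates])
print(es.result.xbest)
```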

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
dtype: bfloat16
merge_method: dare_ties
parameters:
  int8_mask: 1.0
  normalize: 1.0
slices:
- sources:
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20736632024943663
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2876973518761861
  - layer_range: [0, 4]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.39790911189850287
- sources:
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.3259754595200053
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.36312222325553534
  - layer_range: [4, 8]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.8606873476749896
      weight: 0.13151678264284256
- sources:
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.16690975724594306
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8737746997323794
      weight: 0.5267457266976868
  - layer_range: [8, 12]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.37203078821341173
- sources:
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9041657657943898
      weight: 0.411866096762198
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.8768235480939731
      weight: 0.24309153870225503
  - layer_range: [12, 16]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.40805997159088514
- sources:
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.20153807161142293
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2651496946837373
  - layer_range: [16, 20]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.881089793974409
      weight: 0.018551645245409754
- sources:
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 1.0
      weight: 0.05396099731564888
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.2544355076223701
  - layer_range: [20, 24]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.17428773365086464
- sources:
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.9948454730348346
      weight: 0.13561950438761128
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 0.9012771361348846
      weight: 0.21474768477949524
  - layer_range: [24, 28]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 0.5686565104560466
      weight: 0.5862075607169237
- sources:
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Swallow-MS-7b-v0.1_259979065
    parameters:
      density: 0.7293804704051091
      weight: 0.5832263789977623
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Starling-LM-7B-beta_581094980
    parameters:
      density: 1.0
      weight: 0.25251733788362796
  - layer_range: [28, 32]
    model: ../evol_merge_storage/input_models/Mistral-7B-Instruct-v0.2_674785087
    parameters:
      density: 1.0
      weight: 0.7295319486730514
tokenizer_source: base
```
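The merge can be reproduced from this file with mergekit, either via the `mergekit-yaml` command line or from Python. A sketch following the usage example in mergekit's README (file paths are placeholders, and the `../evol_merge_storage` paths above refer to the evolve run's local model cache, so they would need to point at real checkpoints):

```python
import yaml
import torch

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Placeholder paths -- point these at the config above and an output directory.
with open("merge_config.yaml", "r", encoding="utf-8") as fp:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(fp))

run_merge(
    merge_config,
    "./final_merge",
    options=MergeOptions(
        cuda=torch.cuda.is_available(),
        copy_tokenizer=True,
        lazy_unpickle=False,
        low_cpu_memory=False,
    ),
)
```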