BigWeave-v28-96b / README.md
llmixer's picture
Model upload
1077550 verified
|
raw
history blame
2.18 kB
metadata
base_model:
  - 152334H/miqu-1-70b-sf
license: unknown
language:
  - en
pipeline_tag: text-generation
tags:
  - merge
  - frankenmerge
  - 95b

BigWeave v28 96b

The BigWeave models aim to experimentally identify merge settings for increasing model performance. The version number merely tracks various attempts and is not a quality indicator. Only results demonstrating good performance are retained and shared.

Prompting Format

Chatml, Mistral, Vicuna.

Merge process

This is a self-merge of 152334H/miqu-1-70b-sf. The slices use a uniform size and only overlap with the adjacent sizes by one layer. See this discussion.

Merge configuration:

slices:
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [0,12]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [10,16]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [14,20]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [18,24]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [22,28]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [26,32]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [30,36]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [34,40]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [38,44]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [42,48]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [46,52]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [50,56]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [54,60]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [58,64]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [62,68]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [66,72]
  - sources:
    - model: 152334H/miqu-1-70b-sf
      layer_range: [70,80]
merge_method: passthrough
dtype: float16