metadata
license: apache-2.0
tags:
  - merge
  - mergekit
  - lazymergekit
  - deepseek-ai/deepseek-llm-7b-base

Breeze-13B-32k-Base-v1_0

Breeze-13B-32k-Base-v1_0 is a merge of the following models using mergekit:

- deepseek-ai/deepseek-llm-7b-base
- meta-llama/Meta-Llama-3-8B

🧩 Configuration

dtype: bfloat16
merge_method: linear
slices:
- sources:
  - layer_range: [0, 8]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [0, 8]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [4, 12]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [4, 12]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [8, 16]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [8, 16]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [12, 20]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [12, 20]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [16, 24]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [16, 24]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [20, 28]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [20, 28]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [24, 32]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [24, 32]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
tokenizer_source: union
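
The configuration above can be read as simple arithmetic. The following is a minimal sketch, not mergekit's own code: it assumes each `layer_range` is half-open (end index excluded) and that the slices are stacked in the order listed. The helper names `stacked_layer_count` and `linear_merge` are hypothetical, and the weight of 1.0 used for the first model is an assumption, since the configuration only states `weight: 0` for meta-llama/Meta-Llama-3-8B.

```python
# Layer ranges copied from the slices in the configuration above.
# Assumption: each range is half-open, i.e. [start, end).
LAYER_RANGES = [(0, 8), (4, 12), (8, 16), (12, 20), (16, 24), (20, 28), (24, 32)]


def stacked_layer_count(ranges):
    """Total number of layers when every slice is stacked in order."""
    return sum(end - start for start, end in ranges)


def linear_merge(values, weights):
    """Weighted average of scalar stand-ins for tensors (hypothetical helper).

    Assumes the weights are normalized so that they sum to one.
    """
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


if __name__ == "__main__":
    # Seven overlapping slices of eight layers each.
    print(stacked_layer_count(LAYER_RANGES))
    # With an assumed weight of 1.0 on the first model and 0 on the second,
    # the averaged value equals the first model's value.
    print(linear_merge([2.0, 5.0], [1.0, 0.0]))
```

Under these assumptions the seven overlapping 8-layer slices stack to 56 layers, and a weight of 0 means the second model's parameters contribute nothing to the averaged result.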