---
license: apache-2.0
tags:
- merge
- mergekit
- lazymergekit
- deepseek-ai/deepseek-llm-7b-base
---
# Breeze-13B-32k-Base-v1_0
Breeze-13B-32k-Base-v1_0 is a merge of seven overlapping layer slices of the following model, built with [mergekit](https://github.com/cg123/mergekit):
* [deepseek-ai/deepseek-llm-7b-base](https://huggingface.co/deepseek-ai/deepseek-llm-7b-base)

The configuration also lists meta-llama/Meta-Llama-3-8B as a second source in every slice, with its weight set to 0.
## 🧩 Configuration
```yaml
dtype: bfloat16
merge_method: linear
slices:
- sources:
  - layer_range: [0, 8]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [0, 8]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [4, 12]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [4, 12]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [8, 16]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [8, 16]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [12, 20]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [12, 20]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [16, 24]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [16, 24]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [20, 28]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [20, 28]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
- sources:
  - layer_range: [24, 32]
    model: deepseek-ai/deepseek-llm-7b-base
  - layer_range: [24, 32]
    model: meta-llama/Meta-Llama-3-8B
    parameters:
      weight: 0
tokenizer_source: union
```
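
The `layer_range` windows in the configuration follow a regular pattern: seven windows of 8 layers each, advancing by 4 layers, so neighbouring windows overlap by half. The short Python sketch below reproduces those windows and counts the layers in the stacked result. It is an illustration only and not part of the merge itself; the helper name `slice_ranges` and its parameters are hypothetical, and the layer total assumes the slices are stacked one after another in the order listed.

```python
# Hypothetical helper (not part of mergekit): rebuilds the layer_range
# windows listed in the configuration above.
def slice_ranges(num_slices=7, width=8, stride=4):
    """Return (start, end) layer windows, each `width` layers wide,
    with consecutive windows shifted by `stride` layers."""
    return [(i * stride, i * stride + width) for i in range(num_slices)]


ranges = slice_ranges()

# Assumption: slices are stacked in the order listed, so the layer count
# of the result is the sum of the window widths.
total_layers = sum(end - start for start, end in ranges)

for start, end in ranges:
    print(f"layer_range: [{start}, {end}]")
print("total stacked layers:", total_layers)
```

Changing `stride` or `width` in the sketch shows how the overlap and the resulting depth trade off; the values used here are simply the ones read off the configuration.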