---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen1.5-72B-Chat/blob/main/LICENSE
tags:
- merge
- mergekit
- qwen2
- chat
- conversational
language:
- en
- zh
library_name: transformers
---
# Qwen1.5-124B-Chat-Merge
**--This is a 124B frankenmerge of [Qwen1.5-72B-Chat](https://huggingface.co/Qwen/Qwen1.5-72B-Chat), created by interleaving layers of the model with itself using mergekit.--**
*Inspired by other frankenmerge models like [**goliath-120b**](https://huggingface.co/alpindale/goliath-120b) and [**miqu-1-120b**](https://huggingface.co/wolfram/miqu-1-120b)*
**-Quantize**
*Coming soon...*
**-Merge Configuration**
The model was merged with the following mergekit YAML configuration:
```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 20]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [10, 30]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [20, 40]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [30, 50]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [40, 60]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [50, 70]
    model: Qwen/Qwen1.5-72B-Chat
- sources:
  - layer_range: [60, 80]
    model: Qwen/Qwen1.5-72B-Chat
```
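To make the interleaving pattern easier to see, here is a minimal Python sketch that expands the seven slices into the merged layer stack and counts how often each source layer is reused. It is not part of the merge itself. The names `layer_sequence`, `merged`, and `copies` are illustrative, and the sketch assumes each `layer_range` is half-open (`[start, end)`), which should be checked against the mergekit documentation.

```python
from collections import Counter

# Slices copied from the merge configuration above.
# Assumption: each layer_range is half-open, i.e. [start, end).
slices = [(0, 20), (10, 30), (20, 40), (30, 50), (40, 60), (50, 70), (60, 80)]


def layer_sequence(slices):
    """Return the source-layer index used at each position of the merged stack."""
    return [i for start, end in slices for i in range(start, end)]


merged = layer_sequence(slices)   # source layer for every merged layer
copies = Counter(merged)          # how many times each source layer appears

total_layers = len(merged)
duplicated = sorted(i for i, n in copies.items() if n == 2)
single = sorted(i for i, n in copies.items() if n == 1)

print(f"merged layers: {total_layers}")
print(f"distinct source layers: {len(copies)}")
print(f"layers used twice: {duplicated[0]}..{duplicated[-1]} ({len(duplicated)} layers)")
print(f"layers used once: {len(single)} layers")
```

Under that assumption, the merged stack has 140 decoder layers built from the 80 source layers: layers 10 to 69 appear twice, while the first and last ten layers appear once. The growth from 80 to 140 layers accounts for the larger parameter count of the merge.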
**-Performance**
* Note: I am not able to run benchmark tests, and I have not been able to use the model extensively, so my test results may not be entirely accurate.
In most of my own (subjective) tests, it performs better than the 72B version in comprehension, reasoning, and coherence.
**-Thanks**
* 1. [mergekit](https://github.com/arcee-ai/mergekit), the tool used to merge this model.
* 2. The Qwen team, for the excellent base models.