DisOOM committed
Commit
5af30d0
1 parent: 1ea66fa

Update README.md

Files changed (1): README.md (+54 −1)
---
license: other
license_name: tongyi-qianwen
license_link: https://huggingface.co/Qwen/Qwen1.5-72B-Chat/blob/main/LICENSE
tags:
- merge
- mergekit
- qwen2
- chat
- conversational
language:
- en
- zh
library_name: transformers
---
# Qwen1.5-22B-Chat-Merge

**This is a 22B frankenmerge of [Qwen1.5-14B-Chat](https://huggingface.co/Qwen/Qwen1.5-14B-Chat), created by interleaving layers of the model with itself using [mergekit](https://github.com/arcee-ai/mergekit).**
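To make the interleaving concrete, the layer stacking a passthrough merge performs can be sketched in a few lines of Python. The slice ranges are taken from the merge configuration in this card; `merged_layer_order` is a hypothetical helper for illustration, not part of mergekit:

```python
# Slice ranges from the merge configuration below (end-exclusive),
# each slice copied from the same Qwen1.5-14B-Chat checkpoint.
SLICES = [(0, 5), (5, 15), (10, 20), (15, 25), (20, 30), (25, 35), (30, 40)]

def merged_layer_order(slices):
    """Passthrough merging simply stacks the slices, so layer i of the
    merged model is a copy of original layer order[i]."""
    order = []
    for start, end in slices:
        order.extend(range(start, end))
    return order

order = merged_layer_order(SLICES)
print(len(order))           # 65 merged layers, vs. 40 in the base model
dupes = sorted({l for l in order if order.count(l) > 1})
print(dupes[0], dupes[-1])  # original layers 10..34 each appear twice
```

In other words, the first and last few layers are kept once, while the middle of the network is duplicated in overlapping 10-layer windows.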

**Since the Qwen1.5 series currently has no intermediate-sized models between 14B and 72B, I am trying to build some middle-sized models (20B+ and 30B+ parameters) through merging, so that more individual users can make full use of their hardware.**

**-Quantization**

Coming soon...

**-Merge Configuration**

The merge was produced with the following mergekit configuration:
```yaml
dtype: float16
merge_method: passthrough
slices:
- sources:
  - layer_range: [0, 5]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [5, 15]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [10, 20]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [15, 25]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [20, 30]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [25, 35]
    model: Qwen/Qwen1.5-14B-Chat
- sources:
  - layer_range: [30, 40]
    model: Qwen/Qwen1.5-14B-Chat
```
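As a back-of-envelope check that this configuration lands near 22B parameters, the slice lengths can be totalled up. All round numbers below are my own assumptions about Qwen1.5-14B-Chat, not values taken from the config:

```python
# Rough parameter estimate for the passthrough merge.
# Assumptions (approximate, for illustration only): Qwen1.5-14B-Chat has
# ~14.2e9 total parameters across 40 decoder layers, of which ~1.6e9 are
# embeddings + LM head (copied once, not duplicated per layer).
TOTAL, EMBED, LAYERS = 14.2e9, 1.6e9, 40
per_layer = (TOTAL - EMBED) / LAYERS

merged_layers = 5 + 10 * 6  # sum of the slice lengths above = 65
estimate = EMBED + per_layer * merged_layers
print(f"~{estimate / 1e9:.0f}B parameters")  # ~22B parameters
```

The 65 stacked layers scale the per-layer weights by 65/40 while the embeddings are shared, which is roughly where the 22B in the model's name comes from.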

**-Performance**

*Note: I don't have the resources to run benchmarks, and I haven't been able to use the model extensively, so my test results may not be accurate.*

In most of my own (subjective) tests, covering comprehension, reasoning, and coherence, it performs better than the 14B version. If you find this model promising, feel free to try it out or offer evaluations; everyone's tests and feedback are welcome.