DisOOM committed
Commit 6d9e136
1 Parent(s): 32232f1

Update README.md

Files changed (1): README.md (+72 -1)
README.md CHANGED
@@ -2,5 +2,76 @@
  license: other
  license_name: yi-license
  license_link: https://huggingface.co/01-ai/Yi-34B/blob/main/LICENSE
+ tags:
+ - merge
+ - mergekit
+ - Yi
+ - chat
+ - conversational
+ language:
+ - en
+ - chi
  library_name: transformers
- ---
+ ---
+ # Qwen1.5-22B-Chat-Merge
+ **This is a frankenmerge of [Yi-34B-200K-RPMerge](https://huggingface.co/brucethemoose/Yi-34B-200K-RPMerge), created by interleaving layers of [Yi-34B-200K-RPMerge](https://huggingface.co/brucethemoose/Yi-34B-200K-RPMerge) with itself using [mergekit](https://github.com/arcee-ai/mergekit).**
+
+ **By self-merging Yi-34B (specifically RPMerge, which I consider a better-performing variant) to create a 70B-level Yi, I was surprised to find that it did not seem to exhibit the increased logical confusion and linguistic errors that many models with more than double their original parameter count show; it simply appeared to get stronger as the parameter count grew. I also tried several other fine-tuned versions of Yi, and the results were satisfactory.**
+
+ **-Quantization**
+
+ GGUF here: [gguf](https://huggingface.co/DisOOM/Qwen1.5-22B-Chat-Merge-GGUF/tree/main)
+
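A minimal sketch of loading one of the GGUF quantizations with llama-cpp-python; the filename below is a placeholder, not the actual file name in the linked repository:

```python
# Minimal sketch, assuming llama-cpp-python is installed and a GGUF file has
# been downloaded from the repository linked above. The filename is a
# placeholder; use whichever quantization file you actually downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen1.5-22B-Chat-Merge.Q4_K_M.gguf",  # placeholder filename
    n_ctx=4096,        # context window to allocate
    n_gpu_layers=-1,   # offload all layers to the GPU if one is available
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Briefly introduce yourself."}],
    max_tokens=128,
)
print(out["choices"][0]["message"]["content"])
```
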
+ **-Merge Configuration**
+
+ The mergekit YAML used for this merge is shown below:
+ ```yaml
+ dtype: float16
+ merge_method: passthrough
+ slices:
+ - sources:
+   - layer_range: [0, 4]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [4, 14]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [8, 18]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [12, 22]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [16, 26]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [20, 30]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [24, 34]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [28, 38]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [32, 42]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [36, 46]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [40, 50]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [44, 54]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+ - sources:
+   - layer_range: [48, 60]
+     model: brucethemoose/Yi-34B-200K-RPMerge
+
+ ```
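As a rough sanity check on the "70B-level" claim: Yi-34B has 60 decoder layers and roughly 34B parameters, most of them inside those layers, so parameter count scales roughly with layer count. The sketch below (an estimate only, ignoring embeddings and the output head) simply totals the layer ranges in the config above:

```python
# Rough estimate only: count the decoder layers produced by the passthrough
# slice plan above and scale Yi-34B's parameter count by the layer ratio.
slices = [
    (0, 4), (4, 14), (8, 18), (12, 22), (16, 26), (20, 30), (24, 34),
    (28, 38), (32, 42), (36, 46), (40, 50), (44, 54), (48, 60),
]

total_layers = sum(end - start for start, end in slices)
print(total_layers)               # 126 layers, vs. 60 in the base model
print(34 * total_layers / 60)     # ~71, i.e. roughly 70B parameters
```
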
+ **-Performance**
+
+ * Tips: I don't have the capability to run benchmarks, nor have I been able to use the model extensively, so my test results may not be accurate.
+
+ In most of my own (subjective) tests it performs better than the 34B version, including in comprehension, reasoning, coherence, and writing. If you are interested in this model's performance, feel free to test it out or offer evaluations; everyone's tests and evaluations are welcome.
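For anyone who wants to try it with transformers, a minimal sketch (the repo id below is a placeholder for this model's actual repository):

```python
# Minimal sketch, assuming the merged weights are available on the Hub.
# "DisOOM/<this-model>" is a placeholder repo id, not the real one.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "DisOOM/<this-model>"  # placeholder: replace with the actual repo id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # matches the float16 merge dtype above
    device_map="auto",          # shard the large model across available devices
)

inputs = tokenizer("Hello, who are you?", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```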