llmixer committed on
Commit
ec5c001
1 Parent(s): a1f70cd

Update README.md

  1. README.md +74 -1
README.md CHANGED
@@ -14,4 +14,77 @@ tags:
  The BigWeave models aim to experimentally identify merge settings for increasing model performance. The version number merely tracks various attempts and is not a quality indicator. Only results demonstrating good performance are retained and shared.

  # Prompting Format
- Mistral, Vicuna and Alpaca.
+ Mistral, Vicuna and Alpaca.
+
+ # Merge process
+ This is a self-merge of 152334H/miqu-1-70b-sf. By conducting exl2 measurements, we identify the most relevant layers. The layers are duplicated such that each group consists of consecutive layers with a two-layer overlap (i.e. larger groups than in v15).
+
+ Merge configuration:
+ ```
+ slices:
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [0,11]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [9,13]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [11,15]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [13,17]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [15,23]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [21,25]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [23,49]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [47,51]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [49,53]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [51,55]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [53,57]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [55,59]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [57,61]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [59,63]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [61,65]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [63,67]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [65,69]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [67,71]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [69,73]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [71,75]
+ - sources:
+   - model: 152334H/miqu-1-70b-sf
+     layer_range: [73,80]
+ merge_method: passthrough
+ dtype: float16
+ ```
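
As a sanity check on the configuration above, here is a minimal Python sketch that tallies the layer ranges. It is not part of the commit. It assumes each `layer_range` is half-open (`[start, end)`), and the names `LAYER_RANGES`, `slice_sizes`, `overlaps` and `total_layers` are made up for illustration.

```python
# Layer ranges copied from the merge configuration above.
# Assumption: each range is half-open, [start, end), so [0, 11] covers layers 0..10.
LAYER_RANGES = [
    (0, 11), (9, 13), (11, 15), (13, 17), (15, 23), (21, 25), (23, 49),
    (47, 51), (49, 53), (51, 55), (53, 57), (55, 59), (57, 61), (59, 63),
    (61, 65), (63, 67), (65, 69), (67, 71), (69, 73), (71, 75), (73, 80),
]


def slice_sizes(ranges):
    """Number of layers contributed by each slice."""
    return [end - start for start, end in ranges]


def overlaps(ranges):
    """Layers shared between each slice and the one that follows it."""
    return [
        prev_end - next_start
        for (_, prev_end), (next_start, _) in zip(ranges, ranges[1:])
    ]


def total_layers(ranges):
    """Layer count of the merged model under a passthrough merge."""
    return sum(slice_sizes(ranges))


if __name__ == "__main__":
    print("slices:", len(LAYER_RANGES))
    print("overlaps:", sorted(set(overlaps(LAYER_RANGES))))
    print("total layers:", total_layers(LAYER_RANGES))
```

Under that assumption, the 21 slices add up to 120 layers: the 80 source layers covered by the ranges plus 20 overlaps of two layers each.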