sthenno committed on
Commit c7e00f9 · verified · 1 Parent(s): c707d0c

Update README.md

Files changed (1)
  1. README.md +109 -1
README.md CHANGED
@@ -14,4 +14,112 @@ tags:
  - RLHF
  - PPO
  - custom-research
- ---
+ ---
+ # tempesthenno--nuslerp (BASE MODEL)
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the NuSLERP merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * /Users/sthenno/models/tempesthenno--converge-dtask
+ * /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ name: tempesthenno--nuslerp
+ merge_method: nuslerp
+ tokenizer:
+   source: /Users/sthenno/models/tempesthenno--converge-dtask
+ chat_template: "chatml"
+ dtype: float32
+ out_dtype: bfloat16
+ parameters:
+   int8_mask: false
+   normalize: true
+   rescale: false
+ slices:
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [0, 8]
+         parameters:
+           weight: 0.65
+           nuslerp_flatten: false
+           nuslerp_row_wise: true
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [0, 8]
+         parameters:
+           weight: 0.35
+           nuslerp_flatten: false
+           nuslerp_row_wise: true
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [8, 16]
+         parameters:
+           weight: 0.60
+           nuslerp_flatten: false
+           nuslerp_row_wise: true
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [8, 16]
+         parameters:
+           weight: 0.40
+           nuslerp_flatten: false
+           nuslerp_row_wise: true
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [16, 24]
+         parameters:
+           weight: 0.55
+           nuslerp_flatten: false
+           nuslerp_row_wise: false
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [16, 24]
+         parameters:
+           weight: 0.45
+           nuslerp_flatten: false
+           nuslerp_row_wise: false
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [24, 32]
+         parameters:
+           weight: 0.50
+           nuslerp_flatten: false
+           nuslerp_row_wise: false
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [24, 32]
+         parameters:
+           weight: 0.50
+           nuslerp_flatten: false
+           nuslerp_row_wise: false
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [32, 40]
+         parameters:
+           weight: 0.45
+           nuslerp_flatten: true
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [32, 40]
+         parameters:
+           weight: 0.55
+           nuslerp_flatten: true
+   - sources:
+       - model: /Users/sthenno/models/tempesthenno--converge-dtask
+         layer_range: [40, 48]
+         parameters:
+           weight: 0.40
+           nuslerp_flatten: true
+       - model: /Users/sthenno/models/tempesthenno--converge-breadcrumbs
+         layer_range: [40, 48]
+         parameters:
+           weight: 0.60
+           nuslerp_flatten: true
+
+ ```
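
Note on the configuration above (not part of the commit): the slice schedule can be sanity-checked with a few lines of Python. The sketch below is illustrative only. The `SLICES` table is copied by hand from the YAML, and `check_schedule` and `slerp` are hypothetical helpers, not mergekit functions. It assumes that the two per-slice weights are normalized into a single interpolation fraction `t = w_breadcrumbs / (w_dtask + w_breadcrumbs)`, and it uses a textbook SLERP on unit-normalized, non-zero vectors rather than mergekit's actual NuSLERP implementation.

```python
import math

# (layer_range, weight for converge-dtask, weight for converge-breadcrumbs),
# copied by hand from the YAML configuration in the diff.
SLICES = [
    ((0, 8), 0.65, 0.35),
    ((8, 16), 0.60, 0.40),
    ((16, 24), 0.55, 0.45),
    ((24, 32), 0.50, 0.50),
    ((32, 40), 0.45, 0.55),
    ((40, 48), 0.40, 0.60),
]


def check_schedule(slices):
    """Return the total layer count if the ranges tile [0, N) without gaps
    or overlaps and every weight pair sums to 1; raise ValueError otherwise."""
    expected_start = 0
    for (start, end), w_a, w_b in slices:
        if start != expected_start or end <= start:
            raise ValueError(f"gap or overlap at layers [{start}, {end})")
        if not math.isclose(w_a + w_b, 1.0, abs_tol=1e-9):
            raise ValueError(f"weights for [{start}, {end}) do not sum to 1")
        expected_start = end
    return expected_start


def slerp(t, v0, v1, eps=1e-8):
    """Textbook spherical interpolation between the directions of two
    non-zero vectors (plain lists). Assumption: this is a generic SLERP,
    not mergekit's code."""
    n0 = math.sqrt(sum(x * x for x in v0))
    n1 = math.sqrt(sum(x * x for x in v1))
    u0 = [x / n0 for x in v0]
    u1 = [x / n1 for x in v1]
    dot = max(-1.0, min(1.0, sum(a * b for a, b in zip(u0, u1))))
    theta = math.acos(dot)
    sin_theta = math.sin(theta)
    if sin_theta < eps:
        # Directions are (anti)parallel: fall back to linear interpolation.
        return [(1 - t) * a + t * b for a, b in zip(u0, u1)]
    s0 = math.sin((1 - t) * theta) / sin_theta
    s1 = math.sin(t * theta) / sin_theta
    return [s0 * a + s1 * b for a, b in zip(u0, u1)]


n_layers = check_schedule(SLICES)
# Assumed interpolation fraction toward converge-breadcrumbs for each slice.
fractions = [w_b / (w_a + w_b) for _, w_a, w_b in SLICES]
```

Under these assumptions the ranges cover 48 layers, and the fraction moves in steps of 0.05 from 0.35 in the first slice to 0.60 in the last, so early layers lean toward `converge-dtask` and late layers toward `converge-breadcrumbs`.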