Solshine committed on
Commit 11ea481
1 Parent(s): 60593a2

Upload folder using huggingface_hub

This view is limited to 50 files because it contains too many changes.
Files changed (50)
  1. README.md +505 -0
  2. added_tokens.json +24 -0
  3. config.json +29 -0
  4. mergekit_config.yml +474 -0
  5. merges.txt +0 -0
  6. model-00001-of-00059.safetensors +3 -0
  7. model-00002-of-00059.safetensors +3 -0
  8. model-00003-of-00059.safetensors +3 -0
  9. model-00004-of-00059.safetensors +3 -0
  10. model-00005-of-00059.safetensors +3 -0
  11. model-00006-of-00059.safetensors +3 -0
  12. model-00007-of-00059.safetensors +3 -0
  13. model-00008-of-00059.safetensors +3 -0
  14. model-00009-of-00059.safetensors +3 -0
  15. model-00010-of-00059.safetensors +3 -0
  16. model-00011-of-00059.safetensors +3 -0
  17. model-00012-of-00059.safetensors +3 -0
  18. model-00013-of-00059.safetensors +3 -0
  19. model-00014-of-00059.safetensors +3 -0
  20. model-00015-of-00059.safetensors +3 -0
  21. model-00016-of-00059.safetensors +3 -0
  22. model-00017-of-00059.safetensors +3 -0
  23. model-00018-of-00059.safetensors +3 -0
  24. model-00019-of-00059.safetensors +3 -0
  25. model-00020-of-00059.safetensors +3 -0
  26. model-00021-of-00059.safetensors +3 -0
  27. model-00022-of-00059.safetensors +3 -0
  28. model-00023-of-00059.safetensors +3 -0
  29. model-00024-of-00059.safetensors +3 -0
  30. model-00025-of-00059.safetensors +3 -0
  31. model-00026-of-00059.safetensors +3 -0
  32. model-00027-of-00059.safetensors +3 -0
  33. model-00028-of-00059.safetensors +3 -0
  34. model-00029-of-00059.safetensors +3 -0
  35. model-00030-of-00059.safetensors +3 -0
  36. model-00031-of-00059.safetensors +3 -0
  37. model-00032-of-00059.safetensors +3 -0
  38. model-00033-of-00059.safetensors +3 -0
  39. model-00034-of-00059.safetensors +3 -0
  40. model-00035-of-00059.safetensors +3 -0
  41. model-00036-of-00059.safetensors +3 -0
  42. model-00037-of-00059.safetensors +3 -0
  43. model-00038-of-00059.safetensors +3 -0
  44. model-00039-of-00059.safetensors +3 -0
  45. model-00040-of-00059.safetensors +3 -0
  46. model-00041-of-00059.safetensors +3 -0
  47. model-00042-of-00059.safetensors +3 -0
  48. model-00043-of-00059.safetensors +3 -0
  49. model-00044-of-00059.safetensors +3 -0
  50. model-00045-of-00059.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,505 @@
+ ---
+ base_model:
+ - Qwen/Qwen2.5-Math-72B
+ - Qwen/Qwen2.5-72B-Instruct
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+
+ ---
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the passthrough merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [Qwen/Qwen2.5-Math-72B](https://huggingface.co/Qwen/Qwen2.5-Math-72B)
+ * [Qwen/Qwen2.5-72B-Instruct](https://huggingface.co/Qwen/Qwen2.5-72B-Instruct)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ slices:
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [0, 1]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [0, 1]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [1, 2]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [1, 2]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [2, 3]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [2, 3]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [3, 4]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [3, 4]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [4, 5]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [4, 5]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [5, 6]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [5, 6]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [6, 7]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [6, 7]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [7, 8]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [7, 8]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [8, 9]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [8, 9]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [9, 10]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [9, 10]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [10, 11]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [10, 11]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [11, 12]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [11, 12]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [12, 13]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [12, 13]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [13, 14]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [13, 14]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [14, 15]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [14, 15]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [15, 16]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [15, 16]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [16, 17]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [16, 17]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [17, 18]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [17, 18]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [18, 19]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [18, 19]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [19, 20]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [19, 20]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [20, 21]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [20, 21]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [21, 22]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [21, 22]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [22, 23]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [22, 23]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [23, 24]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [23, 24]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [24, 25]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [24, 25]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [25, 26]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [25, 26]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [26, 27]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [26, 27]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [27, 28]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [27, 28]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [28, 29]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [28, 29]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [29, 30]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [29, 30]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [30, 31]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [30, 31]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [31, 32]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [31, 32]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [32, 33]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [32, 33]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [33, 34]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [33, 34]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [34, 35]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [34, 35]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [35, 36]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [35, 36]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [36, 37]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [36, 37]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [37, 38]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [37, 38]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [38, 39]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [38, 39]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [39, 40]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [39, 40]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [40, 41]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [40, 41]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [41, 42]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [41, 42]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [42, 43]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [42, 43]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [43, 44]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [43, 44]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [44, 45]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [44, 45]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [45, 46]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [45, 46]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [46, 47]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [46, 47]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [47, 48]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [47, 48]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [48, 49]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [48, 49]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [49, 50]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [49, 50]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [50, 51]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [50, 51]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [51, 52]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [51, 52]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [52, 53]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [52, 53]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [53, 54]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [53, 54]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [54, 55]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [54, 55]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [55, 56]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [55, 56]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [56, 57]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [56, 57]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [57, 58]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [57, 58]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [58, 59]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [58, 59]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [59, 60]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [59, 60]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [60, 61]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [60, 61]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [61, 62]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [61, 62]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [62, 63]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [62, 63]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [63, 64]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [63, 64]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [64, 65]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [64, 65]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [65, 66]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [65, 66]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [66, 67]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [66, 67]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [67, 68]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [67, 68]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [68, 69]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [68, 69]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [69, 70]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [69, 70]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [70, 71]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [70, 71]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [71, 72]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [71, 72]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [72, 73]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [72, 73]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [73, 74]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [73, 74]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [74, 75]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [74, 75]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [75, 76]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [75, 76]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [76, 77]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [76, 77]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [77, 78]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [77, 78]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [77, 80]
+ merge_method: passthrough
+ dtype: float16
+ ```
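The configuration above follows a regular pattern: for each layer index from 0 to 77 there is one single-layer slice from Qwen/Qwen2.5-Math-72B followed by the same layer from Qwen/Qwen2.5-72B-Instruct, and the file ends with one Instruct slice `[77, 80]`. A minimal sketch of how such a file could be generated instead of written by hand is shown here. The helper name `build_config` and the YAML indentation it emits are illustrative assumptions; nothing in the commit says the author produced the file this way.

```python
MATH = "Qwen/Qwen2.5-Math-72B"
INSTRUCT = "Qwen/Qwen2.5-72B-Instruct"


def build_config(interleaved_layers: int = 78) -> str:
    """Render the interleaved passthrough config as YAML text (hypothetical helper)."""
    lines = ["slices:"]

    def add_slice(model: str, start: int, end: int) -> None:
        # One slice with a single source, as in the commit's config.
        lines.append("- sources:")
        lines.append(f"  - model: {model}")
        lines.append(f"    layer_range: [{start}, {end}]")

    for i in range(interleaved_layers):
        add_slice(MATH, i, i + 1)
        add_slice(INSTRUCT, i, i + 1)

    # Trailing slice copied as-is from the commit; note it starts at layer 77 again.
    add_slice(INSTRUCT, 77, 80)

    lines.append("merge_method: passthrough")
    lines.append("dtype: float16")
    return "\n".join(lines) + "\n"


config_text = build_config()
```

Writing `config_text` to a file would give a config with 157 slices, the same number of slices as the one in this commit.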
added_tokens.json ADDED
@@ -0,0 +1,24 @@
+ {
+   "</tool_call>": 151658,
+   "<tool_call>": 151657,
+   "<|box_end|>": 151649,
+   "<|box_start|>": 151648,
+   "<|endoftext|>": 151643,
+   "<|file_sep|>": 151664,
+   "<|fim_middle|>": 151660,
+   "<|fim_pad|>": 151662,
+   "<|fim_prefix|>": 151659,
+   "<|fim_suffix|>": 151661,
+   "<|im_end|>": 151645,
+   "<|im_start|>": 151644,
+   "<|image_pad|>": 151655,
+   "<|object_ref_end|>": 151647,
+   "<|object_ref_start|>": 151646,
+   "<|quad_end|>": 151651,
+   "<|quad_start|>": 151650,
+   "<|repo_name|>": 151663,
+   "<|video_pad|>": 151656,
+   "<|vision_end|>": 151653,
+   "<|vision_pad|>": 151654,
+   "<|vision_start|>": 151652
+ }
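The file above maps each added special token to its ID. As a small illustration, the sketch below inverts that mapping and checks that the IDs form one contiguous block. The variable names (`added_tokens`, `id_to_token`, `is_contiguous`) are made up for this example, and the dictionary is copied by hand from the diff rather than loaded from the repository.

```python
# Token-to-ID mapping copied from added_tokens.json in this commit.
added_tokens = {
    "</tool_call>": 151658,
    "<tool_call>": 151657,
    "<|box_end|>": 151649,
    "<|box_start|>": 151648,
    "<|endoftext|>": 151643,
    "<|file_sep|>": 151664,
    "<|fim_middle|>": 151660,
    "<|fim_pad|>": 151662,
    "<|fim_prefix|>": 151659,
    "<|fim_suffix|>": 151661,
    "<|im_end|>": 151645,
    "<|im_start|>": 151644,
    "<|image_pad|>": 151655,
    "<|object_ref_end|>": 151647,
    "<|object_ref_start|>": 151646,
    "<|quad_end|>": 151651,
    "<|quad_start|>": 151650,
    "<|repo_name|>": 151663,
    "<|video_pad|>": 151656,
    "<|vision_end|>": 151653,
    "<|vision_pad|>": 151654,
    "<|vision_start|>": 151652,
}

# Reverse lookup: ID -> token string.
id_to_token = {token_id: token for token, token_id in added_tokens.items()}

# The IDs are contiguous if sorting them yields an unbroken integer range.
ids = sorted(added_tokens.values())
is_contiguous = ids == list(range(ids[0], ids[-1] + 1))
```

Here the 22 entries cover IDs 151643 through 151664 without gaps, which is below the `vocab_size` of 152064 listed in config.json.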
config.json ADDED
@@ -0,0 +1,29 @@
+ {
+   "_name_or_path": "Qwen/Qwen2.5-Math-72B",
+   "architectures": [
+     "Qwen2ForCausalLM"
+   ],
+   "attention_dropout": 0.0,
+   "bos_token_id": 151643,
+   "eos_token_id": 151643,
+   "hidden_act": "silu",
+   "hidden_size": 8192,
+   "initializer_range": 0.02,
+   "intermediate_size": 29568,
+   "max_position_embeddings": 4096,
+   "max_window_layers": 28,
+   "model_type": "qwen2",
+   "num_attention_heads": 64,
+   "num_hidden_layers": 159,
+   "num_key_value_heads": 8,
+   "rms_norm_eps": 1e-05,
+   "rope_theta": 10000,
+   "sliding_window": null,
+   "tie_word_embeddings": false,
+   "torch_dtype": "float16",
+   "transformers_version": "4.44.1",
+   "use_cache": true,
+   "use_mrope": false,
+   "use_sliding_window": false,
+   "vocab_size": 152064
+ }
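The values in config.json are enough for a rough estimate of the merged model's size. The sketch below is a back-of-the-envelope calculation, not an exact count: it assumes the usual Qwen2 decoder layout (query, key and value projections with bias, an output projection without bias, a gated MLP with three bias-free projections, and two RMSNorm weight vectors per layer). The function name `estimate_parameters` is hypothetical.

```python
# Relevant fields copied from config.json in this commit.
config = {
    "hidden_size": 8192,
    "intermediate_size": 29568,
    "num_attention_heads": 64,
    "num_hidden_layers": 159,
    "num_key_value_heads": 8,
    "vocab_size": 152064,
    "tie_word_embeddings": False,
}


def estimate_parameters(cfg: dict) -> int:
    """Estimate the parameter count under the assumed Qwen2 layer layout."""
    h = cfg["hidden_size"]
    head_dim = h // cfg["num_attention_heads"]
    kv_dim = head_dim * cfg["num_key_value_heads"]  # grouped-query attention

    # Assumption: q/k/v projections carry a bias, the output projection does not.
    attention = (h * h + h) + 2 * (h * kv_dim + kv_dim) + h * h
    # Assumption: gate, up and down projections, all without bias.
    mlp = 3 * h * cfg["intermediate_size"]
    norms = 2 * h  # two RMSNorm weight vectors per layer
    per_layer = attention + mlp + norms

    embeddings = cfg["vocab_size"] * h
    if not cfg["tie_word_embeddings"]:
        embeddings *= 2  # separate input embedding and output head

    # Final "+ h" accounts for the last RMSNorm before the output head.
    return cfg["num_hidden_layers"] * per_layer + embeddings + h


total_params = estimate_parameters(config)
fp16_bytes = 2 * total_params  # two bytes per float16 weight
```

Under these assumptions the estimate is about 142 billion parameters, or roughly 284 GB of float16 weights, which is in the same range as 59 shards of just under 5 GB each.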
mergekit_config.yml ADDED
@@ -0,0 +1,474 @@
1
+ slices:
2
+ - sources:
3
+ - model: Qwen/Qwen2.5-Math-72B
4
+ layer_range: [0, 1]
5
+ - sources:
6
+ - model: Qwen/Qwen2.5-72B-Instruct
7
+ layer_range: [0, 1]
8
+ - sources:
9
+ - model: Qwen/Qwen2.5-Math-72B
10
+ layer_range: [1, 2]
11
+ - sources:
12
+ - model: Qwen/Qwen2.5-72B-Instruct
13
+ layer_range: [1, 2]
14
+ - sources:
15
+ - model: Qwen/Qwen2.5-Math-72B
16
+ layer_range: [2, 3]
17
+ - sources:
18
+ - model: Qwen/Qwen2.5-72B-Instruct
19
+ layer_range: [2, 3]
20
+ - sources:
21
+ - model: Qwen/Qwen2.5-Math-72B
22
+ layer_range: [3, 4]
23
+ - sources:
24
+ - model: Qwen/Qwen2.5-72B-Instruct
25
+ layer_range: [3, 4]
26
+ - sources:
27
+ - model: Qwen/Qwen2.5-Math-72B
28
+ layer_range: [4, 5]
29
+ - sources:
30
+ - model: Qwen/Qwen2.5-72B-Instruct
31
+ layer_range: [4, 5]
32
+ - sources:
33
+ - model: Qwen/Qwen2.5-Math-72B
34
+ layer_range: [5, 6]
35
+ - sources:
36
+ - model: Qwen/Qwen2.5-72B-Instruct
37
+ layer_range: [5, 6]
38
+ - sources:
39
+ - model: Qwen/Qwen2.5-Math-72B
40
+ layer_range: [6, 7]
41
+ - sources:
42
+ - model: Qwen/Qwen2.5-72B-Instruct
43
+ layer_range: [6, 7]
44
+ - sources:
45
+ - model: Qwen/Qwen2.5-Math-72B
46
+ layer_range: [7, 8]
47
+ - sources:
48
+ - model: Qwen/Qwen2.5-72B-Instruct
49
+ layer_range: [7, 8]
50
+ - sources:
51
+ - model: Qwen/Qwen2.5-Math-72B
52
+ layer_range: [8, 9]
53
+ - sources:
54
+ - model: Qwen/Qwen2.5-72B-Instruct
55
+ layer_range: [8, 9]
56
+ - sources:
57
+ - model: Qwen/Qwen2.5-Math-72B
58
+ layer_range: [9, 10]
59
+ - sources:
60
+ - model: Qwen/Qwen2.5-72B-Instruct
61
+ layer_range: [9, 10]
62
+ - sources:
63
+ - model: Qwen/Qwen2.5-Math-72B
64
+ layer_range: [10, 11]
65
+ - sources:
66
+ - model: Qwen/Qwen2.5-72B-Instruct
67
+ layer_range: [10, 11]
68
+ - sources:
69
+ - model: Qwen/Qwen2.5-Math-72B
70
+ layer_range: [11, 12]
71
+ - sources:
72
+ - model: Qwen/Qwen2.5-72B-Instruct
73
+ layer_range: [11, 12]
74
+ - sources:
75
+ - model: Qwen/Qwen2.5-Math-72B
76
+ layer_range: [12, 13]
77
+ - sources:
78
+ - model: Qwen/Qwen2.5-72B-Instruct
79
+ layer_range: [12, 13]
80
+ - sources:
81
+ - model: Qwen/Qwen2.5-Math-72B
82
+ layer_range: [13, 14]
83
+ - sources:
84
+ - model: Qwen/Qwen2.5-72B-Instruct
85
+ layer_range: [13, 14]
86
+ - sources:
87
+ - model: Qwen/Qwen2.5-Math-72B
88
+ layer_range: [14, 15]
89
+ - sources:
90
+ - model: Qwen/Qwen2.5-72B-Instruct
91
+ layer_range: [14, 15]
92
+ - sources:
93
+ - model: Qwen/Qwen2.5-Math-72B
94
+ layer_range: [15, 16]
95
+ - sources:
96
+ - model: Qwen/Qwen2.5-72B-Instruct
97
+ layer_range: [15, 16]
98
+ - sources:
99
+ - model: Qwen/Qwen2.5-Math-72B
100
+ layer_range: [16, 17]
101
+ - sources:
102
+ - model: Qwen/Qwen2.5-72B-Instruct
103
+ layer_range: [16, 17]
104
+ - sources:
105
+ - model: Qwen/Qwen2.5-Math-72B
106
+ layer_range: [17, 18]
107
+ - sources:
108
+ - model: Qwen/Qwen2.5-72B-Instruct
109
+ layer_range: [17, 18]
110
+ - sources:
111
+ - model: Qwen/Qwen2.5-Math-72B
112
+ layer_range: [18, 19]
113
+ - sources:
114
+ - model: Qwen/Qwen2.5-72B-Instruct
115
+ layer_range: [18, 19]
116
+ - sources:
117
+ - model: Qwen/Qwen2.5-Math-72B
118
+ layer_range: [19, 20]
119
+ - sources:
120
+ - model: Qwen/Qwen2.5-72B-Instruct
121
+ layer_range: [19, 20]
122
+ - sources:
123
+ - model: Qwen/Qwen2.5-Math-72B
124
+ layer_range: [20, 21]
125
+ - sources:
126
+ - model: Qwen/Qwen2.5-72B-Instruct
127
+ layer_range: [20, 21]
128
+ - sources:
129
+ - model: Qwen/Qwen2.5-Math-72B
130
+ layer_range: [21, 22]
131
+ - sources:
132
+ - model: Qwen/Qwen2.5-72B-Instruct
133
+ layer_range: [21, 22]
134
+ - sources:
135
+ - model: Qwen/Qwen2.5-Math-72B
136
+ layer_range: [22, 23]
137
+ - sources:
138
+ - model: Qwen/Qwen2.5-72B-Instruct
139
+ layer_range: [22, 23]
140
+ - sources:
141
+ - model: Qwen/Qwen2.5-Math-72B
142
+ layer_range: [23, 24]
143
+ - sources:
144
+ - model: Qwen/Qwen2.5-72B-Instruct
145
+ layer_range: [23, 24]
146
+ - sources:
147
+ - model: Qwen/Qwen2.5-Math-72B
148
+ layer_range: [24, 25]
149
+ - sources:
150
+ - model: Qwen/Qwen2.5-72B-Instruct
151
+ layer_range: [24, 25]
152
+ - sources:
153
+ - model: Qwen/Qwen2.5-Math-72B
154
+ layer_range: [25, 26]
155
+ - sources:
156
+ - model: Qwen/Qwen2.5-72B-Instruct
157
+ layer_range: [25, 26]
158
+ - sources:
159
+ - model: Qwen/Qwen2.5-Math-72B
160
+ layer_range: [26, 27]
161
+ - sources:
162
+ - model: Qwen/Qwen2.5-72B-Instruct
163
+ layer_range: [26, 27]
164
+ - sources:
165
+ - model: Qwen/Qwen2.5-Math-72B
166
+ layer_range: [27, 28]
167
+ - sources:
168
+ - model: Qwen/Qwen2.5-72B-Instruct
169
+ layer_range: [27, 28]
170
+ - sources:
171
+ - model: Qwen/Qwen2.5-Math-72B
172
+ layer_range: [28, 29]
173
+ - sources:
174
+ - model: Qwen/Qwen2.5-72B-Instruct
175
+ layer_range: [28, 29]
176
+ - sources:
177
+ - model: Qwen/Qwen2.5-Math-72B
178
+ layer_range: [29, 30]
179
+ - sources:
180
+ - model: Qwen/Qwen2.5-72B-Instruct
181
+ layer_range: [29, 30]
182
+ - sources:
183
+ - model: Qwen/Qwen2.5-Math-72B
184
+ layer_range: [30, 31]
185
+ - sources:
186
+ - model: Qwen/Qwen2.5-72B-Instruct
187
+ layer_range: [30, 31]
188
+ - sources:
189
+ - model: Qwen/Qwen2.5-Math-72B
190
+ layer_range: [31, 32]
191
+ - sources:
192
+ - model: Qwen/Qwen2.5-72B-Instruct
193
+ layer_range: [31, 32]
194
+ - sources:
195
+ - model: Qwen/Qwen2.5-Math-72B
196
+ layer_range: [32, 33]
197
+ - sources:
198
+ - model: Qwen/Qwen2.5-72B-Instruct
199
+ layer_range: [32, 33]
200
+ - sources:
201
+ - model: Qwen/Qwen2.5-Math-72B
202
+ layer_range: [33, 34]
203
+ - sources:
204
+ - model: Qwen/Qwen2.5-72B-Instruct
205
+ layer_range: [33, 34]
206
+ - sources:
207
+ - model: Qwen/Qwen2.5-Math-72B
208
+ layer_range: [34, 35]
209
+ - sources:
210
+ - model: Qwen/Qwen2.5-72B-Instruct
211
+ layer_range: [34, 35]
212
+ - sources:
213
+ - model: Qwen/Qwen2.5-Math-72B
214
+ layer_range: [35, 36]
215
+ - sources:
216
+ - model: Qwen/Qwen2.5-72B-Instruct
217
+ layer_range: [35, 36]
218
+ - sources:
219
+ - model: Qwen/Qwen2.5-Math-72B
220
+ layer_range: [36, 37]
221
+ - sources:
222
+ - model: Qwen/Qwen2.5-72B-Instruct
223
+ layer_range: [36, 37]
224
+ - sources:
225
+ - model: Qwen/Qwen2.5-Math-72B
226
+ layer_range: [37, 38]
227
+ - sources:
228
+ - model: Qwen/Qwen2.5-72B-Instruct
229
+ layer_range: [37, 38]
230
+ - sources:
231
+ - model: Qwen/Qwen2.5-Math-72B
232
+ layer_range: [38, 39]
233
+ - sources:
234
+ - model: Qwen/Qwen2.5-72B-Instruct
235
+ layer_range: [38, 39]
236
+ - sources:
237
+ - model: Qwen/Qwen2.5-Math-72B
238
+ layer_range: [39, 40]
239
+ - sources:
240
+ - model: Qwen/Qwen2.5-72B-Instruct
241
+ layer_range: [39, 40]
242
+ - sources:
243
+ - model: Qwen/Qwen2.5-Math-72B
244
+ layer_range: [40, 41]
245
+ - sources:
246
+ - model: Qwen/Qwen2.5-72B-Instruct
247
+ layer_range: [40, 41]
248
+ - sources:
249
+ - model: Qwen/Qwen2.5-Math-72B
250
+ layer_range: [41, 42]
251
+ - sources:
252
+ - model: Qwen/Qwen2.5-72B-Instruct
253
+ layer_range: [41, 42]
254
+ - sources:
255
+ - model: Qwen/Qwen2.5-Math-72B
256
+ layer_range: [42, 43]
257
+ - sources:
258
+ - model: Qwen/Qwen2.5-72B-Instruct
259
+ layer_range: [42, 43]
260
+ - sources:
261
+ - model: Qwen/Qwen2.5-Math-72B
262
+ layer_range: [43, 44]
263
+ - sources:
264
+ - model: Qwen/Qwen2.5-72B-Instruct
265
+ layer_range: [43, 44]
266
+ - sources:
267
+ - model: Qwen/Qwen2.5-Math-72B
268
+ layer_range: [44, 45]
269
+ - sources:
270
+ - model: Qwen/Qwen2.5-72B-Instruct
271
+ layer_range: [44, 45]
272
+ - sources:
273
+ - model: Qwen/Qwen2.5-Math-72B
274
+ layer_range: [45, 46]
275
+ - sources:
276
+ - model: Qwen/Qwen2.5-72B-Instruct
277
+ layer_range: [45, 46]
278
+ - sources:
279
+ - model: Qwen/Qwen2.5-Math-72B
280
+ layer_range: [46, 47]
281
+ - sources:
282
+ - model: Qwen/Qwen2.5-72B-Instruct
283
+ layer_range: [46, 47]
284
+ - sources:
285
+ - model: Qwen/Qwen2.5-Math-72B
286
+ layer_range: [47, 48]
287
+ - sources:
288
+ - model: Qwen/Qwen2.5-72B-Instruct
289
+ layer_range: [47, 48]
290
+ - sources:
291
+ - model: Qwen/Qwen2.5-Math-72B
292
+ layer_range: [48, 49]
293
+ - sources:
294
+ - model: Qwen/Qwen2.5-72B-Instruct
295
+ layer_range: [48, 49]
296
+ - sources:
297
+ - model: Qwen/Qwen2.5-Math-72B
298
+ layer_range: [49, 50]
299
+ - sources:
300
+ - model: Qwen/Qwen2.5-72B-Instruct
301
+ layer_range: [49, 50]
302
+ - sources:
303
+ - model: Qwen/Qwen2.5-Math-72B
304
+ layer_range: [50, 51]
305
+ - sources:
306
+ - model: Qwen/Qwen2.5-72B-Instruct
307
+ layer_range: [50, 51]
308
+ - sources:
309
+ - model: Qwen/Qwen2.5-Math-72B
310
+ layer_range: [51, 52]
311
+ - sources:
312
+ - model: Qwen/Qwen2.5-72B-Instruct
313
+ layer_range: [51, 52]
314
+ - sources:
315
+ - model: Qwen/Qwen2.5-Math-72B
316
+ layer_range: [52, 53]
317
+ - sources:
318
+ - model: Qwen/Qwen2.5-72B-Instruct
319
+ layer_range: [52, 53]
320
+ - sources:
321
+ - model: Qwen/Qwen2.5-Math-72B
322
+ layer_range: [53, 54]
323
+ - sources:
324
+ - model: Qwen/Qwen2.5-72B-Instruct
325
+ layer_range: [53, 54]
326
+ - sources:
327
+ - model: Qwen/Qwen2.5-Math-72B
328
+ layer_range: [54, 55]
329
+ - sources:
330
+ - model: Qwen/Qwen2.5-72B-Instruct
331
+ layer_range: [54, 55]
332
+ - sources:
333
+ - model: Qwen/Qwen2.5-Math-72B
334
+ layer_range: [55, 56]
335
+ - sources:
336
+ - model: Qwen/Qwen2.5-72B-Instruct
337
+ layer_range: [55, 56]
338
+ - sources:
339
+ - model: Qwen/Qwen2.5-Math-72B
340
+ layer_range: [56, 57]
341
+ - sources:
342
+ - model: Qwen/Qwen2.5-72B-Instruct
343
+ layer_range: [56, 57]
344
+ - sources:
345
+ - model: Qwen/Qwen2.5-Math-72B
346
+ layer_range: [57, 58]
347
+ - sources:
348
+ - model: Qwen/Qwen2.5-72B-Instruct
349
+ layer_range: [57, 58]
350
+ - sources:
351
+ - model: Qwen/Qwen2.5-Math-72B
352
+ layer_range: [58, 59]
353
+ - sources:
354
+ - model: Qwen/Qwen2.5-72B-Instruct
355
+ layer_range: [58, 59]
356
+ - sources:
357
+ - model: Qwen/Qwen2.5-Math-72B
358
+ layer_range: [59, 60]
359
+ - sources:
360
+ - model: Qwen/Qwen2.5-72B-Instruct
361
+ layer_range: [59, 60]
362
+ - sources:
363
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [60, 61]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [60, 61]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [61, 62]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [61, 62]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [62, 63]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [62, 63]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [63, 64]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [63, 64]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [64, 65]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [64, 65]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [65, 66]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [65, 66]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [66, 67]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [66, 67]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [67, 68]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [67, 68]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [68, 69]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [68, 69]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [69, 70]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [69, 70]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [70, 71]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [70, 71]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [71, 72]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [71, 72]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [72, 73]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [72, 73]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [73, 74]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [73, 74]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [74, 75]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [74, 75]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [75, 76]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [75, 76]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [76, 77]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [76, 77]
+ - sources:
+   - model: Qwen/Qwen2.5-Math-72B
+     layer_range: [77, 78]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [77, 78]
+ - sources:
+   - model: Qwen/Qwen2.5-72B-Instruct
+     layer_range: [77, 80]
+ merge_method: passthrough
+ dtype: float16
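The slice list above follows a regular pattern: for each layer index, one single-layer slice from Qwen/Qwen2.5-Math-72B followed by the same layer from Qwen/Qwen2.5-72B-Instruct, then a closing Instruct slice. A minimal Python sketch of how such a list could be generated rather than written by hand is given below. The helper names `interleaved_slices` and `to_yaml` are hypothetical, the start layer 60 reflects only the portion of the config visible in this part of the diff, and the final `[77, 80]` range is copied verbatim from the config.

```python
# Hypothetical helpers: a sketch for regenerating the visible part of the
# passthrough config, not the tooling the author actually used.
MATH = "Qwen/Qwen2.5-Math-72B"
INSTRUCT = "Qwen/Qwen2.5-72B-Instruct"


def interleaved_slices(models, start, stop):
    """Return one single-layer slice per (layer, model) pair, layer by layer."""
    slices = []
    for layer in range(start, stop):
        for model in models:
            slices.append(
                {"sources": [{"model": model, "layer_range": [layer, layer + 1]}]}
            )
    return slices


def to_yaml(config):
    """Render the config dict as YAML text using only the standard library."""
    lines = ["slices:"]
    for entry in config["slices"]:
        lines.append("- sources:")
        for src in entry["sources"]:
            lo, hi = src["layer_range"]
            lines.append(f"  - model: {src['model']}")
            lines.append(f"    layer_range: [{lo}, {hi}]")
    lines.append(f"merge_method: {config['merge_method']}")
    lines.append(f"dtype: {config['dtype']}")
    return "\n".join(lines)


config = {
    # Layers 60..77 interleaved, then the closing Instruct slice as in the diff.
    "slices": interleaved_slices([MATH, INSTRUCT], 60, 78)
    + [{"sources": [{"model": INSTRUCT, "layer_range": [77, 80]}]}],
    "merge_method": "passthrough",
    "dtype": "float16",
}

if __name__ == "__main__":
    print(to_yaml(config))
```

Generating the list this way keeps the layer indices consistent and makes the interleaving pattern explicit in one place.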
merges.txt ADDED
The diff for this file is too large to render.
model-00001-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d7f2fccee43cfba52f9fd27e22900b99cac1d7fda16d7632ff7dd2fcafcfd46e
+ size 4982866376
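Each `.safetensors` entry in this diff is a Git LFS pointer file with three fields: `version`, `oid` (hash algorithm and digest), and `size` in bytes. A minimal Python sketch of reading such a pointer and checking downloaded bytes against it follows. The function names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and stripping the leading `+` is an assumption made so that lines pasted from a diff view also parse.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the key/value lines of a Git LFS pointer into a dict."""
    fields = {}
    for raw in text.splitlines():
        line = raw.lstrip("+ ").strip()  # tolerate diff-style '+' markers
        if not line:
            continue
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(data, pointer):
    """Return True if the bytes have the size and digest the pointer records."""
    if len(data) != pointer["size"]:
        return False
    return hashlib.new(pointer["algo"], data).hexdigest() == pointer["digest"]


# Pointer for the first shard, copied from the hunk above.
example = """\
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d7f2fccee43cfba52f9fd27e22900b99cac1d7fda16d7632ff7dd2fcafcfd46e
+ size 4982866376
"""
pointer = parse_lfs_pointer(example)
```

For multi-gigabyte shards like these, hashing the file in chunks from disk would be preferable to loading it into memory; the in-memory check here is only meant to show the comparison.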
model-00002-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:457e3044bc9b727e33170a4c968db4c35839e3d7327b2ab6e282057226b02d53
+ size 4964068376
model-00003-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0fc86e9a4f6ed59abf26526ac1792ae58001025c696d61c169172ecabfcff95d
+ size 4997660360
model-00004-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8f0858a5e74c02287ea7f915226c71640bf47765b84d18421bee632399a1477e
+ size 4565680264
model-00005-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:3974a3ea9f63406a371ae34f0b47e8442e322cfacc0ffae4355f4895c46aae26
+ size 4964068400
model-00006-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b85be6df851bd35f06828fb68e41f7048454f1317095c4b57b221868233cedf7
+ size 4915904664
model-00007-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a68b63283e0def0bf732f942d3a5aa16869ed256020a0352f0f2ff41bc29f69c
+ size 4647435976
model-00008-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2d37766469556fa46a54f55b9efa37038e93d9c9e5815011a9aae2891abdd5ea
+ size 4964068392
model-00009-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1954210340588f28311de0ae3d48a8abf979192aba670167be038f7fe0f0bbaa
+ size 4599272248
model-00010-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:8cfeeb3c1c79a84bf3b4470da30d8098ef08587e4af26239df93922842c1fe8d
+ size 4964068392
model-00011-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:13a1d1ca7e85f429e294ba016f1d0c22426cc2c61bbee0ab64187ba1f637ba3f
+ size 4997660360
model-00012-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1656e3c14c3fb92a770455d6ff3a1fe6329ddd9c7361340753bba76263864d20
+ size 4565680264
model-00013-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:410c45fed3c072443bb254b1479e317e54ded45c3c7cc92d8d52923daed7de9b
+ size 4964068400
model-00014-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:43cb73a9ceb299ee7ea4bdfaa4d903e19d3540519199a6bf2d1649c51a7f1b9f
+ size 4915904664
model-00015-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:23e614264616ba47422bfb6455f3b5b3d84cfbb07aea015f8ea49cc24a6ad8f8
+ size 4647435976
model-00016-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4479dd0cbd7381fb8f699663be7cd046c5fd1d57e494a77b84af96d57c5a2129
+ size 4964068392
model-00017-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5c62c0a47a452ded148fe8da0dafdfcf8aacc611b93bf3f27aacb45cbf37c4b3
+ size 4599272248
model-00018-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e8e9110b4bbb53078fcdc91d3920ff018cb2e520071f219bb0f29f23dd51c1ad
+ size 4964068392
model-00019-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dfc26505690c1beee9be97671e50a272ea2413438ebca856a520c9cef723ec71
+ size 4997660360
model-00020-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d87bb919676824b2d0405d4e2f7acee2c756989ac9721c1e10a62736b41e1448
+ size 4565680264
model-00021-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d255382d546dd09a7bbfad52aeeff1378853282164d2eba4e9b249f916e00555
+ size 4964068400
model-00022-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b40c6a4292f8ed0f6bfba6a5f8306b4682083c024e4e70efdb39d88db47cdc65
+ size 4915904664
model-00023-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a6f8d7bf579d70739e67c2ac49b6913412ccfaf9d08318c094fd20d4c58f8180
+ size 4647435976
model-00024-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f338655cea419fba1eaa56482efa62c2dc88229214fc64613c56e157098d86dd
+ size 4964068392
model-00025-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e51113cdaa339293d37ca42b943f276140297bcc95ef88fcca19c3173e2ba509
+ size 4599272248
model-00026-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a2b74e2d76b25a82223ba76409f9f8df4e91b0470cabf23105b4169485b4756d
+ size 4964068392
model-00027-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b6e9485ea386f8fe951db7d36980ad76fa0e4aa537eef369d8302227eefca0bb
+ size 4997660360
model-00028-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f0bcc89e3718d1eb15c12bc76fed42832b5f5c01ad41b2f75422fd6ea8cf5280
+ size 4565680264
model-00029-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4572bd9811fb433656e0f5bcff21400a4bfc008fcf6269e383e96e185b9d78d1
+ size 4964068400
model-00030-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:e949b1e140370d6990b8ae6135f916666574763de90947c7f5c7b52db7adc671
+ size 4915904664
model-00031-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:edb678e7248d6fadec6602236312077c3579b97f0692c673efa24547512aa1b9
+ size 4647435976
model-00032-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:722ab773c521a9fc6a0a9a98e54a05f81b99b563dd9dfdea6a111028583238c8
+ size 4964068392
model-00033-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dd76c0bac751908481188194742ad8c26af5dd41f976dcef03d1ce98683dbc20
+ size 4599272248
model-00034-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:27690d3888b2202d2142a7c845bcf0bef0267514d2514665cd64343350cb0f72
+ size 4964068384
model-00035-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:2747b348b9d921970ae484386d9c6d4c6a0a56de4dac952fa63bafa7dcee7460
+ size 4997660392
model-00036-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:acc1a74f04e481f41a5e4a788a00154690f8359df8927a84f0eacef0463461de
+ size 4565680304
model-00037-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7b00596b4006b22f350a65bfab5ae50430c81ce8791fa959325057fb1e87a96c
+ size 4964068424
model-00038-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bc7753f74b2769bbd19a574ba9aee29e25c7c43a83a34577bbb426f21180bb27
+ size 4915904704
model-00039-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:db932534e798ad6e2fde7f69627c014fafe7ada14026660b2ee17ea2d790a170
+ size 4647436016
model-00040-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:02d920f05626f5dd5147cdeb41ef79747587f7e94ac2e1c058c795158bfa0581
+ size 4964068416
model-00041-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:6792c4f0dbccfbe8fb4a66751a4d0d20306c3e64a1e79abd7b20258b7dc248f9
+ size 4599272296
model-00042-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ffe843e3b60b63e4f4d6e0b43c3da8b0164ce1a9ee27dee12ca077353cee78b7
+ size 4964068424
model-00043-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:59b871e167c8ccea9f48d2d5e7a1742700e821b32cd1d50cd7a0fc7e2f1ad177
+ size 4997660392
model-00044-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:0023caaae0e8b5d4fe9e0b596095204df9ccceb3b897d6a34af16e9d96f16ff7
+ size 4565680304
model-00045-of-00059.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:dce68003487270aebbf37c3d9d6743e69c0664645d96ff91447893b3095d32bb
+ size 4964068424