Commit b241e04 by Masterjp123 (parent: 50933ce): Upload README.md

Files changed (1): README.md (+60 −1)
---
base_model:
- Riiid/sheep-duck-llama-2-13b
- IkariDev/Athena-v4
- TheBloke/Llama-2-13B-fp16
- KoboldAI/LLaMA2-13B-Psyfighter2
tags:
- mergekit
- merge
---

# merged

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged with the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, using [TheBloke/Llama-2-13B-fp16](https://huggingface.co/TheBloke/Llama-2-13B-fp16) as the base model.
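
Task arithmetic builds the merged checkpoint by adding weighted task vectors (each fine-tuned model's parameter delta relative to the shared base) onto the base weights. Below is a minimal sketch of the idea in plain PyTorch, assuming all checkpoints share identical tensor names and shapes; it illustrates the arithmetic only, not mergekit's actual implementation:

```python
import torch

def task_arithmetic_merge(
    base: dict[str, torch.Tensor],
    finetuned: list[dict[str, torch.Tensor]],
    weights: list[float],
) -> dict[str, torch.Tensor]:
    """Per tensor: merged = base + sum_i weight_i * (model_i - base)."""
    merged = {}
    for name, base_t in base.items():
        # Each task vector is a fine-tune's delta from the shared base,
        # scaled by its merge weight.
        delta = sum(w * (ft[name] - base_t) for ft, w in zip(finetuned, weights))
        merged[name] = base_t + delta
    return merged
```

With the weights from the configuration below (1.0 for Psyfighter2, 0.45 for sheep-duck-llama, 0.33 for Athena-v4), Psyfighter2's delta is applied in full while the other two contribute scaled-down fractions of theirs. Note that the base model also appears among the sources in the configuration; its task vector (base minus base) is zero by construction, so that entry effectively just supplies the base tensors for the layer slice.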

### Models Merged

The following models were included in the merge:
* [Riiid/sheep-duck-llama-2-13b](https://huggingface.co/Riiid/sheep-duck-llama-2-13b)
* [IkariDev/Athena-v4](https://huggingface.co/IkariDev/Athena-v4)
* [KoboldAI/LLaMA2-13B-Psyfighter2](https://huggingface.co/KoboldAI/LLaMA2-13B-Psyfighter2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model:
  model:
    path: TheBloke/Llama-2-13B-fp16
dtype: bfloat16
merge_method: task_arithmetic
slices:
- sources:
  - layer_range: [0, 40]
    model:
      model:
        path: TheBloke/Llama-2-13B-fp16
  - layer_range: [0, 40]
    model:
      model:
        path: KoboldAI/LLaMA2-13B-Psyfighter2
    parameters:
      weight: 1.0
  - layer_range: [0, 40]
    model:
      model:
        path: Riiid/sheep-duck-llama-2-13b
    parameters:
      weight: 0.45
  - layer_range: [0, 40]
    model:
      model:
        path: IkariDev/Athena-v4
    parameters:
      weight: 0.33
```
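
To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's `mergekit-yaml` entry point (for example, `mergekit-yaml config.yml ./merged`). The output is a standard Llama-2 checkpoint and loads through `transformers` as usual. A brief sketch follows; the repository id is a placeholder, since the hosted location is not stated in this card:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "Masterjp123/merged"  # placeholder: substitute the actual repository id

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.bfloat16,  # matches the dtype used for the merge
    device_map="auto",
)

prompt = "Write a short scene between a knight and a dragon."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```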