grimjim committed
Commit
3d7cde1
1 Parent(s): e10c6a3

Update README.md

Files changed (1): README.md (+58, -59)
README.md CHANGED
@@ -1,59 +1,58 @@
- ---
- language:
- - en
- base_model:
- - grimjim/llama-3-merge-virt-req-8B
- - nbeerbower/llama-3-slerp-kraut-dragon-8B
- library_name: transformers
- tags:
- - meta
- - llama-3
- - pytorch
- - mergekit
- - merge
- license: llama3
- license_link: LICENSE
- pipeline_tag: text-generation
- ---
- # llama-3-merge-avalon-8B
-
- This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
-
- Lightly tested at temperature=1.0, minP=0.02 with provisional Llama 3 Instruct prompts.
-
- Built with Meta Llama 3.
-
- ## Merge Details
- ### Merge Method
-
- This model was merged using the SLERP merge method.
-
- ### Models Merged
-
- The following models were included in the merge:
- * [grimjim/llama-3-merge-virt-req-8B](https://huggingface.co/grimjim/llama-3-merge-virt-req-8B)
- * [nbeerbower/llama-3-slerp-kraut-dragon-8B](https://huggingface.co/nbeerbower/llama-3-slerp-kraut-dragon-8B)
-
- ### Configuration
-
- The following YAML configuration was used to produce this model:
-
- ```yaml
- slices:
- - sources:
-   - model: grimjim/llama-3-merge-virt-req-8B
-     layer_range: [0,32]
-   - model: nbeerbower/llama-3-slerp-kraut-dragon-8B
-     layer_range: [0,32]
- merge_method: slerp
- base_model: grimjim/llama-3-merge-virt-req-8B
- parameters:
-   t:
-   - filter: self_attn
-     value: [0, 0.5, 0.3, 0.7, 1]
-   - filter: mlp
-     value: [1, 0.5, 0.7, 0.3, 0]
-   - value: 0.5
- dtype: bfloat16
-
- ```
+ ---
+ language:
+ - en
+ base_model:
+ - grimjim/llama-3-merge-virt-req-8B
+ - nbeerbower/llama-3-slerp-kraut-dragon-8B
+ library_name: transformers
+ tags:
+ - meta
+ - llama-3
+ - pytorch
+ - mergekit
+ - merge
+ license: cc-by-nc-4.0
+ pipeline_tag: text-generation
+ ---
+ # llama-3-merge-avalon-8B
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ Lightly tested at temperature=1.0, minP=0.02 with provisional Llama 3 Instruct prompts.
+
+ Built with Meta Llama 3.
+
+ ## Merge Details
+ ### Merge Method
+
+ This model was merged using the SLERP merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [grimjim/llama-3-merge-virt-req-8B](https://huggingface.co/grimjim/llama-3-merge-virt-req-8B)
+ * [nbeerbower/llama-3-slerp-kraut-dragon-8B](https://huggingface.co/nbeerbower/llama-3-slerp-kraut-dragon-8B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ slices:
+ - sources:
+   - model: grimjim/llama-3-merge-virt-req-8B
+     layer_range: [0,32]
+   - model: nbeerbower/llama-3-slerp-kraut-dragon-8B
+     layer_range: [0,32]
+ merge_method: slerp
+ base_model: grimjim/llama-3-merge-virt-req-8B
+ parameters:
+   t:
+   - filter: self_attn
+     value: [0, 0.5, 0.3, 0.7, 1]
+   - filter: mlp
+     value: [1, 0.5, 0.7, 0.3, 0]
+   - value: 0.5
+ dtype: bfloat16
+
+ ```
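
For readers unfamiliar with the merge method named in the config: SLERP (spherical linear interpolation) blends each pair of weight tensors along the arc between them rather than along a straight line, which better preserves weight magnitudes. Below is a minimal illustrative sketch of the idea, not mergekit's actual implementation; the function name, flattening to 1-D vectors, and the near-parallel fallback threshold are assumptions for clarity:

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two (flattened) weight tensors.

    t=0 returns v0, t=1 returns v1; intermediate t follows the arc
    between the two directions. Illustrative sketch only.
    """
    # Angle is measured between the normalized directions
    v0n = v0 / (np.linalg.norm(v0) + eps)
    v1n = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0n, v1n), -1.0, 1.0))

    # Nearly parallel vectors: fall back to plain linear interpolation
    if abs(dot) > 0.9995:
        return (1.0 - t) * v0 + t * v1

    theta = np.arccos(dot)          # angle between the two tensors
    sin_theta = np.sin(theta)
    w0 = np.sin((1.0 - t) * theta) / sin_theta
    w1 = np.sin(t * theta) / sin_theta
    return w0 * v0 + w1 * v1
```

For unit-norm inputs, the result stays on the unit sphere for every `t`, which is the property that distinguishes SLERP from a plain weighted average.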