asiansoul committed
Commit 2388789
1 Parent(s): b099c95

Update README.md

Files changed (1): README.md (+97, -5)
README.md CHANGED

The previous front matter (`license: other`, `license_name: other`, `license_link: LICENSE`) was replaced with the full model card below.
---
base_model:
- nayohan/llama3-8b-it-translation-general-en-ko-1sent
- MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
- cognitivecomputations/dolphin-2.9-llama3-8b
- NousResearch/Hermes-2-Pro-Llama-3-8B
- winglian/llama-3-8b-1m-PoSE
- asiansoul/Llama-3-Open-Ko-Linear-8B
- NousResearch/Meta-Llama-3-8B
- Danielbrdz/Barcenas-Llama3-8b-ORPO
- NousResearch/Meta-Llama-3-8B-Instruct
library_name: transformers
tags:
- mergekit
- merge
---
# Versatile-Llama-3-8B

I'm not going to say that this merge is the best model ever made, and I'm not going to tell you that you'll enjoy chatting with it.

All I want to say is thank you for taking time out of your day to visit. Without users like you, my work would be meaningless.

I haven't tested this model; I'm releasing it as-is, based on brainstorming alone.
## Merge Details
### Merge Method

This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B) as the base.
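For intuition, here is a minimal, simplified sketch of what `density` and `weight` mean under DARE-TIES. This is a toy illustration on flat vectors, not mergekit's actual implementation: each model's delta from the base is randomly dropped with probability 1 - density and rescaled, a per-parameter sign is elected across models, and only sign-agreeing contributions are summed with their weights.

```python
import numpy as np

def dare_ties(base, tuned, densities, weights, seed=0):
    """Toy DARE-TIES merge on flat parameter vectors (illustration only)."""
    rng = np.random.default_rng(seed)
    contribs = []
    for ft, density, weight in zip(tuned, densities, weights):
        delta = ft - base                             # task vector vs. the base model
        keep = rng.random(delta.shape) < density      # DARE: drop entries with prob 1 - density
        delta = np.where(keep, delta / density, 0.0)  # rescale survivors to preserve expectation
        contribs.append(weight * delta)
    stacked = np.stack(contribs)
    sign = np.sign(stacked.sum(axis=0))               # TIES (simplified): elect a sign per parameter
    agreeing = np.where(np.sign(stacked) == sign, stacked, 0.0)
    return base + agreeing.sum(axis=0)                # merge only sign-agreeing contributions

base = np.zeros(4)
tuned = [np.array([0.4, -0.2, 0.1, 0.0]), np.array([0.3, 0.5, -0.1, 0.2])]
print(dare_ties(base, tuned, densities=[0.60, 0.55], weights=[0.25, 0.15]))
```

The point of the rescaling step is that dropping entries without it would shrink each task vector toward zero; dividing the survivors by `density` keeps the expected contribution unchanged.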

### Models Merged

The following models were included in the merge:
* [nayohan/llama3-8b-it-translation-general-en-ko-1sent](https://huggingface.co/nayohan/llama3-8b-it-translation-general-en-ko-1sent)
* [MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3](https://huggingface.co/MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3)
* [cognitivecomputations/dolphin-2.9-llama3-8b](https://huggingface.co/cognitivecomputations/dolphin-2.9-llama3-8b)
* [NousResearch/Hermes-2-Pro-Llama-3-8B](https://huggingface.co/NousResearch/Hermes-2-Pro-Llama-3-8B)
* [winglian/llama-3-8b-1m-PoSE](https://huggingface.co/winglian/llama-3-8b-1m-PoSE)
* [asiansoul/Llama-3-Open-Ko-Linear-8B](https://huggingface.co/asiansoul/Llama-3-Open-Ko-Linear-8B)
* [Danielbrdz/Barcenas-Llama3-8b-ORPO](https://huggingface.co/Danielbrdz/Barcenas-Llama3-8b-ORPO)
* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: NousResearch/Meta-Llama-3-8B
    # Base model providing a general foundation without specific parameters

  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      density: 0.60
      weight: 0.25

  - model: winglian/llama-3-8b-1m-PoSE
    parameters:
      density: 0.55
      weight: 0.15

  - model: MaziyarPanahi/Llama-3-8B-Instruct-DPO-v0.3
    parameters:
      density: 0.55
      weight: 0.15

  - model: asiansoul/Llama-3-Open-Ko-Linear-8B
    parameters:
      density: 0.55
      weight: 0.2

  - model: nayohan/llama3-8b-it-translation-general-en-ko-1sent
    parameters:
      density: 0.55
      weight: 0.1

  - model: cognitivecomputations/dolphin-2.9-llama3-8b
    parameters:
      density: 0.55
      weight: 0.1

  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      density: 0.55
      weight: 0.05

  - model: NousResearch/Hermes-2-Pro-Llama-3-8B
    parameters:
      density: 0.55
      weight: 0.1

merge_method: dare_ties
base_model: NousResearch/Meta-Llama-3-8B
parameters:
  int8_mask: true
dtype: bfloat16
```
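
Once merged, the result should load like any other Llama 3 checkpoint. Below is a minimal usage sketch with `transformers`; the repo id `asiansoul/Versatile-Llama-3-8B` is an assumption inferred from the model card title.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical repo id, inferred from the model card title.
model_id = "asiansoul/Versatile-Llama-3-8B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # matches the merge's dtype
    device_map="auto",
)

# Llama-3-Instruct chat template; several of the merged models use this format.
messages = [{"role": "user", "content": "Translate to Korean: The weather is nice today."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```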