Azazelle committed on
Commit
49c3059
1 Parent(s): 5f4d012

Update README.md

Files changed (1):
  1. README.md +82 -23
README.md CHANGED
@@ -5,43 +5,102 @@ library_name: transformers
  tags:
  - mergekit
  - merge
- 
  ---
- # gCHdqlR

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

- ## Merge Details
- ### Merge Method

- This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method using [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct) as a base.

- ### Models Merged

- The following models were included in the merge:
- * output/hq_rp

  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
- base_model: NousResearch/Meta-Llama-3-8B-Instruct
- dtype: float32
  merge_method: task_arithmetic
  parameters:
-   normalize: 0.0
- slices:
- - sources:
-   - layer_range: [0, 32]
-     model: output/hq_rp
      parameters:
        weight:
-       - filter: mlp
-         value: 1.15
-       - filter: self_attn
-         value: 1.025
-       - value: 1.0
-   - layer_range: [0, 32]
-     model: NousResearch/Meta-Llama-3-8B-Instruct
- ```
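The removed card text documents plain [task arithmetic](https://arxiv.org/abs/2212.04089): each tensor of the output is the base tensor plus a scaled task vector, with the scale chosen by tensor name (1.15 for MLP weights, 1.025 for self-attention, 1.0 elsewhere), and `normalize: 0.0` leaving those weights un-normalized. Below is a minimal PyTorch sketch of that update rule, assuming plain state dicts as inputs; the substring name-matching is a simplification of mergekit's `filter:` behavior, not its actual implementation.

```python
# Sketch of the task-arithmetic rule in the removed config above (illustrative;
# mergekit streams tensors with proper sharding, dtype and tokenizer handling).
import torch

def task_arithmetic(base: dict[str, torch.Tensor],
                    tuned: dict[str, torch.Tensor],
                    mlp_w: float = 1.15,
                    attn_w: float = 1.025,
                    default_w: float = 1.0) -> dict[str, torch.Tensor]:
    merged = {}
    for name, base_t in base.items():
        delta = tuned[name].float() - base_t.float()  # task vector for this tensor
        if ".mlp." in name:                           # corresponds to `filter: mlp`
            w = mlp_w
        elif ".self_attn." in name:                   # corresponds to `filter: self_attn`
            w = attn_w
        else:
            w = default_w
        merged[name] = (base_t.float() + w * delta).to(base_t.dtype)
    return merged
```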
 
 
 
 
 
  tags:
  - mergekit
  - merge
+ - llama
+ - conversational
+ license: llama3
  ---
+ # L3-Hecate-8B-v1.2
+
+ ![Hecate](https://huggingface.co/Azazelle/L3-Hecate-8B-v1.2/resolve/main/img-lk8aRDQYDBJf0C02UowUk.jpeg)
+
+ ## About:

  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

+ **Recommended Samplers:**
+
+ ```
+ Temperature - 1.0
+ TFS - 0.7
+ Smoothing Factor - 0.3
+ Smoothing Curve - 1.1
+ Repetition Penalty - 1.08
+ ```
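Of those settings, temperature and repetition penalty map directly onto `transformers` generation arguments; TFS and the smoothing factor/curve are sampler extensions found in frontends such as text-generation-webui or koboldcpp and are not part of vanilla `transformers`, so they are left out of the sketch below. A minimal, untested loading-and-generation sketch, with the model id taken from the card:

```python
# Minimal sketch: load the merged model and generate with the samplers that
# plain transformers supports. TFS and smoothing would need another backend.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Azazelle/L3-Hecate-8B-v1.2"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Introduce yourself in two sentences."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=1.0,          # Temperature - 1.0
    repetition_penalty=1.08,  # Repetition Penalty - 1.08
)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```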

+ ### Merge Method

+ This model was merged using a series of model stock merges, followed by ExPO. It uses a mix of roleplay models to improve performance.
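In outline: model stock folds several same-family fine-tunes back onto a shared base, and ExPO ("extrapolation") then pushes the merged weights slightly past the base-to-merge direction, which is what the 1.15/1.025 `task_arithmetic` weights in the final stage of the configuration below implement. A rough PyTorch sketch of both ideas, assuming plain state dicts; real model_stock derives its interpolation ratio per layer from the geometry of the task vectors rather than the uniform average used here.

```python
# Conceptual sketch of the two merge stages described above (not mergekit's code).
import torch

def model_stock_like(base: dict[str, torch.Tensor],
                     tuned: list[dict[str, torch.Tensor]]) -> dict[str, torch.Tensor]:
    """Stage 1 (~model_stock): combine several fine-tunes relative to the base.
    A uniform average of task vectors stands in for the real per-layer ratio."""
    merged = {}
    for name, base_t in base.items():
        deltas = torch.stack([m[name].float() - base_t.float() for m in tuned])
        merged[name] = base_t.float() + deltas.mean(dim=0)
    return merged

def expo(base: dict[str, torch.Tensor],
         merged: dict[str, torch.Tensor],
         alpha: float = 1.15) -> dict[str, torch.Tensor]:
    """Stage 2 (~ExPO): extrapolate past the merge; alpha > 1.0 overshoots the
    base->merge direction, as the per-filter weights in the config encode."""
    return {name: base[name].float() + alpha * (merged[name].float() - base[name].float())
            for name in base}
```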
 
  ### Configuration

  The following YAML configuration was used to produce this model:

  ```yaml
+ ---
+ # Concise-Mopey
+ models:
+   - model: Salesforce/LLaMA-3-8B-SFR-Iterative-DPO-Concise-R
+     parameters:
+       weight: 1.0
+   - model: failspy/Llama-3-8B-Instruct-MopeyMule
+     parameters:
+       weight: 1.0
  merge_method: task_arithmetic
+ base_model: NousResearch/Meta-Llama-3-8B-Instruct
  parameters:
+   normalize: false
+ dtype: float32
+ vocab_type: bpe
+ name: Concise-Mopey
+
+ ---
+ # Mopey RP Mix
+ models:
+   - model: Concise-Mopey+Azazelle/Llama-3-Sunfall-8b-lora
+   - model: Concise-Mopey+Azazelle/Llama-3-8B-Abomination-LORA
+   - model: Concise-Mopey+Azazelle/llama3-8b-hikikomori-v0.4
+   - model: Concise-Mopey+Azazelle/Llama-3-Instruct-LiPPA-LoRA-8B
+   - model: Concise-Mopey+Azazelle/BlueMoon_Llama3
+   - model: Concise-Mopey+Azazelle/Llama3_RP_ORPO_LoRA
+   - model: Concise-Mopey+mpasila/Llama-3-LimaRP-Instruct-LoRA-8B
+   - model: Concise-Mopey+Azazelle/Llama-3-LongStory-LORA
+ merge_method: model_stock
+ base_model: failspy/Llama-3-8B-Instruct-MopeyMule
+ dtype: float32
+ vocab_type: bpe
+ name: mopey_rp
+
+ ---
+ models:
+   - model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
+   - model: Sao10K/L3-8B-Tamamo-v1
+   - model: Sao10K/L3-8B-Niitama-v1
+   - model: Hastagaras/Jamet-8B-L3-MK.V-Blackroot
+   - model: nothingiisreal/L3-8B-Celeste-v1
+   - model: Jellywibble/lora_120k_pref_data_ep2
+   - model: Nitral-AI/Hathor_Stable-v0.2-L3-8B
+   - model: mopey_rp
+ merge_method: model_stock
+ base_model: NousResearch/Meta-Llama-3-8B-Instruct
+ dtype: float32
+ vocab_type: bpe
+ name: hq_rp
+
+ ---
+ # ExPO
+ models:
+   - model: hq_rp
      parameters:
        weight:
+       - filter: mlp
+         value: 1.15
+       - filter: self_attn
+         value: 1.025
+       - value: 1.0
+ merge_method: task_arithmetic
+ base_model: NousResearch/Meta-Llama-3-8B-Instruct
+ parameters:
+   normalize: false
+ dtype: float32
+ vocab_type: bpe
+ ```
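To reproduce a stage of this recipe, each YAML document above can be saved to its own file and run with mergekit's `mergekit-yaml <config> <out_dir>` CLI, or driven from Python. The sketch below follows mergekit's documented Python entry point; the option names and the one-file-per-stage layout are assumptions to check against the installed mergekit version.

```python
# Rough sketch of running one stage (here the hq_rp model_stock merge) from
# Python, adapted from mergekit's documented usage; verify against your version.
import yaml
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# One YAML document saved per file (assumption; older releases do not accept
# the multi-document form shown above).
with open("hq_rp.yml", encoding="utf-8") as f:
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    out_path="./hq_rp",  # intermediate output; point the ExPO stage's `model:` here
    options=MergeOptions(cuda=False, copy_tokenizer=True),
)
```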