saishf committed on
Commit 930d652
1 Parent(s): e08793e

Update README.md

Files changed (1)
  1. README.md +57 -1
README.md CHANGED
@@ -1,3 +1,59 @@
---
- license: apache-2.0
+ base_model:
+ - NousResearch/Nous-Hermes-2-SOLAR-10.7B
+ - BlueNipples/SnowLotus-v2-10.7B
+ tags:
+ - mergekit
+ - merge
+
---
+ These are GGUF quants for https://huggingface.co/saishf/Nous-Lotus-10.7B
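
As a quick usage sketch: GGUF quants of this kind are commonly run with llama-cpp-python (or llama.cpp directly). A minimal example, assuming llama-cpp-python is installed; the quant file name below is a placeholder, use whichever quant level you actually downloaded:

```python
# Minimal sketch for running a GGUF quant with llama-cpp-python.
from llama_cpp import Llama

llm = Llama(
    model_path="nous-lotus-10.7b.Q4_K_M.gguf",  # placeholder file name, not a real release artifact
    n_ctx=4096,  # both parent models are SOLAR-based, so 4096 context (see notes below)
)

output = llm(
    "### Instruction:\nWrite a short in-character greeting.\n\n### Response:\n",  # Alpaca-style prompt
    max_tokens=128,
    stop=["### Instruction:"],
)
print(output["choices"][0]["text"])
```
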
+ # merge
+
+ This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+
+ ## Merge Details
+ This model is a SLERP merge of SnowLotus-v2 and Nous-Hermes-2-SOLAR. I found SnowLotus great to talk to, but it fell short when prompted with out-there characters; Nous Hermes seemed to handle those characters much better, so I decided to merge the two.
+
+ This is my first merge, so it could perform badly or may not even work.
+ ### Extra Info
+ Both models are SOLAR-based, so the context length should be 4096.
+
+ SnowLotus uses the Alpaca prompt format.
+
+ Nous Hermes uses ChatML.
+
+ Both seem to work, but I don't know exactly which performs better.
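
For reference, these are roughly what the two prompt formats look like; the Alpaca preamble in particular varies between fine-tunes, so treat this as an illustrative sketch rather than an exact template:

```python
# Sketch of the two prompt templates mentioned above.
ALPACA_PROMPT = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

CHATML_PROMPT = (
    "<|im_start|>system\n{system}<|im_end|>\n"
    "<|im_start|>user\n{instruction}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

print(ALPACA_PROMPT.format(instruction="Describe the character in one paragraph."))
print(CHATML_PROMPT.format(system="You are a helpful assistant.",
                           instruction="Describe the character in one paragraph."))
```
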
+ ### Merge Method
+
+ This model was merged using the SLERP merge method.
+
+ ### Models Merged
+
+ The following models were included in the merge:
+ * [NousResearch/Nous-Hermes-2-SOLAR-10.7B](https://huggingface.co/NousResearch/Nous-Hermes-2-SOLAR-10.7B)
+ * [BlueNipples/SnowLotus-v2-10.7B](https://huggingface.co/BlueNipples/SnowLotus-v2-10.7B)
+
+ ### Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ ```yaml
+ slices:
+   - sources:
+       - model: BlueNipples/SnowLotus-v2-10.7B
+         layer_range: [0, 48]
+       - model: NousResearch/Nous-Hermes-2-SOLAR-10.7B
+         layer_range: [0, 48]
+ merge_method: slerp
+ base_model: BlueNipples/SnowLotus-v2-10.7B
+ parameters:
+   t:
+     - filter: self_attn
+       value: [0, 0.5, 0.3, 0.7, 1]
+     - filter: mlp
+       value: [1, 0.5, 0.7, 0.3, 0]
+     - value: 0.5
+ dtype: bfloat16
+
+ ```
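
To reproduce the merge, this YAML can be saved to a file and passed to mergekit's `mergekit-yaml` entry point, for example `mergekit-yaml config.yaml ./output-model-directory` (exact flags vary by mergekit version). GGUF quants like the ones hosted here are then typically produced from the merged weights with llama.cpp's conversion script and `quantize` tool.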