HiroseKoichi committed 7cf3e2d (1 parent: 9f0869c): Create README.md

Files changed (1): README.md (+49 -0, added)
---
license: llama3
library_name: transformers
tags:
- nsfw
- not-for-all-audiences
- llama-3
- text-generation-inference
- mergekit
- merge
---

Original model: https://huggingface.co/HiroseKoichi/L3-8B-Lunar-Stheno

# L3-8B-Lunar-Stheno
L3-8B-Lunaris-v1 is definitely a significant improvement over L3-8B-Stheno-v3.2 in terms of situational awareness and prose, but it's not without issues: the response length can sometimes be very long, causing it to go on a rant; it tends not to take direct action, saying that it will do something but never actually doing it; and its performance outside of roleplay took a hit.

This merge fixes all of those issues, and I'm genuinely impressed with the results. While I did use a SLERP merge to create this model, there was no blending of the models; all I did was replace L3-8B-Stheno-v3.2's weights with L3-8B-Lunaris-v1's.

# Experimental Quants Included
A full set of quants is available, and half of them use an experimental quantization method, indicated by an `f16` prefix before the quantization level. These quants keep the embeddings and output tensors at f16 but are otherwise identical to their standard counterparts.

The experimental variants should be higher quality than their standard equivalents, but any feedback is welcome.

# Details
- **License**: [llama3](https://llama.meta.com/llama3/license/)
- **Instruct Format**: [llama-3](https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/)
- **Context Size**: 8K

## Models Used
- [L3-8B-Stheno-v3.2](https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2)
- [L3-8B-Lunaris-v1](https://huggingface.co/Sao10K/L3-8B-Lunaris-v1)

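The llama-3 instruct format linked above wraps each chat turn in special header and end-of-turn tokens. As a rough illustration of the prompt layout (the `format_llama3` helper below is a hypothetical sketch, not part of this repo; in practice `tokenizer.apply_chat_template` handles this for you):

```python
def format_llama3(messages: list[dict]) -> str:
    """Render a list of {role, content} dicts as a Llama-3 instruct prompt."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Each turn is wrapped in header tokens and terminated with <|eot_id|>
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open the assistant header so the model generates the reply
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = format_llama3([{"role": "user", "content": "Hello!"}])
```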
## Merge Config
```yaml
models:
  - model: Sao10K/L3-8B-Stheno-v3.2
  - model: Sao10K/L3-8B-Lunaris-v1
merge_method: slerp
base_model: Sao10K/L3-8B-Stheno-v3.2
parameters:
  t:
    - filter: self_attn
      value: 0
    - filter: mlp
      value: 1
    - value: 0
dtype: bfloat16
```
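
Because every `t` value in the config is 0 or 1, the SLERP reduces to picking one endpoint or the other per tensor, which is why no actual blending occurs. A minimal sketch of spherical linear interpolation between two flat weight vectors (illustrative only, not mergekit's implementation):

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight vectors."""
    u0 = v0 / (np.linalg.norm(v0) + eps)
    u1 = v1 / (np.linalg.norm(v1) + eps)
    dot = np.clip(np.dot(u0, u1), -1.0, 1.0)
    theta = np.arccos(dot)
    if theta < eps:
        # Nearly parallel vectors: fall back to plain linear interpolation
        return (1 - t) * v0 + t * v1
    s = np.sin(theta)
    return (np.sin((1 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
# t=0 returns the base model's tensor unchanged; t=1 returns the other model's
assert np.allclose(slerp(0.0, a, b), a)
assert np.allclose(slerp(1.0, a, b), b)
```

With `filter: self_attn → 0` and the default `value: 0`, most tensors come straight from the base model, while `filter: mlp → 1` takes the MLP tensors entirely from L3-8B-Lunaris-v1.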