---
language:
- en
tags:
- rp
- erp
- chat
- storywriting
---

# Kitchen Sink 103b

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/NmlQi3OggdYF_Tyb_cs3D.jpeg)

This model is a rotating-stack merge of three 70b models in a 103b (120-layer) configuration inspired by Venus 103b. The result of this "frankenmerge" is a large model that contains a little bit of everything. RP, chat, storywriting, and instruct are all well supported.

Component models for the rotating stack are
- miqudev/miqu-1-70b
- royallab/Aetheria-L2-70B
- lizpreciatior/lzlv_70b_fp16_hf

This model is *mostly* uncensored and is capable of generating objectionable material with a suitable prompt. However, it is not an explicitly NSFW model, and some remnants of Miqu's censoring do occasionally pop up. As with any LLM, no factual claims made by the model should be taken at face value. You know that boilerplate safety disclaimer that most professional models have? Assume this has it too. This model is for entertainment purposes only.

FP16 and Q4_K_S GGUFs:
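If you want to poke at one of the GGUF quants locally, a minimal llama-cpp-python sketch is shown below. The file name, context size, and sampler settings are placeholders made up for illustration, not recommendations from this card.

```python
# Minimal llama-cpp-python sketch; file name and settings are assumptions, not from this card.
from llama_cpp import Llama

llm = Llama(
    model_path="./kitchensink-103b.Q4_K_S.gguf",  # hypothetical local path to the Q4_K_S quant
    n_ctx=4096,                                   # assumed context window
    n_gpu_layers=-1,                              # offload everything that fits to the GPU
)

out = llm(
    "### Instruction:\nWrite a limerick about a fluffy bunny at a Gwar concert.\n\n### Response:\n",
    max_tokens=256,
    temperature=0.8,
)
print(out["choices"][0]["text"])
```
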
# Sample output

```
{{[INPUT]}}
Write a detailed and humorous story about a cute and fluffy bunny that goes to a Gwar concert.
```

# Prompt format
Seems to have the strongest affinity for Alpaca prompts. Others will work to some extent.
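For reference, the standard Alpaca layout looks like this (the {instruction} bit is just a placeholder):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```
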
# WTF is a rotating-stack merge?
Inspired by Undi's experiments with stacked merges, Jeb Carter found that output quality and model initiative could be significantly improved by reversing the model order in the stack and then doing a linear merge between the original and reversed stacks. That is what I did here. To preserve as much of the "smarts" and long-context awareness of miqu as possible while still adding the flavor of the other models, the stack effectively contains twice as much miqu as lzlv or Aetheria. The exact merge configs can be found in the recipe.txt file.
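For anyone who wants the idea in concrete terms, here is a toy sketch of the mechanic. The layer slices below are made up purely for illustration; the real ranges and weights are in recipe.txt.

```python
# Illustrative sketch of a rotating-stack merge. The slice plan below is a
# made-up placeholder; the actual layer ranges and weights live in recipe.txt.

def build_stack(model_order, slice_plan):
    """Map each layer of the stacked model to a (source_model, source_layer) pair."""
    stack = []
    for model_idx, (lo, hi) in slice_plan:
        model = model_order[model_idx]
        stack.extend((model, layer) for layer in range(lo, hi))
    return stack

# Hypothetical plan: six 20-layer slices rotating through the three donors,
# giving a 120-layer stack cut from 80-layer 70b models.
SLICES = [(0, (0, 20)), (1, (10, 30)), (2, (20, 40)),
          (0, (30, 50)), (1, (40, 60)), (2, (50, 70))]

forward = build_stack(["miqu", "aetheria", "lzlv"], SLICES)   # original order
reverse = build_stack(["lzlv", "aetheria", "miqu"], SLICES)   # reversed order

# The final model is then a layer-wise linear merge of the two stacks,
# e.g. weight[i] = 0.5 * forward_weight[i] + 0.5 * reverse_weight[i].
assert len(forward) == len(reverse) == 120
```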