---
language:
- en
tags:
- rp
- erp
- chat
- storywriting
---

# Kitchen Sink 103b

![image/jpeg](https://cdn-uploads.huggingface.co/production/uploads/65a531bc7ec6af0f95c707b1/NmlQi3OggdYF_Tyb_cs3D.jpeg)

This model is a rotating-stack merge of three 70b models in a 103b (120-layer) configuration inspired by Venus 103b. The result of this "frankenmerge" is a large model that contains a little bit of everything. RP, chat, storywriting, and instruct are all well supported.

Component models for the rotating stack are
- miqudev/miqu-1-70b
- royallab/Aetheria-L2-70B
- lizpreciatior/lzlv_70b_fp16_hf

This model is *mostly* uncensored and is capable of generating objectionable material with a suitable prompt. However, it is not an explicitly NSFW model, and some remnants of Miqu's censoring do occasionally pop up. As with any LLM, no factual claims made by the model should be taken at face value. You know that boilerplate safety disclaimer that most professional models have? Assume this has it too. This model is for entertainment purposes only.

FP16 and Q4_K_S GGUFs:
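If you want to poke at one of the GGUF quants locally, a minimal llama-cpp-python sketch is shown below. The file name, context size, and sampler settings are placeholders made up for illustration, not recommendations from this card.

```python
# Minimal llama-cpp-python sketch; file name and settings are assumptions, not from this card.
from llama_cpp import Llama

llm = Llama(
    model_path="./kitchensink-103b.Q4_K_S.gguf",  # hypothetical local path to the Q4_K_S quant
    n_ctx=4096,                                   # assumed context window
    n_gpu_layers=-1,                              # offload everything that fits to the GPU
)

out = llm(
    "### Instruction:\nWrite a limerick about a fluffy bunny at a Gwar concert.\n\n### Response:\n",
    max_tokens=256,
    temperature=0.8,
)
print(out["choices"][0]["text"])
```
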
# Sample output

```
{{[INPUT]}}
Write a detailed and humorous story about a cute and fluffy bunny that goes to a Gwar concert.
```

# Prompt format
Seems to have the strongest affinity for Alpaca prompts. Others will work to some extent.
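For reference, the standard Alpaca layout looks like this (the {instruction} bit is just a placeholder):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```
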
# WTF is a rotating-stack merge?
Inspired by Undi's experiments with stacked merges, Jeb Carter found that output quality and model initiative could be significantly improved by reversing the model order in the stack and then doing a linear merge between the original and reversed stacks. That is what I did here. To preserve as much of the "smarts" and long-context awareness of miqu as possible while still adding the flavor of the other models, the stack effectively contains twice as much miqu as lzlv or Aetheria. The exact merge configs can be found in the recipe.txt file.
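For anyone who wants the idea in concrete terms, here is a toy sketch of the mechanic. The layer slices below are made up purely for illustration; the real ranges and weights are in recipe.txt.

```python
# Illustrative sketch of a rotating-stack merge. The slice plan below is a
# made-up placeholder; the actual layer ranges and weights live in recipe.txt.

def build_stack(model_order, slice_plan):
    """Map each layer of the stacked model to a (source_model, source_layer) pair."""
    stack = []
    for model_idx, (lo, hi) in slice_plan:
        model = model_order[model_idx]
        stack.extend((model, layer) for layer in range(lo, hi))
    return stack

# Hypothetical plan: six 20-layer slices rotating through the three donors,
# giving a 120-layer stack cut from 80-layer 70b models.
SLICES = [(0, (0, 20)), (1, (10, 30)), (2, (20, 40)),
          (0, (30, 50)), (1, (40, 60)), (2, (50, 70))]

forward = build_stack(["miqu", "aetheria", "lzlv"], SLICES)   # original order
reverse = build_stack(["lzlv", "aetheria", "miqu"], SLICES)   # reversed order

# The final model is then a layer-wise linear merge of the two stacks,
# e.g. weight[i] = 0.5 * forward_weight[i] + 0.5 * reverse_weight[i].
assert len(forward) == len(reverse) == 120
```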