inflatebot
/

MN-12B-Mag-Mell-R1

@@ -16,13 +16,44 @@ tags:
 ![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
 *[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
-[Official GGUFs](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
-[More from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
 # MN-12B-Mag-Mell-R1
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
 ## Merge Details
 Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)

 ![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
 *[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
 # MN-12B-Mag-Mell-R1
 This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
+[Q4_K_M, Q6_K and Q_8 GGUFs by me](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
+[More available from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
+## Usage Details
+### Sampler Settings
+Mag Mell R1 was tested with Temp 1.25 and MinP 0.2. This was fairly stable up to 10K, but this might be too "hot".
+If issues with coherency occur, try *in*creasing MinP or *de*creasing Temperature.
+Other samplers shouldn't be necessary. XTC was shown to break outputs. DRY should be okay if used sparingly. Other penalty-type samplers should probably be avoided.
+### Formatting
+The base model for Mag Mell is [Mistral-Nemo-Base-2407-chatml](https://huggingface.co/IntervitensInc/Mistral-Nemo-Base-2407-chatml), and as such ChatML formatting is recommended.
+However, many component models still use Mistral's format. As a result, occasionally the word "user" or "assistant" will appear on the bottom of the screen.
+__However.__ Some things have come out regarding Mistral's format that should be covered here, and implicates not just Mag Mell, but *all* Mistral-based models since the original Mistral 7B.
+*The following information is as correct as I can get it as of September 20th, 2024*
+We've had Mistral's tokenizer handling and completions format all wrong. *The templates in your frontend are probably wrong right now.*
+MistralAI member Pandora has been going around helping to correct everyone.
+Right now, Pandora has opened PRs for [SillyTavern](https://github.com/SillyTavern/SillyTavern/pull/2883), [KoboldAI Lite](https://github.com/LostRuins/lite.koboldai.net/pull/87/files) and [KoboldCPP](https://github.com/LostRuins/koboldcpp/pull/1131).
+*When these are merged*, then the templates in them can be assumed to be completely correct.
+Until then, I've [provided templates for SillyTavern on GitHub that should be More Correct than the ones ST currently ships.](https://github.com/inflatebot/SillyTavern-Nemo-Templates)
+If you experiment with this, please let me know how it goes! The conversation on how to properly implement Mistral is still ongoing.
 ## Merge Details
 Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)