inflatebot
commited on
Commit
•
8b48d49
1
Parent(s):
e3eab07
Update README.md
Browse files
README.md
CHANGED
@@ -16,13 +16,44 @@ tags:
|
|
16 |
![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
|
17 |
*[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
|
18 |
|
19 |
-
[Official GGUFs](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
|
20 |
-
|
21 |
-
[More from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
|
22 |
# MN-12B-Mag-Mell-R1
|
23 |
|
24 |
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
|
25 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
26 |
## Merge Details
|
27 |
Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)
|
28 |
|
|
|
16 |
![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
|
17 |
*[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
|
18 |
|
|
|
|
|
|
|
19 |
# MN-12B-Mag-Mell-R1
|
20 |
|
21 |
This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
|
22 |
|
23 |
+
[Q4_K_M, Q6_K and Q_8 GGUFs by me](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
|
24 |
+
|
25 |
+
[More available from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
|
26 |
+
|
27 |
+
## Usage Details
|
28 |
+
|
29 |
+
### Sampler Settings
|
30 |
+
Mag Mell R1 was tested with Temp 1.25 and MinP 0.2. This was fairly stable up to 10K, but this might be too "hot".
|
31 |
+
If issues with coherency occur, try *in*creasing MinP or *de*creasing Temperature.
|
32 |
+
|
33 |
+
Other samplers shouldn't be necessary. XTC was shown to break outputs. DRY should be okay if used sparingly. Other penalty-type samplers should probably be avoided.
|
34 |
+
|
35 |
+
|
36 |
+
### Formatting
|
37 |
+
The base model for Mag Mell is [Mistral-Nemo-Base-2407-chatml](https://huggingface.co/IntervitensInc/Mistral-Nemo-Base-2407-chatml), and as such ChatML formatting is recommended.
|
38 |
+
|
39 |
+
However, many component models still use Mistral's format. As a result, occasionally the word "user" or "assistant" will appear on the bottom of the screen.
|
40 |
+
|
41 |
+
|
42 |
+
__However.__ Some things have come out regarding Mistral's format that should be covered here, and implicates not just Mag Mell, but *all* Mistral-based models since the original Mistral 7B.
|
43 |
+
|
44 |
+
*The following information is as correct as I can get it as of September 20th, 2024*
|
45 |
+
|
46 |
+
We've had Mistral's tokenizer handling and completions format all wrong. *The templates in your frontend are probably wrong right now.*
|
47 |
+
|
48 |
+
MistralAI member Pandora has been going around helping to correct everyone.
|
49 |
+
|
50 |
+
Right now, Pandora has opened PRs for [SillyTavern](https://github.com/SillyTavern/SillyTavern/pull/2883), [KoboldAI Lite](https://github.com/LostRuins/lite.koboldai.net/pull/87/files) and [KoboldCPP](https://github.com/LostRuins/koboldcpp/pull/1131).
|
51 |
+
|
52 |
+
*When these are merged*, then the templates in them can be assumed to be completely correct.
|
53 |
+
|
54 |
+
Until then, I've [provided templates for SillyTavern on GitHub that should be More Correct than the ones ST currently ships.](https://github.com/inflatebot/SillyTavern-Nemo-Templates)
|
55 |
+
If you experiment with this, please let me know how it goes! The conversation on how to properly implement Mistral is still ongoing.
|
56 |
+
|
57 |
## Merge Details
|
58 |
Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)
|
59 |
|