inflatebot commited on
Commit
8b48d49
1 Parent(s): e3eab07

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +34 -3
README.md CHANGED
@@ -16,13 +16,44 @@ tags:
16
  ![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
17
  *[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
18
 
19
- [Official GGUFs](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
20
-
21
- [More from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
22
  # MN-12B-Mag-Mell-R1
23
 
24
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
25
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
26
  ## Merge Details
27
  Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)
28
 
 
16
  ![Made with NovelAI](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1/resolve/main/magmell.png)
17
  *[Welcome, brave one; you've come a long mile.](https://www.youtube.com/watch?v=dgGEuC1F3oE)*
18
 
 
 
 
19
  # MN-12B-Mag-Mell-R1
20
 
21
  This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).
22
 
23
+ [Q4_K_M, Q6_K and Q_8 GGUFs by me](https://huggingface.co/inflatebot/MN-12B-Mag-Mell-R1-GGUF)
24
+
25
+ [More available from mradermacher](https://huggingface.co/mradermacher/MN-12B-Mag-Mell-R1-GGUF/tree/main)
26
+
27
+ ## Usage Details
28
+
29
+ ### Sampler Settings
30
+ Mag Mell R1 was tested with Temp 1.25 and MinP 0.2. This was fairly stable up to 10K, but this might be too "hot".
31
+ If issues with coherency occur, try *in*creasing MinP or *de*creasing Temperature.
32
+
33
+ Other samplers shouldn't be necessary. XTC was shown to break outputs. DRY should be okay if used sparingly. Other penalty-type samplers should probably be avoided.
34
+
35
+
36
+ ### Formatting
37
+ The base model for Mag Mell is [Mistral-Nemo-Base-2407-chatml](https://huggingface.co/IntervitensInc/Mistral-Nemo-Base-2407-chatml), and as such ChatML formatting is recommended.
38
+
39
+ However, many component models still use Mistral's format. As a result, occasionally the word "user" or "assistant" will appear on the bottom of the screen.
40
+
41
+
42
+ __However.__ Some things have come out regarding Mistral's format that should be covered here, and implicates not just Mag Mell, but *all* Mistral-based models since the original Mistral 7B.
43
+
44
+ *The following information is as correct as I can get it as of September 20th, 2024*
45
+
46
+ We've had Mistral's tokenizer handling and completions format all wrong. *The templates in your frontend are probably wrong right now.*
47
+
48
+ MistralAI member Pandora has been going around helping to correct everyone.
49
+
50
+ Right now, Pandora has opened PRs for [SillyTavern](https://github.com/SillyTavern/SillyTavern/pull/2883), [KoboldAI Lite](https://github.com/LostRuins/lite.koboldai.net/pull/87/files) and [KoboldCPP](https://github.com/LostRuins/koboldcpp/pull/1131).
51
+
52
+ *When these are merged*, then the templates in them can be assumed to be completely correct.
53
+
54
+ Until then, I've [provided templates for SillyTavern on GitHub that should be More Correct than the ones ST currently ships.](https://github.com/inflatebot/SillyTavern-Nemo-Templates)
55
+ If you experiment with this, please let me know how it goes! The conversation on how to properly implement Mistral is still ongoing.
56
+
57
  ## Merge Details
58
  Multi-stage SLERP merge, DARE-TIES'd together. Intended to be a general purpose "Best of Nemo" model for any fictional, creative use case. Inspired by hyper-merges like [Tiefighter](https://huggingface.co/KoboldAI/LLaMA2-13B-Tiefighter) and [Umbral Mind.](https://huggingface.co/Casual-Autopsy/L3-Umbral-Mind-RP-v2.0-8B)
59