---
license: cc-by-4.0
language:
- en
base_model: FallenMerick/MN-Chunky-Lotus-12B
library_name: transformers
tags:
- storywriting
- text adventure
- creative
- story
- writing
- fiction
- roleplaying
- rp
- mergekit
- merge
- llama-cpp
- gguf-my-repo
---

# Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF
This model was converted to GGUF format from [`FallenMerick/MN-Chunky-Lotus-12B`](https://huggingface.co/FallenMerick/MN-Chunky-Lotus-12B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
Refer to the [original model card](https://huggingface.co/FallenMerick/MN-Chunky-Lotus-12B) for more details on the model.

---
## Model details
I had originally planned to use this model for future merges, but decided to go ahead and release it since it scored rather high on my local EQ Bench testing (79.58 with 100% parsed @ 8-bit).
Bear in mind that most models tend to score a bit higher on my own local tests compared to their posted scores. Still, it's the highest score I've personally seen from all the models I've tested.
It's a decent model, with great emotional intelligence and acceptable adherence to various character personalities. It does a good job at roleplaying, despite being a bit bland at times.

Overall, I like the way it writes, but it has a few formatting issues that show up from time to time, and an occasional tendency to append walls of character feelings/intentions to the end of some outputs without any prompting. This is something I hope to correct in future iterations.

This is a merge of pre-trained language models created using mergekit.

## Merge Method
This model was merged using the [TIES](https://arxiv.org/abs/2306.01708) merge method, with [TheDrummer/Rocinante-12B-v1.1](https://huggingface.co/TheDrummer/Rocinante-12B-v1.1) as the base model.

## Models Merged
The following models were included in the merge:

- [Epiculous/Violet_Twilight-v0.2](https://huggingface.co/Epiculous/Violet_Twilight-v0.2)
- [nbeerbower/mistral-nemo-gutenberg-12B-v4](https://huggingface.co/nbeerbower/mistral-nemo-gutenberg-12B-v4)
- [flammenai/Mahou-1.5-mistral-nemo-12B](https://huggingface.co/flammenai/Mahou-1.5-mistral-nemo-12B)

## Configuration
The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Epiculous/Violet_Twilight-v0.2
    parameters:
      weight: 1.0
      density: 1.0
  - model: nbeerbower/mistral-nemo-gutenberg-12B-v4
    parameters:
      weight: 1.0
      density: 0.54
  - model: flammenai/Mahou-1.5-mistral-nemo-12B
    parameters:
      weight: 1.0
      density: 0.26
merge_method: ties
base_model: TheDrummer/Rocinante-12B-v1.1
parameters:
  normalize: true
dtype: bfloat16
```

The idea behind this recipe was to take the long-form writing capabilities of Gutenberg, curtail them a bit with the very short output formatting of Mahou, and use Violet Twilight as an extremely solid roleplaying foundation underneath.
Rocinante is used as the base model in this merge in order to really target the delta weights from Gutenberg, since those seemed to have the highest impact on the resulting EQ of the model.
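
If you want to reproduce a merge like this yourself, mergekit reads the recipe from a YAML file. A minimal sketch, assuming mergekit is installed from PyPI and the configuration above is saved as `chunky-lotus.yaml` (the filename and output directory are illustrative):

```bash
# Install mergekit, then run the TIES merge described by the recipe above.
pip install mergekit

# Downloads the listed models and writes the merged weights
# to ./MN-Chunky-Lotus-12B (illustrative output path).
mergekit-yaml chunky-lotus.yaml ./MN-Chunky-Lotus-12B
```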

Special shoutout to @matchaaaaa for helping with testing, and for all the great model recommendations. Also, for just being an all-around great person who's really inspired and motivated me to continue merging and working on models.

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux):

```bash
brew install llama.cpp
```
Invoke the llama.cpp server or the CLI.

### CLI:
```bash
llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -p "The meaning to life and the universe is"
```

### Server:
```bash
llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -c 2048
```
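
Once the server is running, you can query it over HTTP. A minimal sketch using llama.cpp's built-in `/completion` endpoint, assuming the server's default bind address of `127.0.0.1:8080`:

```bash
# Request a short completion from the running llama-server instance.
curl http://127.0.0.1:8080/completion \
  -H "Content-Type: application/json" \
  -d '{"prompt": "The meaning to life and the universe is", "n_predict": 128}'
```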

Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the llama.cpp repo.

Step 1: Clone llama.cpp from GitHub.
```bash
git clone https://github.com/ggerganov/llama.cpp
```

Step 2: Move into the llama.cpp folder and build it with the `LLAMA_CURL=1` flag, along with any other hardware-specific flags (e.g., `LLAMA_CUDA=1` for NVIDIA GPUs on Linux).
```bash
cd llama.cpp && LLAMA_CURL=1 make
```
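
For example, a CUDA-enabled build for NVIDIA GPUs might look like this (a sketch, assuming the CUDA toolkit is installed; `-j` simply parallelizes the build):

```bash
# LLAMA_CURL=1 enables --hf-repo downloads; LLAMA_CUDA=1 builds with CUDA offload.
cd llama.cpp && LLAMA_CURL=1 LLAMA_CUDA=1 make -j
```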

Step 3: Run inference through the main binary.
```bash
./llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -p "The meaning to life and the universe is"
```
or
```bash
./llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -c 2048
```
```