README.md · Triangle104/MN-Chunky-Lotus-12B-Q5_K

MN-Chunky-Lotus-12B-Q5_K_S-GGUF / README.md

Triangle104

Update README.md

34b3358 verified 3 days ago

preview code

raw

history blame contribute delete

4.36 kB

	---
	license: cc-by-4.0
	language:
	- en
	base_model: FallenMerick/MN-Chunky-Lotus-12B
	library_name: transformers
	tags:
	- storywriting
	- text adventure
	- creative
	- story
	- writing
	- fiction
	- roleplaying
	- rp
	- mergekit
	- merge
	- llama-cpp
	- gguf-my-repo
	---

	# Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF
	This model was converted to GGUF format from [`FallenMerick/MN-Chunky-Lotus-12B`](https://huggingface.co/FallenMerick/MN-Chunky-Lotus-12B) using llama.cpp via the ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
	Refer to the [original model card](https://huggingface.co/FallenMerick/MN-Chunky-Lotus-12B) for more details on the model.

	---
	Model details:
	-
	I had originally planned to use this model for future/further merges, but decided to go ahead and release it since it scored rather high on my local EQ Bench testing (79.58 w/ 100% parsed @ 8-bit).
	Bear in mind that most models tend to score a bit higher on my own local tests as compared to their posted scores. Still, its the highest score I've personally seen from all the models I've tested.
	Its a decent model, with great emotional intelligence and acceptable adherence to various character personalities. It does a good job at roleplaying despite being a bit bland at times.

	Overall, I like the way it writes, but it has a few formatting issues that show up from time to time, and it has an uncommon tendency to paste walls of character feelings/intentions at the end of some outputs without any prompting. This is something I hope to correct with future iterations.

	This is a merge of pre-trained language models created using mergekit.

	Merge Method
	-
	This model was merged using the TIES merge method.

	Models Merged
	-
	The following models were included in the merge:

	Epiculous/Violet_Twilight-v0.2
	nbeerbower/mistral-nemo-gutenberg-12B-v4
	flammenai/Mahou-1.5-mistral-nemo-12B

	Configuration
	-
	The following YAML configuration was used to produce this model:

	models:
	- model: Epiculous/Violet_Twilight-v0.2
	parameters:
	weight: 1.0
	density: 1.0
	- model: nbeerbower/mistral-nemo-gutenberg-12B-v4
	parameters:
	weight: 1.0
	density: 0.54
	- model: flammenai/Mahou-1.5-mistral-nemo-12B
	parameters:
	weight: 1.0
	density: 0.26
	merge_method: ties
	base_model: TheDrummer/Rocinante-12B-v1.1
	parameters:
	normalize: true
	dtype: bfloat16

	The idea behind this recipe was to take the long-form writing capabilities of Gutenberg, curtail it a bit with the very short output formatting of Mahou, and use Violet Twilight as an extremely solid roleplaying foundation underneath.
	Rocinante is used as the base model in this merge in order to really target the delta weights from Gutenberg, since those seemed to have the highest impact on the resulting EQ of the model.

	Special shoutout to @matchaaaaa for helping with testing, and for all the great model recommendations. Also, for just being an all around great person who's really inspired and motivated me to continue merging and working on models.

	---
	## Use with llama.cpp
	Install llama.cpp through brew (works on Mac and Linux)

	```bash
	brew install llama.cpp

	```
	Invoke the llama.cpp server or the CLI.

	### CLI:
	```bash
	llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -p "The meaning to life and the universe is"
	```

	### Server:
	```bash
	llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -c 2048
	```

	Note: You can also use this checkpoint directly through the [usage steps](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#usage) listed in the Llama.cpp repo as well.

	Step 1: Clone llama.cpp from GitHub.
	```
	git clone https://github.com/ggerganov/llama.cpp
	```

	Step 2: Move into the llama.cpp folder and build it with `LLAMA_CURL=1` flag along with other hardware-specific flags (for ex: LLAMA_CUDA=1 for Nvidia GPUs on Linux).
	```
	cd llama.cpp && LLAMA_CURL=1 make
	```

	Step 3: Run inference through the main binary.
	```
	./llama-cli --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -p "The meaning to life and the universe is"
	```
	or
	```
	./llama-server --hf-repo Triangle104/MN-Chunky-Lotus-12B-Q5_K_S-GGUF --hf-file mn-chunky-lotus-12b-q5_k_s.gguf -c 2048
	```