---
base_model:
- Severian/Nexus-IKM-Mistral-Instruct-v0.2-7B
- son-of-man/HoloViolet-7B-test3
- alpindale/Mistral-7B-v0.2-hf
- localfultonextractor/Erosumika-7B-v3
library_name: transformers
tags:
- mergekit
- merge

---
<h1 style="text-align: center">Twizzler-7B GGUF</h1>

<div style="display: flex; justify-content: center;">
<img src="https://huggingface.co/son-of-man/Twizzler-7B/resolve/main/twizz.jpg" alt="Header JPG">
</div>

I tried to expand [Erosumika](https://huggingface.co/localfultonextractor/Erosumika-7B-v3) with even more stimulation while keeping her brain intact.

The first key was to inject a small amount of a highly volatile [HoloViolet test merge](https://huggingface.co/son-of-man/HoloViolet-7B-test3) I made earlier, which is itself a mix of the highly creative but unhinged [Holodeck](https://huggingface.co/KoboldAI/Mistral-7B-Holodeck-1) and a [smart model](https://huggingface.co/GreenNode/GreenNode-mini-7B-multilingual-v1olet) by GreenNode that I enjoyed.
The other special ingredient is [Nexus-IKM](https://huggingface.co/Severian/Nexus-IKM-Mistral-Instruct-v0.2-7B), which was trained on an internal knowledge map dataset that makes its line of reasoning often noticeably different from other Mistral tunes.
It balances out the inconsistencies of HoloViolet while adding more creativity and logic at the same time.
Finally, I mixed in some base [Mistral-7B-v0.2](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) for higher context support and more intelligence. I went with the non-instruct version because I felt this merge should focus more on story writing capabilities than prompt following, and I wanted to avoid GPT-isms like bonds and journeys as much as possible.

All in all, this merge has a very distinct writing style that focuses less on flowery language and more on interesting ideas and interactions. It can go off the deep end and make lots of stupid mistakes sometimes, but it can also output some really good stuff if you're lucky.

# Prompts and settings

I recommend simple prompt formats like Alpaca and not giving it too many instructions to get confused by. It is a 7B, after all.
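
For reference, this is the standard Alpaca layout (with `{instruction}` standing in for whatever you want it to do):

```
Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
{instruction}

### Response:
```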

As for settings, I enjoy dynamic temperature from 1 to 5 with a min P of 0.1 and a typical P of 0.95.
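
As a rough sketch, here's how those values might map onto llama.cpp's `/completion` endpoint, where dynamic temperature is given as a midpoint plus a range (so 1 to 5 becomes 3.0 ± 2.0). The field names below are llama.cpp's; other backends may call them something else:

```sh
# Hedged example: assumes a llama.cpp server on localhost:8080 serving a Twizzler GGUF.
# Dynamic temperature 1-5 = midpoint 3.0 with dynatemp_range 2.0.
curl http://localhost:8080/completion -d '{
  "prompt": "### Instruction:\nWrite the opening scene of a heist gone wrong.\n\n### Response:\n",
  "temperature": 3.0,
  "dynatemp_range": 2.0,
  "min_p": 0.1,
  "typical_p": 0.95,
  "n_predict": 256
}'
```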

# Details

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details
### Merge Method

This model was merged using the [task arithmetic](https://arxiv.org/abs/2212.04089) merge method, with [alpindale/Mistral-7B-v0.2-hf](https://huggingface.co/alpindale/Mistral-7B-v0.2-hf) as the base.
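
Roughly speaking, task arithmetic builds a task vector for each model (its parameter delta from the base) and adds the weighted vectors back onto the base; the $w_i$ here are the per-model weights in the config below:

$$
\theta_{\text{merged}} = \theta_{\text{base}} + \sum_i w_i \, (\theta_i - \theta_{\text{base}})
$$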

### Models Merged

The following models were included in the merge:
* [Severian/Nexus-IKM-Mistral-Instruct-v0.2-7B](https://huggingface.co/Severian/Nexus-IKM-Mistral-Instruct-v0.2-7B)
* [son-of-man/HoloViolet-7B-test3](https://huggingface.co/son-of-man/HoloViolet-7B-test3)
* [localfultonextractor/Erosumika-7B-v3](https://huggingface.co/localfultonextractor/Erosumika-7B-v3)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
base_model:
  model:
    path: alpindale/Mistral-7B-v0.2-hf
dtype: bfloat16
merge_method: task_arithmetic
slices:
- sources:
  - layer_range: [0, 32]
    model:
      model:
        path: alpindale/Mistral-7B-v0.2-hf
    parameters:
      weight: 0.3
  - layer_range: [0, 32]
    model:
      model:
        path: son-of-man/HoloViolet-7B-test3
    parameters:
      weight: 0.2
  - layer_range: [0, 32]
    model:
      model:
        path: localfultonextractor/Erosumika-7B-v3
    parameters:
      weight: 0.3
  - layer_range: [0, 32]
    model:
      model:
        path: Severian/Nexus-IKM-Mistral-Instruct-v0.2-7B
    parameters:
      weight: 0.2
```
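
To reproduce the merge, something along these lines should work (a sketch: it assumes mergekit is installed and the config above is saved as `twizzler.yaml`):

```sh
# Not a verified command line; mergekit-yaml is mergekit's standard entry point.
pip install mergekit
mergekit-yaml twizzler.yaml ./Twizzler-7B --cuda
```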