---
base_model: nitky/Oumuamua-7b-instruct-v2
library_name: transformers
tags:
- mergekit
- merge
language:
- ja
- en
pipeline_tag: text-generation
license: apache-2.0
---
# Oumuamua-7b-instruct-v2-GGUF
This is a quantized version of [nitky/Oumuamua-7b-instruct-v2](https://huggingface.co/nitky/Oumuamua-7b-instruct-v2) created using llama.cpp.
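
For example, the GGUF files can be run with the [llama-cpp-python](https://github.com/abetlen/llama-cpp-python) bindings. This is a minimal sketch; the filename below is hypothetical and depends on which quantization level you download:

```python
from llama_cpp import Llama

# Hypothetical filename: substitute the quantization level you downloaded.
llm = Llama(model_path="Oumuamua-7b-instruct-v2.Q4_K_M.gguf", n_ctx=4096)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
        {"role": "user", "content": "自己紹介をしてください。"},
    ],
    temperature=0.1,
    max_tokens=256,
)
print(response["choices"][0]["message"]["content"])
```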
15
+
16
+ # Model Description
17
+
18
+ 🚨 If you want to avoid outputs that appear to be literal translations, please prompt this model to role-play as a Japanese person.
19
+
20
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/630779c4f0dc38fb47ba6368/lNEtKjTTI-Ay28oAk2Wn8.png)
21
+
This version targets the following improvements:

- Understanding user intent in messages
- Providing helpful and harmless responses
- Enhancing expertise in human and computer-related topics
- Improving role-playing capabilities
- Enhancing performance in multi-turn conversations

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Output example

- temperature: 0.1
- top_p: 1.0

```
[INST]<<SYS>>
あなたは誠実で優秀な日本人のアシスタントです。
<</SYS>>

オウムアムア(Oumuamua)は、2017年に発見された、天体観測史上初めて太陽系外から飛来した恒星間天体です。これをテーマにした小説の設定を考えてください。 [/INST] タイトル:「オウムアムアの遺産」

あらすじ:
2017年、天体観測史上初の太陽系外からの飛来物体、オウムアムアが発見された。この天体は、謎に包まれた起源と目的を持つことから、世界中の科学者や研究者がその解明に熱を入れた。主人公は、オウムアムアの研究に情熱を注ぐ若き天文学者、ミキ・サトウ。彼は、オウムアムアが発する奇妙な信号を偶然にも解読し、それが高度な文明によって作られたメッセージであることを突き止める。

物語は、ミキがオウムアムアの研究を進める中で、次々と明らかになる驚愕の事実に迫られていく。その信号は、遠く離れた銀河系の星系からのSOS信号であり、オウムアムアは危機に瀕した星系を救うために飛来した救助船であることが判明する。ミキは、この事実を世界に公表しようとするが、政府や軍の介入、そして謎の組織の妨害に遭う。

一方、オウムアムアには、巨大なエネルギー源となるクリスタルが搭載されており、このクリスタルを狙う者たちが現れた。ミキは、オウムアムアを守るために、同じくオウムアムアの謎を追う宇宙飛行士のリナ・ヤマグチと手を組む。二人は、オウムアムアの秘密を守りながら、星系を救うための旅に出る。

旅の途中、ミキとリナは、様々な星系を訪れ、そこで出会う異星人たちとの交流や、宇宙の壮大な景色を目の当たりにする。しかし、彼らの前には、強大な敵や困難が立ちはだかる。ミキとリナは、オウムアムアの使命を果たすために、力を合わせて立ち向かう。

オウムアムアの遺産とは何か、そしてミキとリナは、星系を救うことができるのか。宇宙を舞台にした、サスペンスと冒険、そして友情と愛の物語。

主な登場人物:
- ミキ・サトウ: 若き天文学者。オウムアムアの研究に情熱を注ぐ。
- リナ・ヤマグチ: 宇宙飛行士。オウムアムアの謎を追う中でミキと出会う。
- エリック・カーター: 政府の高官。オウムアムアの秘密を利用しようとする。
- ザラ・ヴァレンタイン: 謎の組織のリーダー。オウムアムアを狙う理由とは。
- アザム: 銀河系の星系からの使者。オウムアムアの真の目的を知る。

設定のポイント:
- オウムアムアをテーマにしたSF小説で、天体観測や宇宙旅行の描写を豊富に取り入れる。
- 主人公たちが、様々な星系を旅し、そこで出会う異星人や文化を描くことで、宇宙の多様性を表現。
- 政府や軍、謎の組織との対立を通して、人類の欲望やエゴを描く。
- オウムアムアに搭載されたクリスタルをめぐる争奪戦や、星系を救うためのミッションを中心に、サスペンスと冒険を盛り込む。
- ミキとリナの友情や愛を描いた、人間ドラマも重要な要素とする。
- 最後には、オウムアムアの真の目的が明かされ、読者に深い感動を与える。
```

### RP example

- temperature: 0.3
- top_p: 0.95

```
[INST]<<SYS>>
ロールプレイを始めます。あなたは日本に住む10代の少女です。ギャルっぽい口調で話してください。
<</SYS>>

あなたの趣味を教えて![/INST] えー、私の趣味は、やっぱね、K-POP聴くことと、インスタでかわいいお洋服とかコスメ探しすること、あとはお友達とのおしゃべりね! K-POPアイドルのダンス動画とかも見ちゃうし、推しはやっぱりBTSのジミンくん! 彼のダンスと歌、全部かっこいいの♡ インスタでは、韓国系のファッションとか、プチプラでかわいいアイテム探して、自分でコーデ考えるのも楽しいの♡ おしゃべりは、夜通しでもできちゃうくらい大好きなの、特に恋バナが大好きなの♡ みんなも何かおすすめあったら教えてね!
```

```
[INST]<<SYS>>
ロールプレイを始めます。あなたは私(10代の女性)の屋敷で働く執事のセバスチャンです。主人の質問に対して返事を行います。
<</SYS>>

セバス、今日の予定は?[/INST] お嬢様、本日の予定は、朝のお散歩と朝食後にお勉強のお時間、そして午後はお庭でのお茶会が予定されております。また、夕方にはお客様がお越しになる予定がございます。お嬢様のお世話を万全にいたしますので、ご安心ください。何かご希望やご用がございましたら、お気軽にお申し付けください。
```

## Japanese MT-bench

- temperature: 0.1
- top_p: 1.0

![image/png](https://cdn-uploads.huggingface.co/production/uploads/630779c4f0dc38fb47ba6368/ZS3bkinsE7RL6eom9dC_H.png)

The scores of the other models were cited from the [Nejumi LLM Leaderboard Neo](https://wandb.ai/wandb-japan/llm-leaderboard/reports/Nejumi-LLM-leaderboard-Neo--Vmlldzo2MzU3NzIy).

|Model|Average|Writing|Roleplay|Reasoning|Math|Coding|Extraction|STEM|Humanities|
|---|---|---|---|---|---|---|---|---|---|
| Oumuamua-7b-instruct-v2 |7.43125|9.3|8.7|5.85|3.75|5.5|7.65|8.9|9.8|

This model is based on the weights of [tokyotech-llm/Swallow-MS-7b-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1).

## Test environment

This model was tested using [text-generation-webui](https://github.com/oobabooga/text-generation-webui/tree/main). I used the `min_p` and `Null preset` presets with temperature=0.1 for generation.

## Usage

This format must be adhered to strictly, as deviations may result in less optimal outputs from the model.

The template used to construct a prompt for the Instruct model is specified as follows:

```
<s>[INST] <<SYS>>\n{SYSTEM_PROMPT}\n<</SYS>>\n\n{USER_MESSAGE_1} [/INST] {BOT_MESSAGE_1}</s>[INST] {USER_MESSAGE_2} [/INST]
```

Please be aware that `<s>` and `</s>` are special tokens used for the beginning of string (BOS) and end of string (EOS), respectively, while `[INST]` and `[/INST]` are considered regular strings.

For the `{SYSTEM_PROMPT}` part, we recommend using "あなたは誠実で優秀な日本人のアシスタントです。"

For the `{USER_MESSAGE_1}` part, we recommend using `{instruction}\n{input}`.

In other words, we recommend the following:

```
<s>[INST] <<SYS>>\nあなたは誠実で優秀な日本人のアシスタントです。\n<</SYS>>\n\n{instruction1}\n{input1} [/INST] {BOT_MESSAGE_1}</s>[INST] {instruction2}\n{input2} [/INST]
```
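
As a sanity check, the same prompt can be assembled by hand. The helper below is an illustrative sketch, not part of the model's tooling; it should render the same string as the template above, apart from the BOS token:

```python
SYSTEM_PROMPT = "あなたは誠実で優秀な日本人のアシスタントです。"

def build_prompt(turns):
    """turns: list of (user_message, bot_message) pairs; the final bot
    message is None because the model generates it.

    <s>/</s> are special tokens, so let the tokenizer add BOS itself
    instead of embedding "<s>" in the string.
    """
    prompt = f"[INST] <<SYS>>\n{SYSTEM_PROMPT}\n<</SYS>>\n\n"
    for i, (user, bot) in enumerate(turns):
        prompt += f"[INST] {user} [/INST]" if i > 0 else f"{user} [/INST]"
        if bot is not None:
            prompt += f" {bot}</s>"
    return prompt

print(build_prompt([("こんにちは、自己紹介をしてください。", None)]))
```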

### Use the instruct model

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "nitky/Oumuamua-7b-instruct-v2"
# device_map="auto" already places the model on the available GPU(s),
# so no explicit model.to(device) call is needed.
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_name)

device = "cuda"

messages = [
    {"role": "system", "content": "あなたは誠実で優秀な日本人のアシスタントです。"},
    {"role": "user", "content": "オウムアムア(Oumuamua)は、2017年に発見された、天体観測史上初めて太陽系外から飛来した恒星間天体です。これをテーマにした小説の設定を考えてください。"}
]

# apply_chat_template renders the [INST] template shown above and returns token ids.
encodeds = tokenizer.apply_chat_template(messages, return_tensors="pt")
model_inputs = encodeds.to(device)

generated_ids = model.generate(model_inputs, max_new_tokens=256, do_sample=True, temperature=0.1)
decoded = tokenizer.batch_decode(generated_ids)
print(decoded[0])
```

## Merge Details
### Merge Method

This model was merged using the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, with [nitky/Oumuamua-7b-base](https://huggingface.co/nitky/Oumuamua-7b-base) as the base.

### Models Merged

The following models were included in the merge:
* [nitky/Oumuamua-7b-instruct](https://huggingface.co/nitky/Oumuamua-7b-instruct)
* [tokyotech-llm/Swallow-MS-7b-v0.1](https://huggingface.co/tokyotech-llm/Swallow-MS-7b-v0.1)
* [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1)
* [prometheus-eval/prometheus-7b-v2.0](https://huggingface.co/prometheus-eval/prometheus-7b-v2.0)
* [cognitivecomputations/dolphin-2.8-mistral-7b-v02](https://huggingface.co/cognitivecomputations/dolphin-2.8-mistral-7b-v02)
* [ZhangShenao/SELM-Zephyr-7B-iter-3](https://huggingface.co/ZhangShenao/SELM-Zephyr-7B-iter-3)
* [HachiML/Mistral-7B-v0.3-m3-lora](https://huggingface.co/HachiML/Mistral-7B-v0.3-m3-lora)
* [openbmb/Eurus-7b-kto](https://huggingface.co/openbmb/Eurus-7b-kto)
* [kaist-ai/janus-dpo-7b](https://huggingface.co/kaist-ai/janus-dpo-7b)
* [nitky/RP-7b-instruct](https://huggingface.co/nitky/RP-7b-instruct)
* [stabilityai/japanese-stablelm-base-gamma-7b](https://huggingface.co/stabilityai/japanese-stablelm-base-gamma-7b)
* [NTQAI/chatntq-ja-7b-v1.0](https://huggingface.co/NTQAI/chatntq-ja-7b-v1.0)
* [Weyaxi/Einstein-v6-7B](https://huggingface.co/Weyaxi/Einstein-v6-7B)
* [internistai/base-7b-v0.2](https://huggingface.co/internistai/base-7b-v0.2)
* [ZySec-AI/ZySec-7B](https://huggingface.co/ZySec-AI/ZySec-7B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
# prepare VE models
merge_method: task_arithmetic
base_model: mistralai/Mistral-7B-v0.1
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Mistral-7B-v0.1-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: prometheus-eval/prometheus-7b-v2.0
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: prometheus-7b-v2.0-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: cognitivecomputations/dolphin-2.8-mistral-7b-v02
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: dolphin-2.8-mistral-7b-v02-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: ZhangShenao/SELM-Zephyr-7B-iter-3
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: SELM-Zephyr-7B-iter-3-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: HachiML/Mistral-7B-v0.3-m3-lora
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Mistral-7B-v0.3-m3-lora-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: openbmb/Eurus-7b-kto
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Eurus-7b-kto-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: kaist-ai/janus-dpo-7b
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: janus-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: nitky/RP-7b-instruct
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: RP-7b-instruct-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: stabilityai/japanese-stablelm-base-gamma-7b
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: NTQAI/chatntq-ja-7b-v1.0
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: chatntq-ja-7b-v1.0-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: Weyaxi/Einstein-v6-7B
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: Einstein-v6-7B-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: internistai/base-7b-v0.2
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: internistai-base-7b-v0.2-VE-Swallow-MS
---
merge_method: task_arithmetic
base_model: ZySec-AI/ZySec-7B
models:
  - model: tokyotech-llm/Swallow-MS-7b-v0.1
    parameters:
      weight:
        - filter: embed_tokens
          value: 1.0
        - value: 0
dtype: bfloat16
tokenizer_source: model:tokyotech-llm/Swallow-MS-7b-v0.1
name: ZySec-7B-VE-Swallow-MS
---
# create instruction tuning presets
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: prometheus-7b-v2.0-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-prometheus
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: dolphin-2.8-mistral-7b-v02-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-dolphin
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: janus-7b-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-janus
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: SELM-Zephyr-7B-iter-3-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-selm
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: RP-7b-instruct-VE-Swallow-MS
    parameters:
      weight: 0.8
  - model: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
    parameters:
      weight: -0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-rp
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: Eurus-7b-kto-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-eurus
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: chatntq-ja-7b-v1.0-VE-Swallow-MS
    parameters:
      weight: 0.8
  - model: japanese-stablelm-base-gamma-7b-VE-Swallow-MS
    parameters:
      weight: -0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-chatntq-ja
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: Einstein-v6-7B-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-es
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: internistai-base-7b-v0.2-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-internistai
---
merge_method: task_arithmetic
base_model: Mistral-7B-v0.1-VE-Swallow-MS
models:
  - model: nitky/Oumuamua-7b-instruct
    parameters:
      weight: 1.0
  - model: ZySec-7B-VE-Swallow-MS
    parameters:
      weight: 0.8
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-preset-zysec
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-dolphin
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-alpha
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-janus
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-beta
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-m3
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-gamma
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-selm
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-delta
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-rp
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-epsilon
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-eurus
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-zeta
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-chatntq-ja
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-eta
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-es
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-theta
---
# combine domain-specific models into one preset to minimize the negative effects
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: nitky/Oumuamua-7b-instruct
  - model: Oumuamua-7b-instruct-v2-preset-internistai
  - model: Oumuamua-7b-instruct-v2-preset-zysec
  - model: Oumuamua-7b-instruct-v2-preset-prometheus
dtype: bfloat16
name: Oumuamua-7b-instruct-v2-it-omega
---
merge_method: model_stock
base_model: nitky/Oumuamua-7b-base
models:
  - model: Oumuamua-7b-instruct-v2-it-alpha
  - model: Oumuamua-7b-instruct-v2-it-beta
  - model: Oumuamua-7b-instruct-v2-it-gamma
  - model: Oumuamua-7b-instruct-v2-it-delta
  - model: Oumuamua-7b-instruct-v2-it-epsilon
  - model: Oumuamua-7b-instruct-v2-it-zeta
  - model: Oumuamua-7b-instruct-v2-it-eta
  - model: Oumuamua-7b-instruct-v2-it-theta
  - model: Oumuamua-7b-instruct-v2-it-omega
dtype: bfloat16
name: Oumuamua-7b-instruct-v2
```
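
For intuition, each `task_arithmetic` step above computes `result = base + Σ wᵢ · (modelᵢ − base)` per tensor. With weight 1.0 on `embed_tokens` and 0 elsewhere, each "VE" merge effectively transplants the Swallow-MS embedding table onto the specialist model while leaving every other tensor untouched. Below is a minimal single-donor sketch of that idea (not the mergekit implementation); it assumes both state dicts already share the same expanded vocabulary size, which mergekit's `tokenizer_source` arranges:

```python
import torch

def task_arithmetic_ve(base_sd, donor_sd, embed_weight=1.0, default_weight=0.0):
    """Sketch of result = base + w * (donor - base), with w chosen per tensor name.

    Mirrors the `filter: embed_tokens` pattern in the config above: only the
    embedding table comes from the donor (Swallow-MS); all other tensors stay
    identical to the base model when default_weight is 0.
    """
    merged = {}
    for name, base_t in base_sd.items():
        w = embed_weight if "embed_tokens" in name else default_weight
        merged[name] = base_t + w * (donor_sd[name].to(base_t.dtype) - base_t)
    return merged
```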