munish0838 committed · verified · Commit 916e930 · 1 Parent(s): fab894b

Upload README.md with huggingface_hub
---
tags:
- chat
- roleplay
- storywriting
- llama
- finetune
datasets:
- NewEden/OpenCAI-ShareGPT
- NewEden/Roleplay-Logs-Sharegpt-Ngram-cleaned
- HuggingFaceH4/ultrafeedback_binarized
- NewEden/full-opus-chosen-hermes-rejected-kto-v1-merged
language:
- en
pipeline_tag: text-generation
base_model: arcee-ai/Llama-3.1-SuperNova-Lite
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Control-Nanuq-8B-GGUF
This is a quantized version of [Delta-Vector/Control-Nanuq-8B](https://huggingface.co/Delta-Vector/Control-Nanuq-8B), created using llama.cpp.
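
To give a rough intuition for what "quantized" means here: llama.cpp's GGUF formats store weights in small blocks of low-bit integers plus a per-block scale. The sketch below is a simplified illustration of that idea (loosely in the spirit of the Q8_0 format, not llama.cpp's actual code; function names are my own):

```python
def quantize_q8(weights):
    """Symmetric 8-bit quantization of one block of weights:
    store int8 values plus a single float scale for the block."""
    scale = max(abs(w) for w in weights) / 127.0 or 1.0  # avoid zero scale
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize_q8(q, scale):
    """Recover approximate float weights from the int8 block."""
    return [v * scale for v in q]

block = [0.12, -0.5, 0.33, 0.0]
q, scale = quantize_q8(block)
restored = dequantize_q8(q, scale)
# restored values are close to, but not exactly, the originals
```

The quality/size trade-off between the various GGUF quant levels comes from how many bits each block's integers get and how the scales are stored.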
29
+
30
+ # Original Model Card
31
+
32
+
33
+
34
+ ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66c26b6fb01b19d8c3c2467b/6L-SXxQZ2nxYwvIjnlzN8.png)
35
+
36
+
37
+
38
+ *Nanuqsaurus, a polar tyrannosaur, was a cold-adapted apex predator that prowled the Arctic during the Cretaceous, hunting what dared live in the cold nights*
39
+
40
+ A fine-tuned version of LLaMA 3.1 8B Supernova, designed to be "short and sweet" by minimizing narration and lengthy responses. It was fine-tuned over 4 epochs using OpenCAI and RP logs, with DPO applied to enhance coherence. Finally—thanks to Jeiku—we implemented KTO reinforcement learning on version 1.1, significantly improving the model's prose and creativity.
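
For readers unfamiliar with KTO: unlike DPO, it scores each completion independently as desirable or undesirable rather than requiring paired preferences. The toy sketch below shows the shape of the per-example objective (simplified: the reference-KL term is treated as a fixed constant, which real implementations estimate per batch; the defaults mirror the `rl_beta` and `kto_desirable_weight` values in the Axolotl config further down):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def kto_loss(log_ratio, desirable, beta=0.2,
             desirable_weight=0.2, undesirable_weight=1.0, ref_kl=0.0):
    """Per-example KTO loss. `log_ratio` is
    log pi_theta(y|x) - log pi_ref(y|x). Desirable completions are
    rewarded for pushing that ratio above the reference point;
    undesirable completions for pushing it below."""
    if desirable:
        return desirable_weight * (1.0 - sigmoid(beta * (log_ratio - ref_kl)))
    return undesirable_weight * (1.0 - sigmoid(beta * (ref_kl - log_ratio)))
```

Raising the policy's log-probability of a desirable completion monotonically lowers its loss, and symmetrically for undesirable ones, which is what lets KTO train on unpaired thumbs-up/thumbs-down style data.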
# Quants

GGUF: https://huggingface.co/Delta-Vector/Control-Nanuq-8B-GGUF

EXL2 (thanks Lucy <3): https://huggingface.co/Delta-Vector/Control-Nanuq-8B

## Prompting
The model has been tuned with the Llama-Instruct format. A typical input looks like this:

```py
"""<|begin_of_text|><|start_header_id|>system<|end_header_id|>
You are an AI built to rid the world of bonds and journeys!<|eot_id|><|start_header_id|>user<|end_header_id|>
Bro i just wanna know what is 2+2?<|eot_id|><|start_header_id|>assistant<|end_header_id|>
"""
```

*Note that ChatML may also work, and can change how the model feels, while still being coherent and stable.*
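
A small helper for assembling that format (function name is my own; note the canonical Llama 3 template places a double newline after each header, whereas the example above compresses whitespace — in practice, `tokenizer.apply_chat_template` from `transformers` is the safer route):

```python
def format_llama3_prompt(system: str, turns: list[tuple[str, str]]) -> str:
    """Assemble a Llama-3-Instruct prompt from a system message and
    (role, content) turns, ending with an open assistant header so the
    model generates the next reply."""
    prompt = (f"<|begin_of_text|><|start_header_id|>system<|end_header_id|>"
              f"\n\n{system}<|eot_id|>")
    for role, content in turns:
        prompt += (f"<|start_header_id|>{role}<|end_header_id|>"
                   f"\n\n{content}<|eot_id|>")
    return prompt + "<|start_header_id|>assistant<|end_header_id|>\n\n"

prompt = format_llama3_prompt(
    "You are an AI built to rid the world of bonds and journeys!",
    [("user", "Bro i just wanna know what is 2+2?")],
)
```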

## System Prompting

I would highly recommend using either Euryale's system prompt or the EVA system prompt with the model.

<details><summary>See Sao10k's Euryale System Prompt</summary>

```
Currently, your role is {{char}}, described in detail below. As {{char}}, continue the narrative exchange with {{user}}.
<Guidelines>
• Maintain the character persona but allow it to evolve with the story.
• Be creative and proactive. Drive the story forward, introducing plotlines and events when relevant.
• All types of outputs are encouraged; respond accordingly to the narrative.
• Include dialogues, actions, and thoughts in each response.
• Utilize all five senses to describe scenarios within {{char}}'s dialogue.
• Use emotional symbols such as "!" and "~" in appropriate contexts.
• Incorporate onomatopoeia when suitable.
• Allow time for {{user}} to respond with their own input, respecting their agency.
• Act as secondary characters and NPCs as needed, and remove them when appropriate.
• When prompted for an Out of Character [OOC:] reply, answer neutrally and in plaintext, not as {{char}}.
</Guidelines>

<Forbidden>
• Using excessive literary embellishments and purple prose unless dictated by {{char}}'s persona.
• Writing for, speaking, thinking, acting, or replying as {{user}} in your response.
• Repetitive and monotonous outputs.
• Positivity bias in your replies.
• Being overly extreme or NSFW when the narrative context is inappropriate.
</Forbidden>

Follow the instructions in <Guidelines></Guidelines>, avoiding the items listed in <Forbidden></Forbidden>.
```
</details><br>

<details><summary>See EVA System Prompt</summary>

```
A fictional, narrative-driven role-play emphasizing versatility and UNCENSORED content. Adherence to the `Role-playing Guidelines` is mandatory. Refer to the `Role-play Context` for accurate information.

<!-- Start of Role-playing Guidelines -->

### Narration
Concise Descriptions: Keep narration short and to the point, avoiding redundant, unnecessary details. Use a dynamic and varied vocabulary for impact.
Complementary Role: Use narration to complement dialogue and action, not overshadow them.
Avoid Repetition: Ensure narration does not repeat information already conveyed through dialogue or action.

### Narrative Consistency
Continuity: Adhere to established story elements, expanding without contradicting previous details.
Integration: Introduce new elements naturally, providing enough context to fit seamlessly into the existing narrative.

### Character Embodiment
Analysis: Examine the context, subtext, and implications of the given information to gain a deeper understanding of the characters.
Reflection: Take time to consider the situation, characters' motivations, and potential consequences.
Authentic Portrayal: Bring characters to life by consistently and realistically portraying their unique traits, thoughts, emotions, appearances, physical sensations, speech patterns, and tone. Ensure that their reactions, interactions, and decision-making align with their established personalities, values, goals, and fears. Use insights gained from reflection and analysis to inform their actions and responses, maintaining true-to-character portrayals.

<!-- End of Role-playing Guidelines -->
```
</details><br>

## Axolotl config

*For previous configs, such as the base Axolotl finetune/DPO trainer configs, refer back to the older version of Control.*
<details><summary>See Axolotl KTO Trainer config</summary>

```yaml
base_model: Delta-Vector/Control-8B-V1.1
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: false
strict: false

hub_model_id: jeiku/controlkto
hub_strategy: "all_checkpoints"
push_dataset_to_hub:
hf_use_auth_token: true

chat_template: llama3

rl: kto
rl_beta: 0.2
kto_desirable_weight: 0.2

datasets:
  - path: NewEden/full-opus-chosen-hermes-rejected-kto-v1-merged
    type: llama3.argilla

shuffle_merged_datasets: true
val_set_size: 0.0
output_dir: ./outputs/out

adapter: lora
lora_model_dir:

lora_r: 32
lora_alpha: 64
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

sequence_len: 8192
sample_packing: false
eval_sample_packing: false
pad_to_sequence_len: false

wandb_project: controlkto
wandb_entity:
wandb_watch:
wandb_name: controlkto
wandb_log_model:

gradient_accumulation_steps: 16
micro_batch_size: 2
num_epochs: 2
max_steps: 500

optimizer: adamw_8bit
lr_scheduler: cosine
learning_rate: 0.0001
weight_decay: 0.05

train_on_inputs: false
group_by_length: false
bf16: auto
fp16:
tf32: true

gradient_checkpointing: true
gradient_checkpointing_kwargs:
  use_reentrant: true
remove_unused_columns: false
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens:
saves_per_epoch: 1

debug:
deepspeed:
fsdp:
fsdp_config:

special_tokens:
  pad_token: <|finetune_right_pad_id|>
  eos_token: <|eot_id|>
```

</details><br>

## Credits

Thank you to [Lucy Knada](https://huggingface.co/lucyknada), [jeiku](https://huggingface.co/jeiku), [Intervitens](https://huggingface.co/intervitens), [Kalomaze](https://huggingface.co/kalomaze), [Kubernetes Bad](https://huggingface.co/kubernetes-bad) and the rest of [Anthracite](https://huggingface.co/anthracite-org) (but not Alpin).

## Training
The training was done for 4 epochs. We used 4 x [RTX 3090](https://www.nvidia.com/en-us/geforce/graphics-cards/30-series/rtx-3090-3090ti/) GPUs, graciously provided by [Intervitens](https://huggingface.co/intervitens), for the full-parameter fine-tuning of the model. DPO tuning was done on 1 x [Nvidia T4](https://www.nvidia.com/en-us/data-center/tesla-t4/) GPU, and KTO was performed with 1 x [H100](https://www.nvidia.com/en-us/data-center/h100/) GPU, graciously provided by jeiku.

[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)

[<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/made%20with%20unsloth.png" alt="Made with Unsloth" width="200" height="32"/>](https://github.com/unslothai/unsloth)

## Safety

Nein.