Triangle104 committed on
Commit
65fabf2
1 Parent(s): 90241fc

Update README.md

Files changed (1): README.md +69 -2
README.md CHANGED
@@ -1,5 +1,5 @@
  ---
- license: llama3
+ license: llama3.1
  license_name: llama3
  license_link: LICENSE
  library_name: transformers
@@ -22,6 +22,73 @@ base_model: crestf411/L3.1-8B-Slush-v1.1
  This model was converted to GGUF format from [`crestf411/L3.1-8B-Slush-v1.1`](https://huggingface.co/crestf411/L3.1-8B-Slush-v1.1) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/crestf411/L3.1-8B-Slush-v1.1) for more details on the model.
 
+ ---
+ Model details:
+ -
+ Slush is a two-stage model trained with high LoRA dropout. Stage 1 is a pretraining continuation on the base model, aimed at boosting the model's creativity and writing capabilities. This is then merged into the instruction-tuned model, and stage 2 is a fine-tuning step on top of that to further enhance its roleplaying capabilities and/or to repair any damage caused by the stage 1 merge.
+
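As an illustrative sketch (paths and names are hypothetical; the card doesn't publish its merge script), the stage 1 merge amounts to applying a base-trained LoRA to the instruct checkpoint with `peft`:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Load the instruct model, attach the LoRA trained on the *base* model,
# then fold the adapter weights in with merge_and_unload().
instruct = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B-Instruct", torch_dtype=torch.bfloat16
)
stage1 = PeftModel.from_pretrained(instruct, "stage1-lora")  # hypothetical adapter path
stage1.merge_and_unload().save_pretrained("stage1-on-instruct")
```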
+ This is an initial experiment done on the at-this-point-infamous Llama 3.1 8B model, in an attempt to retain its smartness while addressing its abysmal lack of imagination/creativity. As always, feedback is welcome, and begone if you demand perfection.
+
+ The second stage, like the Sunfall series, follows the Silly Tavern preset, so YMMV, in particular if you use some other tool and/or preset.
+
+ This update (v1.1) addresses some of the feedback on the first iteration by ramping down the training parameters, and it also introduces a custom merge using mergekit.
+
+ Parameter suggestions:
+ -
+ I did all my testing with temp 1, min-p 0.1, DRY 0.8. I enabled XTC at higher contexts.
+
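Assuming current llama.cpp sampler flags, and reading "DRY 0.8" as the DRY multiplier (an assumption; the card doesn't say which DRY parameter it means), those suggestions translate to something like:

```
./llama-cli --hf-repo Triangle104/L3.1-8B-Slush-v1.1-Q4_K_S-GGUF --hf-file l3.1-8b-slush-v1.1-q4_k_s.gguf \
  --temp 1.0 --min-p 0.1 \
  --dry-multiplier 0.8 \
  --xtc-probability 0.5   # hypothetical XTC value; the card gives none
```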
+ Training details:
+ -
+ Stage 1 (continued pretraining):
+ - Target: meta-llama/Llama-3.1-8B (resulting LoRA merged into meta-llama/Llama-3.1-8B-Instruct)
+ - LoRA dropout 0.5 (motivation)
+ - LoRA rank 64, alpha 128 (motivation)
+ - LR cosine 4e-6
+ - LoRA+ with LR ratio: 15
+ - Context size: 16384
+ - Gradient accumulation steps: 4
+ - Epochs: 1
+
+ Stage 2 (fine-tune):
+ - Target: the stage 1 model
+ - LoRA dropout 0.5
+ - LoRA rank 32, alpha 64
+ - LR cosine 5e-6 (min 5e-7)
+ - LoRA+ with LR ratio: 15
+ - Context size: 16384
+ - Gradient accumulation steps: 4
+ - Epochs: 2
+
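As a rough sketch of how the stage 1 numbers map onto a `peft` adapter config (target modules are an assumption, and the LoRA+ LR ratio lives in the trainer rather than in this object):

```python
from peft import LoraConfig

# Stage 1 settings from the list above; target_modules is assumed.
stage1_lora = LoraConfig(
    r=64,
    lora_alpha=128,
    lora_dropout=0.5,            # deliberately high, per the card's motivation links
    target_modules="all-linear",
    task_type="CAUSAL_LM",
)
```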
+ Merge Method
+ -
+ This model was merged using the TIES merge method, with meta-llama/Llama-3.1-8B as the base.
+
+ Configuration
+
+ The following YAML configuration was used to produce this model:
+
+ models:
+   - model: stage1-on-instruct
+     parameters:
+       weight: 1.5
+       density: 1
+   - model: stage2-on-stage1
+     parameters:
+       weight: 1.5
+       density: 1
+   - model: meta-llama/Llama-3.1-8B-Instruct
+     parameters:
+       weight: 1
+       density: 1
+ merge_method: ties
+ base_model: meta-llama/Llama-3.1-8B
+ parameters:
+   weight: 1
+   density: 1
+   normalize: true
+   int8_mask: true
+ tokenizer_source: meta-llama/Llama-3.1-8B-Instruct
+ dtype: bfloat16
+
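Saved to a file (e.g. `slush.yaml`; the filename is illustrative), a config like this is typically executed with mergekit's CLI:

```
pip install mergekit
mergekit-yaml slush.yaml ./L3.1-8B-Slush-v1.1 --cuda
```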
+ ---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
 
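The install command itself (outside this hunk's context) is the standard GGUF-my-repo one:

```
brew install llama.cpp
```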
 
@@ -60,4 +127,4 @@ Step 3: Run inference through the main binary.
  or
  ```
  ./llama-server --hf-repo Triangle104/L3.1-8B-Slush-v1.1-Q4_K_S-GGUF --hf-file l3.1-8b-slush-v1.1-q4_k_s.gguf -c 2048
- ```
+ ```
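Once llama-server is running (it listens on port 8080 by default), it exposes an OpenAI-compatible API; a quick smoke test:

```
curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "Say hello in one sentence."}]}'
```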