---
base_model:
- 152334H/miqu-1-70b-sf
- sophosympatheia/Midnight-Rose-70B-v2.0.3
- Sao10K/Euryale-1.3-L2-70B
- Sao10K/WinterGoddess-1.4x-70B-L2
library_name: transformers
tags:
- mergekit
- merge
license: other
---

![Dusk-Miqu.png](Dusk-Miqu.png)

A "dark" creative-writing model with 32k context. Based on [miqu-1-70b](https://huggingface.co/miqudev/miqu-1-70b) but with greatly reduced "positivity" and "-isms". If you want happy endings, look elsewhere!

This model **excels** at writing Dark/Grimdark fantasy (see examples below).

# Model background

This model is almost identical to [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B), except that @sophosympatheia's SLERP merge pattern:

```yaml
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
```

which creates this truncated triangular distribution:

![Dark-Miqu-Distribution.png](https://cdn-uploads.huggingface.co/production/uploads/65995c45539c808e84c38bf1/guNF-5tEcKxeGVxyJcfCO.png)

has been altered to use this truncated triangular distribution instead:

![Dark-Miqu-Distribution-2.png](https://cdn-uploads.huggingface.co/production/uploads/65995c45539c808e84c38bf1/62P3bxJkTgw_-6gqpqG2g.png)

This keeps the first 16 and last 16 layers unaltered (which ties in with what people have found for frankenmerge interleave patterns), and potentially fixes the ["poor grammar"](https://huggingface.co/jukofyork/Dark-Miqu-70B/discussions/2) problem some people are having with [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) (sadly I can't replicate this myself, though...).

Luckily this change doesn't require recreating the whole merge from scratch; we can just use this:

```yaml
merge_method: linear
parameters:
  weight: 1.0
slices:
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [0, 16]
      - model: jukofyork/dark-miqu-70b
        layer_range: [0, 16]
        parameters:
          weight: 0
  - sources:
      - model: jukofyork/dark-miqu-70b
        layer_range: [16, 64]
  - sources:
      - model: 152334H/miqu-1-70b-sf
        layer_range: [64, 80]
      - model: jukofyork/dark-miqu-70b
        layer_range: [64, 80]
        parameters:
          weight: 0
dtype: float16
tokenizer_source: model:miqu-1-70b-sf
```

# Prompting format

Vicuna format is preferred:

```
USER: {prompt}
ASSISTANT:
```

Mistral and Alpaca formats are also supported:

```
[INST] {prompt} [/INST]
```

```
### Instruction:
{prompt}

### Response:
```

# Licence and usage restrictions

[miqu-1-70b-sf](https://huggingface.co/152334H/miqu-1-70b-sf) is a dequantized version of the [miqu-1-70b](https://huggingface.co/miqudev/miqu-1-70b) model leaked from MistralAI. All miqu-derived models, including this merge, are suitable for non-commercial, personal use only.
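As background for the configurations in the next section, `merge_method: slerp` interpolates each pair of weight tensors along the arc between them rather than along a straight line. Below is a minimal sketch of the formula using NumPy and random stand-in tensors; it is only an illustration, not mergekit's own implementation, and the tensor shapes and `t=0.5` value (the peak of the gradient above) are chosen just for the example:

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation from a (t=0) to b (t=1)."""
    a_flat, b_flat = a.ravel(), b.ravel()
    # Measure the angle between the two tensors using their unit vectors
    a_unit = a_flat / (np.linalg.norm(a_flat) + eps)
    b_unit = b_flat / (np.linalg.norm(b_flat) + eps)
    cos_omega = np.clip(np.dot(a_unit, b_unit), -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if np.sin(omega) < eps:
        # (Nearly) parallel tensors: fall back to plain linear interpolation
        return (1.0 - t) * a + t * b
    # Weight each tensor so the blend follows the arc between them
    w_a = np.sin((1.0 - t) * omega) / np.sin(omega)
    w_b = np.sin(t * omega) / np.sin(omega)
    return (w_a * a_flat + w_b * b_flat).reshape(a.shape)

# Hypothetical example: blend a pair of layer weights at the gradient's peak value t=0.5
layer_a = np.random.randn(1024, 1024).astype(np.float32)  # stand-in for a miqu-1-70b layer
layer_b = np.random.randn(1024, 1024).astype(np.float32)  # stand-in for a donor model's layer
merged = slerp(0.5, layer_a, layer_b)
```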
# Mergekit configuration

The following YAML configuration was used to produce [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B), from which this model is derived (see above):

```yaml
name: midnight-miqu-70b
models:
  - model: 152334H/miqu-1-70b-sf
  - model: sophosympatheia/Midnight-Rose-70B-v2.0.3
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: euryale-miqu-70b
models:
  - model: 152334H/miqu-1-70b-sf
  - model: Sao10K/Euryale-1.3-L2-70B
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: winter-miqu-70b
models:
  - model: 152334H/miqu-1-70b-sf
  - model: Sao10K/WinterGoddess-1.4x-70B-L2
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: dark-miqu-70b
models:
  - model: 152334H/miqu-1-70b-sf
  - model: midnight-miqu-70b
  - model: euryale-miqu-70b
  - model: winter-miqu-70b
base_model: 152334H/miqu-1-70b-sf
merge_method: model_stock
dtype: float16
```

## Key configuration details:

- '`merge_method: slerp`' uses spherical linear interpolation for merging models.
- '`parameters: t`' controls the interpolation ratios between models.
- '`embed_slerp: true`' applies SLERP to the embedding layers.
- '`merge_method: model_stock`' uses the '[Model Stock](https://arxiv.org/abs/2403.19522)' method.

See the [Mergekit documentation](https://github.com/arcee-ai/mergekit) for more on these settings.

**NOTE**: Run with `mergekit-mega` rather than `mergekit` as there are 4 documents in this one file.

# Example stories

The following mix of "dark" stories was generated using the Vicuna prompt format with no system message and temperature=0. A minimal sketch of an equivalent generation call is shown below, followed by the stories themselves.
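This sketch is not from the original card; it uses the Hugging Face `transformers` library, an assumed repo id, a placeholder prompt, and an assumed `max_new_tokens` value. Greedy decoding (`do_sample=False`) is the equivalent of temperature=0:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "jukofyork/Dusk-Miqu-70B"  # assumed repo id for this merge

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# Vicuna format, no system message
prompt = "Write me the opening chapter of a grimdark fantasy story."  # placeholder prompt
vicuna_prompt = f"USER: {prompt}\nASSISTANT:"

inputs = tokenizer(vicuna_prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=2048, do_sample=False)  # greedy ≈ temperature=0

# Print only the newly generated tokens, not the echoed prompt
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```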