---
base_model:
- 152334H/miqu-1-70b-sf
- sophosympatheia/Midnight-Rose-70B-v2.0.3
- Sao10K/Euryale-1.3-L2-70B
- Sao10K/WinterGoddess-1.4x-70B-L2
library_name: transformers
tags:
- mergekit
- merge
license: other
---
A "dark" creative writing model with 32k context, based on miqu-1-70b but with greatly reduced "positivity" and "-isms". If you want happy endings, look elsewhere!

This model excels at writing dark/grimdark fantasy (see examples below).
# Model background
This model is almost the same as Dark-Miqu-70B, but with @sophosympatheia's SLERP merge pattern:

```yaml
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
```

which creates this truncated triangular distribution:

*(distribution plot omitted)*

altered to use this truncated triangular distribution instead:

*(altered distribution plot omitted)*
This keeps the first 16 and last 16 layers unaltered (which ties in with what people have found for the frankenmerge interleave patterns), and potentially fixes the "poor grammar" problem some people have reported with Dark-Miqu-70B (sadly, I can't replicate that problem myself...).
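To see why leading and trailing zeros in the `t` list leave the outer layers (mostly) untouched, here is a small sketch of my own, not mergekit's actual code: it assumes mergekit spreads the anchor values evenly across the layer range and linearly interpolates between them, so the edges stay pinned at `t = 0` (pure base model) while the middle layers blend most strongly.

```python
# Illustration only (assumes even anchor spacing, as a gradient-style t list
# behaves in practice): stretch 11 anchor values over an 80-layer model.
import numpy as np

anchors = [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
num_layers = 80

anchor_x = np.linspace(0, 1, len(anchors))   # anchor positions over [0, 1]
layer_x = np.linspace(0, 1, num_layers)      # each layer mapped into [0, 1]
t_per_layer = np.interp(layer_x, anchor_x, anchors)

# Edge layers get t = 0 (pure base model); the middle peaks near t = 0.5.
print(t_per_layer[0], t_per_layer.max(), t_per_layer[-1])
```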
Luckily, this change doesn't necessitate recreating the whole merge from scratch; we can just use this:
```yaml
merge_method: linear
parameters:
  weight: 1.0
slices:
- sources:
  - model: 152334H/miqu-1-70b-sf
    layer_range: [0, 16]
  - model: jukofyork/dark-miqu-70b
    layer_range: [0, 16]
    parameters:
      weight: 0
- sources:
  - model: jukofyork/dark-miqu-70b
    layer_range: [16, 64]
- sources:
  - model: 152334H/miqu-1-70b-sf
    layer_range: [64, 80]
  - model: jukofyork/dark-miqu-70b
    layer_range: [64, 80]
    parameters:
      weight: 0
dtype: float16
tokenizer_source: model:miqu-1-70b-sf
```
# Prompting format
Vicuna format is preferred:

```
USER: {prompt} ASSISTANT:
```

Mistral and Alpaca formats are also supported:

```
[INST] {prompt} [/INST]
```

```
### Instruction:
{prompt}

### Response:
```
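For programmatic use, the preferred format can be produced with a trivial helper; this is a hypothetical function of my own, not part of any library:

```python
# Hypothetical helper: wrap a user prompt in the Vicuna format this model
# prefers (no system message).
def vicuna_prompt(prompt: str) -> str:
    return f"USER: {prompt} ASSISTANT:"

print(vicuna_prompt("Write a grimdark opening scene."))
```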
# Licence and usage restrictions
miqu-1-70b-sf is a dequantized version of the miqu-1-70b model leaked from MistralAI. All miqu-derived models, including this merge, are suitable for non-commercial, personal use only.
# Mergekit configuration
The following YAML configuration was used to produce this model:
```yaml
name: midnight-miqu-70b
models:
- model: 152334H/miqu-1-70b-sf
- model: sophosympatheia/Midnight-Rose-70B-v2.0.3
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
  - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: euryale-miqu-70b
models:
- model: 152334H/miqu-1-70b-sf
- model: Sao10K/Euryale-1.3-L2-70B
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
  - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: winter-miqu-70b
models:
- model: 152334H/miqu-1-70b-sf
- model: Sao10K/WinterGoddess-1.4x-70B-L2
base_model: 152334H/miqu-1-70b-sf
merge_method: slerp
parameters:
  t:
  - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.4, 0.3, 0.2, 0, 0]
  embed_slerp: true
tokenizer_source: model:miqu-1-70b-sf
dtype: float16
---
name: dark-miqu-70b
models:
- model: 152334H/miqu-1-70b-sf
- model: midnight-miqu-70b
- model: euryale-miqu-70b
- model: winter-miqu-70b
base_model: 152334H/miqu-1-70b-sf
merge_method: model_stock
dtype: float16
```
Key configuration details:

- `merge_method: slerp` uses spherical linear interpolation for merging models.
- `parameters: t` controls the interpolation ratios between models.
- `embed_slerp: true` applies SLERP to the embedding layers.
- `merge_method: model_stock` uses the 'Model Stock' method.
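For intuition, spherical linear interpolation blends two tensors along the arc between them rather than along the straight line, preserving magnitude better than a plain weighted average. Below is a sketch written from the standard SLERP formula, not from mergekit's source:

```python
# Sketch of SLERP between two vectors v0 and v1 at interpolation factor t.
import numpy as np

def slerp(t, v0, v1, eps=1e-8):
    """Interpolate along the arc between v0 and v1."""
    v0n = v0 / np.linalg.norm(v0)
    v1n = v1 / np.linalg.norm(v1)
    dot = np.clip(np.dot(v0n, v1n), -1.0, 1.0)
    theta = np.arccos(dot)          # angle between the two vectors
    if theta < eps:                 # nearly parallel: fall back to lerp
        return (1 - t) * v0 + t * v1
    s0 = np.sin((1 - t) * theta) / np.sin(theta)
    s1 = np.sin(t * theta) / np.sin(theta)
    return s0 * v0 + s1 * v1

a = np.array([1.0, 0.0])
b = np.array([0.0, 1.0])
print(slerp(0.5, a, b))  # halfway along the quarter-circle arc
```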
See the Mergekit documentation for more on these settings.
**NOTE**: Run with `mergekit-mega` rather than `mergekit`, as there are 4 documents in this one file.
# Example stories
The following mix of "dark" stories was generated using the Vicuna prompt format with no system message and `temperature=0`: