jukofyork committed on
Commit
64940bf
1 Parent(s): 3055e56

Update README.md

Files changed (1)
  1. README.md +2 -2
README.md CHANGED
@@ -4,11 +4,11 @@ license: other
 
  ![Deep-Miqu-103B.png](Deep-Miqu-103B.png)
 
- A "dark" creative writing model with 32k context.
+ A creative writing `103b` parameter "frankenmerge" model with 32k context.
 
  # Model background
 
- A `103b` parameter "frankenmerge" model created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B):
+ Created using [Mergekit](https://github.com/arcee-ai/mergekit)'s `'passthrough'` method from my two miqu-based models: [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) and [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B):
 
  - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
  - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).