Update README.md
README.md
CHANGED
@@ -18,6 +18,9 @@ Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two `70b`
 - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
 - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).
 
+My hope was this would act like a longer-context version of [goliath-120b](https://huggingface.co/alpindale/goliath-120b), as [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B) has a lot of [
+Xwin-LM-70B-V0.1 ](https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1) in it and [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) has [Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B) in it.
+
 # Prompting format
 
 Vicuna format is preferred:
@@ -48,8 +51,8 @@ Mistral and Alpaca formats are also supported:
 The following YAML configuration was used to produce this model:
 
 ```yaml
-const_tag: &MODEL1 jukofyork/
-const_tag: &MODEL2 jukofyork/
+const_tag: &MODEL1 jukofyork/Dawn-Miqu-70B
+const_tag: &MODEL2 jukofyork/Dark-Miqu-70B
 
 const_tag: &QK_ATTENUATION_FACTOR 0.8408964153 # sqrt(sqrt(1/2))
 const_tag: &MLP_DOWN_SCALE_FACTOR 0.7071067812 # sqrt(1/2)
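As a rough check of the two scale factors in the YAML above, the sketch below recomputes them and shows that scaling both the query and key vectors by `sqrt(sqrt(1/2))` scales each `q·k` attention logit by `sqrt(1/2)`. The helper names (`dot`, `scaled_logit_ratio`) and the random vectors are illustrative assumptions, not part of the model card or of Mergekit. It also assumes `q_proj` and `k_proj` act as plain linear maps, so scaling the tensor scales its output by the same factor.

```python
import math
import random

# Constants copied from the YAML configuration in the diff.
QK_ATTENUATION_FACTOR = 0.8408964153  # sqrt(sqrt(1/2))
MLP_DOWN_SCALE_FACTOR = 0.7071067812  # sqrt(1/2)


def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def scaled_logit_ratio(dim=8, seed=0):
    """Ratio of the q.k logit after scaling both q and k to the logit before.

    The vectors are random stand-ins for one query row and one key row;
    they are hypothetical and not taken from any model.
    """
    rng = random.Random(seed)
    q = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    k = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    q_scaled = [QK_ATTENUATION_FACTOR * x for x in q]
    k_scaled = [QK_ATTENUATION_FACTOR * x for x in k]
    return dot(q_scaled, k_scaled) / dot(q, k)


# Recompute the factors from the expressions given in the YAML comments.
qk_exact = math.sqrt(math.sqrt(0.5))
down_exact = math.sqrt(0.5)

ratio = scaled_logit_ratio()
print(f"q_proj/k_proj factor: {qk_exact:.10f}")
print(f"down_proj factor:     {down_exact:.10f}")
print(f"q.k logit ratio:      {ratio:.10f}")
```

Running it prints the recomputed factors and a logit ratio that should agree with `sqrt(1/2)` to roughly the precision of the rounded constants.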