jukofyork/Deep-Miqu-103B

jukofyork committed on May 14

Commit 1d09ed1 • 1 Parent(s): 32a2680

Update README.md

Files changed (1)

README.md +5 -2

@@ -18,6 +18,9 @@ Created using [Mergekit](https://github.com/arcee-ai/mergekit) from my two `70b`
 - To fix problems with "backwards time skips" in the generated stories, the "standard" interleave pattern was replaced by repeated blocks (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2081174251)).
 - To help maintain cohesion, the '`q_proj`', '`k_proj`' and '`down_proj`' tensors were all scaled to hypothesised upper-bound values (see [here](https://github.com/arcee-ai/mergekit/issues/198#issuecomment-2063716974)).
+My hope was this would act like a longer-context version of [goliath-120b](https://huggingface.co/alpindale/goliath-120b), as [Dawn-Miqu-70B](https://huggingface.co/jukofyork/Dawn-Miqu-70B) has a lot of [
+Xwin-LM-70B-V0.1 ](https://huggingface.co/Xwin-LM/Xwin-LM-70B-V0.1) in it and [Dark-Miqu-70B](https://huggingface.co/jukofyork/Dark-Miqu-70B) has [Euryale-1.3-L2-70B](https://huggingface.co/Sao10K/Euryale-1.3-L2-70B) in it.
 # Prompting format
 Vicuna format is preferred:
@@ -48,8 +51,8 @@ Mistral and Alpaca formats are also supported:
 The following YAML configuration was used to produce this model:
 ```yaml
-const_tag: &MODEL1 jukofyork/dawn-miqu-70b
-const_tag: &MODEL2 jukofyork/dark-miqu-70b
+const_tag: &MODEL1 jukofyork/Dawn-Miqu-70B
+const_tag: &MODEL2 jukofyork/Dark-Miqu-70B
 const_tag: &QK_ATTENUATION_FACTOR 0.8408964153  # sqrt(sqrt(1/2))
 const_tag: &MLP_DOWN_SCALE_FACTOR 0.7071067812  # sqrt(1/2)