Update README.md
README.md CHANGED
@@ -10,7 +10,7 @@ license: cc-by-nc-4.0
 ---
 
 # kunoichi-lemon-royale-v2-32K-7B
 
-This merge amounts to the grafting of a model derived from Mistral v0.1 (4K sliding window context, to a maximum of 8K practical context length) onto a model derived from Mistral v0.2 (32K context length). It appears to work, although rope_theta in config.json was lowered from 1000000.0 to
+This merge amounts to the grafting of a model derived from Mistral v0.1 (4K sliding window context, to a maximum of 8K practical context length) onto a model derived from Mistral v0.2 (32K context length). It appears to work, although rope_theta in config.json was lowered from 1000000.0 to 100000.0, which works well enough to 16K.
 
 In light testing, this model appears to follow formatting very well, with temperature 1.0 and minP 0.01, using ChatML prompts, even though the underlying model claims to follow Alpaca prompts.
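
The rope_theta change described in the hunk above amounts to editing a single key in the merged model's config.json. A minimal sketch in Python, assuming the standard Hugging Face config.json layout; the local path is hypothetical:

```python
# Sketch: lower rope_theta in the merged model's config.json.
# The "rope_theta" key and the 100000.0 value come from the README text;
# the file path is a hypothetical local checkout.
import json

config_path = "kunoichi-lemon-royale-v2-32K-7B/config.json"  # hypothetical path

with open(config_path) as f:
    config = json.load(f)

config["rope_theta"] = 100000.0  # lowered from the Mistral v0.2 default of 1000000.0

with open(config_path, "w") as f:
    json.dump(config, f, indent=2)
```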
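
The ChatML prompt format and sampler settings mentioned in the README could look roughly like the sketch below. The system/user text is illustrative, and the min_p key name is assumed from llama.cpp-style backends rather than taken from the model card:

```python
# Sketch of a ChatML-formatted prompt plus the sampler settings from the README
# (temperature 1.0, min-P 0.01). Only the two numeric values are from the source.
system = "You are a helpful assistant."
user = "Summarize the plot of Hamlet in two sentences."

prompt = (
    f"<|im_start|>system\n{system}<|im_end|>\n"
    f"<|im_start|>user\n{user}<|im_end|>\n"
    f"<|im_start|>assistant\n"
)

sampler_settings = {
    "temperature": 1.0,
    "min_p": 0.01,  # key name varies by frontend; this follows llama.cpp conventions
}
```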