Update README.md
README.md CHANGED
@@ -30,7 +30,7 @@ After processing 12 models my algorithm ended up with the following (approximate
 
 There is no real logic in how these models were divided throughout the merge - small bits and pieces were taken from each and then mixed in with other models on a layer-by-layer basis, using a pattern similar to my MythoMax recipe, in which underlying tensors are mixed in a criss-cross manner.
 
-This new process only decides on the model's layers, not the singular lm_head and embed_tokens layers, which influence much of the model's output. I ran a separate script for that, picking the singular tensors that
+This new process only decides on the model's layers, not the singular lm_head and embed_tokens layers, which influence much of the model's output. I ran a separate script for that, picking the singular tensors that resulted in the longest responses, which settled on Toppy-M-7B.
 
 ## Prompt Format
 
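The criss-cross mixing mentioned in the diff can be pictured with a minimal sketch: assuming parent checkpoints in safetensors format with identical keys and shapes, alternate which parent supplies each tensor so that neighbouring tensors within a layer come from different models. The file names and the two-parent simplification are assumptions for illustration, not the actual recipe.

```python
# Minimal sketch of criss-cross tensor mixing, assuming two parent
# checkpoints in safetensors format with identical keys and shapes.
# File names are placeholders; the real merge drew from 12 models.
from safetensors.torch import load_file, save_file

parents = [load_file("parent_a.safetensors"), load_file("parent_b.safetensors")]

merged = {}
for idx, key in enumerate(sorted(parents[0].keys())):
    # Alternate parents per tensor so adjacent tensors in a layer
    # come from different models (the "criss-cross" pattern).
    merged[key] = parents[idx % len(parents)][key]

save_file(merged, "merged.safetensors")
```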
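The singular-tensor selection step added in this commit could look roughly like the sketch below: swap each candidate's lm_head and embed_tokens into the merged model, generate, and keep the pair that produces the longest response. The paths, prompt, helper name, and candidate list are assumptions, and all candidates are assumed to share the merged model's vocabulary size.

```python
# Hedged sketch (not the author's actual script): swap each candidate's
# lm_head/embed_tokens into the merged model and keep whichever pair
# yields the longest generated response.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MERGED_PATH = "path/to/merged-model"   # hypothetical local checkpoint
CANDIDATES = ["Undi95/Toppy-M-7B"]     # plus the other source models
PROMPT = "Write a short story about a lighthouse keeper."

def new_token_count(model, tokenizer, prompt):
    """Generate greedily once and count the tokens added to the prompt."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
    return output.shape[-1] - inputs["input_ids"].shape[-1]

model = AutoModelForCausalLM.from_pretrained(MERGED_PATH, torch_dtype=torch.float16)
tokenizer = AutoTokenizer.from_pretrained(MERGED_PATH)

best = None
for name in CANDIDATES:
    donor = AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
    # Only the two singular tensors are replaced; every other layer
    # keeps the weights chosen by the layer-wise merge.
    model.get_input_embeddings().weight.data.copy_(donor.get_input_embeddings().weight.data)
    model.lm_head.weight.data.copy_(donor.lm_head.weight.data)
    del donor

    length = new_token_count(model, tokenizer, PROMPT)
    if best is None or length > best[1]:
        best = (name, length)

print(f"Longest responses: {best[0]} ({best[1]} new tokens)")
```

Greedy decoding is used here only so the length comparison is deterministic; the card does not state which decoding settings or prompts the original script used.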