Gryphe committed on
Commit
e2a33b3
1 Parent(s): 763e27d

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -30,7 +30,7 @@ After processing 12 models my algorithm ended up with the following (approximate
 
 There is no real logic in how these models were divided throughout the merge - small bits and pieces were taken from each and then mixed in with other models on a layer-by-layer basis, using a pattern similar to my MythoMax recipe, in which underlying tensors are mixed in a criss-cross manner.
 
-This new process only decides on the model's layers, not the singular lm_head and embed_tokens layers which influence much of the model's output. I ran a separate script for that, picking the singular tensors that create the longest responses, which settled on Toppy-M-7B.
+This new process only decides on the model's layers, not the singular lm_head and embed_tokens layers which influence much of the model's output. I ran a separate script for that, picking the singular tensors that resulted in the longest responses, which settled on Toppy-M-7B.
 
 ## Prompt Format
 
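The step described in the changed line, picking the lm_head and embed_tokens donor by response length, could look roughly like the sketch below. This is not Gryphe's actual script: the candidate list, prompts, generation settings, and the "mean new-token count" metric are all illustrative assumptions; only the transformers/torch APIs shown are real.

```python
# Hypothetical sketch of "pick the singular tensors that give the longest
# responses". Candidate names, prompts, and the length metric are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

CANDIDATES = ["Undi95/Toppy-M-7B", "some/other-7b-model"]  # placeholder list
PROMPTS = ["Write a short story about a dragon.", "Explain photosynthesis."]

def average_response_length(model_name: str) -> float:
    """Generate a reply for each prompt and return the mean new-token count."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(
        model_name, torch_dtype=torch.float16, device_map="auto"
    )
    lengths = []
    for prompt in PROMPTS:
        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        output = model.generate(**inputs, max_new_tokens=512, do_sample=False)
        lengths.append(output.shape[-1] - inputs["input_ids"].shape[-1])
    return sum(lengths) / len(lengths)

# Keep only the embedding and output-head tensors from the "longest" candidate;
# every other layer comes from the layer-by-layer merge described above.
best = max(CANDIDATES, key=average_response_length)
donor = AutoModelForCausalLM.from_pretrained(best, torch_dtype=torch.float16)
embed_tokens = donor.get_input_embeddings().weight   # embed_tokens tensor
lm_head = donor.get_output_embeddings().weight        # lm_head tensor
```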