DavidAU committed on
Commit
6683d08
1 Parent(s): 8842e99

Update README.md

Files changed (1)
  1. README.md +7 -20
README.md CHANGED
@@ -45,23 +45,19 @@ pipeline_tag: text-generation
45
  I took the original models in "L3-Stheno-Maid-Blackroot 8B" and completely rebuilt it a new pass-through merge (everything preserved)
46
  and blew it out to over 16.5 billion parameters - 642 tensors, 71 layers (8B original has 32 layers) at full float 32 precision.
47
 
48
- However that is where all similarity ends.
49
 
50
- I build TWO custom Llama3 models:
51
-
52
- Grand Horror 16.5B ( <A href="https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF"> here </a> ) and Grand Story 16.5B
53
- then merged these together with a "smoothing step" captured at F32 precision.
54
 
55
  (formula below, along with critical merge model notes and theory)
56
 
57
- The result is a model that is far more stable, far more capable than any of the 3 models originally nor it's the "sum" of 2 16.5B models.
58
-
59
- Compared to Grand Horror 16.5B is it over 25000 points lower (IQ4XS) in perplexity (lower is better) or 2.5 full levels of magnitude lower.
60
 
61
  It is tougher, stronger and can handle a far wider range of operating conditions - from temp .1 to temp 5 all day long.
62
 
63
- I tried for hours to get it to break, sweat or at least fart - no go.
64
-
65
  The F32 precision (along with full F32 transfer to the ggufs) increases the performance even further.
66
 
67
  This added precision increases the model's depth and nuance including "world" perception, real time in the moment
@@ -76,18 +72,12 @@ just about... any general fiction activity "AI guru" including scene generation
76
 
77
  This model is capable of horror, science fiction, romance - you name it.
78
 
79
- But I would not suggest "children's stories".
80
-
81
  This model has a very strong VIDIDINESS bias. It generates extremely vivid prose, description, and dialog as well
82
- as in the moment metaphors and similes.
83
-
84
- It rarely uses "cliches".
85
 
86
  It also has a STRONG horror bias, although it will generate content for almost any genre. That being said
87
  if there is a "hint" of things going wrong... they will.
88
 
89
- In "romance" ... let's just say it very vivid, intense and graphic - R18. (not horror)
90
-
91
  It will also swear (R-18) like there is no tomorrow at times and "dark" characters will be VERY dark so to speak.
92
 
93
  Model excels in details (real and "constructed"), descriptions, similes and metaphors including dates, times
@@ -98,9 +88,6 @@ I would also say it can have a sense of humor ... ah... dark humor.
98
  With all this being said, this model has an uncanny sense of "there" , "in the moment" and timing too.
99
  This single quality sets it apart from other models in my opinion.
100
 
101
- Although it swears to the point of pealing paint off the wall and goes "scorched Earth graphic horror" at the drop of a pin the
102
- single quality noted is worth it.
103
-
104
  Another way to put this: It does not sugar coat ANYTHING - positive or negative.
105
 
106
  These can be filtered / controlled to some degree in your prompts.
 
45
  I took the original models in "L3-Stheno-Maid-Blackroot 8B" and completely rebuilt it a new pass-through merge (everything preserved)
46
  and blew it out to over 16.5 billion parameters - 642 tensors, 71 layers (8B original has 32 layers) at full float 32 precision.
47
 
48
+ Using these three models I build TWO custom Llama3 models:
49
 
50
+ Grand Horror 16.5B ( <A href="https://huggingface.co/DavidAU/L3-Stheno-Maid-Blackroot-Grand-HORROR-16B-GGUF"> here </a> ) and
51
+ Grand Story 16.5B ALPHA (unreleased) then merged these together with a "smoothing step" captured at F32 precision.
 
 
52
 
53
  (formula below, along with critical merge model notes and theory)
54
 
55
+ The result is a model that is far more stable, far more capable than any of the 3 models originally and it is
56
+ greater than the "sum" of two 16.5B models mentioned. Compared to Grand Horror 16.5B is it over 25000 points
57
+ lower (IQ4XS) in perplexity (lower is better) or 2.5 full levels of magnitude lower.
58
 
59
  It is tougher, stronger and can handle a far wider range of operating conditions - from temp .1 to temp 5 all day long.
60
 
 
 
61
  The F32 precision (along with full F32 transfer to the ggufs) increases the performance even further.
62
 
63
  This added precision increases the model's depth and nuance including "world" perception, real time in the moment
 
72
 
73
  This model is capable of horror, science fiction, romance - you name it.
74
 
 
 
75
  This model has a very strong VIDIDINESS bias. It generates extremely vivid prose, description, and dialog as well
76
+ as in the moment metaphors and similes and it rarely uses "cliches".
 
 
77
 
78
  It also has a STRONG horror bias, although it will generate content for almost any genre. That being said
79
  if there is a "hint" of things going wrong... they will.
80
 
 
 
81
  It will also swear (R-18) like there is no tomorrow at times and "dark" characters will be VERY dark so to speak.
82
 
83
  Model excels in details (real and "constructed"), descriptions, similes and metaphors including dates, times
 
88
  With all this being said, this model has an uncanny sense of "there" , "in the moment" and timing too.
89
  This single quality sets it apart from other models in my opinion.
90
 
 
 
 
91
  Another way to put this: It does not sugar coat ANYTHING - positive or negative.
92
 
93
  These can be filtered / controlled to some degree in your prompts.