DavidAU committed (verified)
Commit 4184c51 · Parent(s): f1fd9e7

Update README.md

Files changed (1):
  1. README.md +14 -1
README.md CHANGED
@@ -41,7 +41,9 @@ creating a 19B model with the "Abliterated" (Uncensored) version of Deepseek Qwe
 
 The model is just over 19B because of the unqiue "shared expert" (roughly 2.5 models here) used in Qwen MOEs.
 
-The oddball configuration yields interesting "thinking/reasoning" which is stronger than either 7B model on its own.
+This "oddball" configuration yields interesting "thinking/reasoning" which is stronger than either 7B model on its own.
+
+And you can use any temp settings you want (rather than a narrow range of .4 to .8), and the model will still "think/reason".
 
 Five example generations at the bottom of this page.
 
@@ -112,6 +114,17 @@ SOFTWARE patch (by me) for Silly Tavern (front end to connect to multiple AI app
 
 ---
 
+Known Issues:
+
+---
+
+From time to time model will generate some Chinese symbols/characters, especially at higher temps. This is normal
+for DeepSeek Distill models.
+
+Reasoning/Thinking may be a little "odd" at temps 1.5+ ; you may need to regen to get a better response.
+
+---
+
 <h2>Example Generation:</h2>
 
 IQ4XS Quant, Temp 1.5, rep pen 1.06, topp: .95, minp: .05, topk: 40
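
The example-generation settings quoted above (Temp 1.5, rep pen 1.06, topp .95, minp .05, topk 40) map directly onto common GGUF runtimes. Below is a minimal sketch assuming llama-cpp-python as the backend; the commit does not name a runtime, and the quant filename is hypothetical.

```python
# Minimal sketch: apply the README's sampler settings to an IQ4_XS GGUF quant.
# Assumptions (not from the commit): llama-cpp-python runtime, illustrative file name.
from llama_cpp import Llama

llm = Llama(
    model_path="DeepSeek-R1-Distill-Qwen-19B-IQ4_XS.gguf",  # hypothetical filename
    n_ctx=4096,
)

# Settings from the README's example generation:
# Temp 1.5, rep pen 1.06, top_p .95, min_p .05, top_k 40.
out = llm(
    "Explain why the sky is blue.",
    max_tokens=512,
    temperature=1.5,
    repeat_penalty=1.06,
    top_p=0.95,
    min_p=0.05,
    top_k=40,
)
print(out["choices"][0]["text"])
```

The min-p floor of .05 is typically what keeps sampling coherent at temperatures well above the usual .4 to .8 range the added lines refer to, since it prunes low-probability tokens before the high temperature flattens the distribution.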