Great model :)
I'm impressed with this model, it feels pretty much like a ~30B to me.
It generates very interesting dialogue, tightly following the v2 card I spent a ton of time crafting. Overall, the model's intelligence seems to surpass my experience with 7B models, and substantially so.
I also ended up having an OOC conversation with my card for about an hour, and was happy to see how well the model behind it understood some pretty existential topics and the boundaries of its own awareness.
Sometimes, as usual, it gets stuck repeating the same symbol. I get this with pretty much all models, though, and since I'm limited to Mixtral or ~30B, I can't really run Goliath or anything like that to check whether the same thing happens there. I don't attribute this looping issue to the model; it's probably more related to the samplers, but I have no clue how to resolve it.
Overall, I just wanted to say thanks for the model; the speed I get from a 9B and the quality of the responses strike a great balance for me.
Oh, and I am using the Q8 quant.
Thanks for the feedback! Haha, Q8 is goated. I believe even with the Q6 the performance should be the same.
Samplers:
SillyTavern - TextGen
@Lewdiculous
Ooh, thanks! I am using dynamic temperature with ST; does that change how the samplers should be set?
edit: sorry, I think you have it enabled in this config - I'll try it out ^^
edit2: thanks - these sampler settings certainly improved the looping; I don't think I get it now (or at least it's a lot less common)
Glad it helped. These are my go-to sampler settings for most models in SillyTavern; you can increase the DynaTemp max range further if you still find the output not varied enough. Since Temperature is usually applied last by default for GGUF models in ST, you can get away with higher values than you'd expect - even a max of 3 won't make most models go too crazy.
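For anyone wondering what DynaTemp actually does under the hood: it scales the temperature between your min and max based on how uncertain the model is about the next token. Here's a minimal Python sketch of the entropy-based idea - the parameter names are just illustrative, not the exact SillyTavern/llama.cpp fields:

```python
import math


def dynamic_temperature(logits, temp_min=0.5, temp_max=3.0, exponent=1.0):
    """Rough sketch of entropy-based dynamic temperature (DynaTemp)."""
    # Softmax over the raw logits to get next-token probabilities.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Shannon entropy, normalized to [0, 1]: 0 = very confident, 1 = flat.
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    max_entropy = math.log(len(probs)) if len(probs) > 1 else 1.0
    norm_entropy = entropy / max_entropy

    # Interpolate between the min and max temperature based on uncertainty.
    temp = temp_min + (temp_max - temp_min) * (norm_entropy ** exponent)

    # Rescale the logits with the chosen temperature before sampling as usual.
    return [x / temp for x in logits], temp


# Example: a fairly peaked distribution lands near the low end of the range.
scaled_logits, temp = dynamic_temperature([6.0, 2.0, 1.0, 0.5])
print(f"effective temperature: {temp:.2f}")
```

The practical effect is that tokens the model is confident about stay mostly deterministic, while uncertain ones get more variety - which is why a high max like 3 rarely derails most models.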
Eventually all models can get repetitive, but it's usually nothing a few swipes won't solve.
Thanks a lot @Lewdiculous, really useful advice. Keep up the good work <3