ST Settings?

#1
by SaisExperiments - opened

Is instruct necessary?
I think the chat format is the same as gemma.
Do you have recommended samplers?
I've had trouble with past gemma2 models 😶‍🌫️

Spare account ^
I got it to work with the presets in this repo: crestf411/gemma2-9B-sunfall-v0.5.2
Samplers don't seem to matter much; it's good with anything from "Naive" (0.7 temp) to "Universal Light" (1.25 temp) and almost anything in between.
It's also the first model I've ever had call me a "daft cunt" so 5/5⭐
It plays into character cards surprisingly well, does well with accents, and has a deep understanding of the situation / it's more situationally aware.
Tested with an IQ4_XS quant using Kobold.CPP_FrankenFork_v1.68Zi and KV cache quantization "4016/Kq4_0-Vf16 (10.25BPW), no-FA" with 8K context.
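If you'd rather poke at the sampler range than load the presets, here's a minimal sketch of what I mean against a local Kobold.CPP instance over its KoboldAI-style /api/v1/generate endpoint. The parameter values (top_p, rep_pen) and the prompt are illustrative, not taken from a specific preset, and the field set may vary by version/fork:

```python
# Quick sampler sanity check against a local Kobold.CPP server (sketch;
# field names follow the KoboldAI-style API and may differ by version).
import requests

payload = {
    # Gemma-style chat format, as mentioned above
    "prompt": "<start_of_turn>user\nHello there!<end_of_turn>\n<start_of_turn>model\n",
    "max_context_length": 8192,
    "max_length": 200,
    "temperature": 1.0,   # anywhere in the ~0.7-1.25 range worked for me
    "top_p": 0.95,        # illustrative value, not from a specific preset
    "rep_pen": 1.05,      # illustrative value
}

r = requests.post("http://localhost:5001/api/v1/generate", json=payload, timeout=300)
print(r.json()["results"][0]["text"])
```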

12288 context is possible in 8 GB of VRAM with IQ4_XS, batch size 128, and "4016/Kq4_0-Vf16 (10.25BPW), no-FA": it sits at 7.7 GB of VRAM usage and still maintains 25 T/s.
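For anyone curious how that fits, here's a rough back-of-the-envelope KV cache estimate. The Gemma 2 9B shape numbers, the ~4.5 bpw figure for q4_0, and the ~5 GiB weight size are my assumptions, not measurements from this setup:

```python
# Rough KV cache size estimate for the reported settings (sketch, not exact).
# Assumed Gemma 2 9B shape: 42 layers, 8 KV heads, head_dim 256.
LAYERS, KV_HEADS, HEAD_DIM = 42, 8, 256
CTX = 12288

# Number of cached values on each side (K and V) of the cache.
elems_per_side = LAYERS * CTX * KV_HEADS * HEAD_DIM

K_BPW = 4.5   # q4_0 is roughly 4.5 bits per value (4-bit data + block scales)
V_BPW = 16.0  # f16 V cache
# (4.5 + 16) / 2 = 10.25, which is presumably where the "10.25BPW" label comes from.

k_bytes = elems_per_side * K_BPW / 8
v_bytes = elems_per_side * V_BPW / 8
total_gib = (k_bytes + v_bytes) / 2**30
# ~2.5 GiB of cache on top of ~5 GiB of IQ4_XS weights plus compute buffers,
# which lines up with the ~7.7 GB I'm seeing.
print(f"KV cache @ {CTX} ctx: ~{total_gib:.2f} GiB")
```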


I haven't gotten to do a huge amount of testing but it's promising so far!
