Thanks. It's working fine

#1 by Alrat233

I used both the 2.4bpw and 2.18bpw quants of this model for comparison, because my 3090 was running low on memory with 2.4bpw. I found that with 2.18bpw, after multiple rounds of dialogue, the model easily falls into repetition: a line like "go to your bedroom" would be restated three times with three different sentence patterns. My repetition penalty is set to 1.15, using the recommended Llama 2 settings.
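For anyone tuning this value: 1.15 refers to the standard CTRL-style repetition penalty that most sampling backends implement. Here is a minimal sketch of the idea, assuming a plain logits vector; the function name is mine, not any loader's actual API:

```python
import torch

def apply_repetition_penalty(logits: torch.Tensor,
                             generated_ids: list[int],
                             penalty: float = 1.15) -> torch.Tensor:
    """CTRL-style repetition penalty (hypothetical helper, for illustration).

    Tokens that already appeared in the context get their logit divided by
    `penalty` when positive, or multiplied when negative, making them less
    likely to be sampled again. penalty=1.0 disables the effect.
    """
    scores = logits.clone()
    for tok in set(generated_ids):
        if scores[tok] > 0:
            scores[tok] /= penalty   # demote already-seen likely tokens
        else:
            scores[tok] *= penalty   # push negative logits further down
    return scores
```

Note that the penalty only discounts exact token IDs, so the model can still restate the same idea in different words, which is consistent with the paraphrased repetition I saw at 2.18bpw.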
Overall, I like the quantized model very much. Compared with a 20B model, the 70B model really understands the perspective of {{char}} and maintains its own role in role-play, and it no longer speaks or acts for {{user}} at will. It works very well.
