This model is great.
It's fast and follows the characterization. Great job!
thanks😀
The model is indeed great, but I was unable to give it more free will. It seems to fixate on things a bit, using the same phrases from chat to chat.
I'm using SillyTavern with these settings:
temp = 0.5
top_k = 40
top_p = 0.7
Repetition Penalty = 1.05
Presence Penalty = 0.7
Smoothing Factor = 2
Smoothing Curve = 1
Unfortunately, I can't say I know exactly what I'm doing, but that worked for a Llama model. Could you advise on settings for your model?
I would appreciate slightly more extensive documentation on the settings.
But thanks anyway, you did a truly good job.
UPD:
After some reading, I changed the settings to:
temp = 0.5
top_k = -1
top_p = 0.7
Repetition Penalty = 1.2
Presence Penalty = 0.3
Smoothing Factor = 0.2
Smoothing Curve = 1
That gave it much more free will, but it still asks questions at every step.
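For anyone else unsure what these knobs do, here is a minimal sketch of how temperature, top_k, top_p and repetition penalty can reshape the next-token distribution. It is an illustration only: `apply_samplers` is a hypothetical helper, not SillyTavern's or any backend's actual code, the order in which samplers are applied differs between backends, and treating `top_k = -1` as "disabled" is an assumption that holds for some backends but not all. Presence penalty and the smoothing settings are left out.

```python
import math

def apply_samplers(logits, prev_tokens, temp=0.5, top_k=-1, top_p=0.7, rep_penalty=1.2):
    """Hypothetical helper: return {token_id: probability} after filtering."""
    logits = list(logits)
    # Repetition penalty (one common formulation): make already-seen tokens less likely.
    for t in set(prev_tokens):
        logits[t] = logits[t] / rep_penalty if logits[t] > 0 else logits[t] * rep_penalty
    # Temperature below 1 sharpens the distribution, above 1 flattens it.
    scaled = [x / temp for x in logits]
    # Softmax, shifted by the max for numerical stability.
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    ranked = sorted(((i, e / total) for i, e in enumerate(exps)),
                    key=lambda kv: kv[1], reverse=True)
    # top_k: keep only the k most likely tokens (assumed: values <= 0 disable it).
    if top_k > 0:
        ranked = ranked[:top_k]
    # top_p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cum = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cum += p
        if cum >= top_p:
            break
    # Renormalize so the surviving tokens sum to 1.
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}

# Toy example: four-token vocabulary, token 0 was already generated.
result = apply_samplers([2.0, 1.0, 0.0, -1.0], prev_tokens=[0])
print(result)
```

With a low temperature and a low top_p, very few candidates survive the filter, which may be one reason the output feels repetitive; raising either value lets more tokens through.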
You are right, this model likes to repeat some content, even though we have done DPO.
It likes to ask questions, which is one of our strategies. We think this can keep the experience fresh for players.
Thank you for your suggestion. We will try our best to train a better version.