Temperature or top_p is not working

#35
by chintan4560 - opened

Hello,

When I use the 4-bit or 8-bit quantized LLaMA-2-13B models, I get the same response from the model even if I change the temperature or top_p parameters for diversity. Can anyone tell me why this is happening?

I am observing the same behavior.

Please enable do_sample=True along with temperature and top_p; it will work. By default, generate() uses greedy decoding (do_sample=False), so sampling parameters like temperature and top_p are ignored, which is why the output never changes.
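
Here is a minimal sketch of what that looks like with the transformers library. The model id, prompt, and sampling values are illustrative, and loading with load_in_4bit=True assumes bitsandbytes is installed:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "meta-llama/Llama-2-13b-hf"  # illustrative model id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_4bit=True,   # 4-bit quantization via bitsandbytes (assumed installed)
    device_map="auto",
)

inputs = tokenizer("Tell me a story about a robot.", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    do_sample=True,    # without this, decoding is greedy and
                       # temperature/top_p have no effect
    temperature=0.8,   # illustrative value
    top_p=0.9,         # illustrative value
    max_new_tokens=128,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

With do_sample=True, repeated calls should now produce different outputs, since tokens are drawn from the temperature- and top_p-adjusted distribution instead of always taking the argmax.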
