Model taking too much time

#7
by kanwalkhalid - opened

I am using its q_4 quantization to generate emails,
and it takes too long to generate the completion tokens. What should I do?
