empty responses from LLM

#1
by rodrigofarias - opened

Regarding this post:
https://github.com/abacaj/mpt-30B-inference/issues/5

"I'm having the same problem. Processing goes to 100% for a few seconds but returns empty answers. It uses around 24 GB of RAM.
I tested in VS Code and in cmd. Same behavior.
I've tried to debug, but the "generator" variable had no string text inside it at all.

I'm running mpt-30b-chat.ggmlv0.q5_1.bin model instead of default q4_0.

PC: Ryzen 5900X and 32 GB RAM."

I still get the empty responses, using your implementation with Gradio. Any idea why this happens?

Great work, thanks in advance!

I really don't have any idea.

But I started a Space running the q5_1 model at https://huggingface.co/spaces/mikeee/mpt-30b-chat-gglm-5bit

It seems to be running fine.
