empty responses from LLM
#1, opened by rodrigofarias
Regarding this post:
https://github.com/abacaj/mpt-30B-inference/issues/5
"I'm having the same problem. Processing goes to 100% for a few seconds but returns empty answers. RAM usage reaches around 24 GB.
I tested in VS Code and in cmd. Same behavior.
I've tried to debug, but the "generator" variable had no string text of any kind inside it.
I'm running the mpt-30b-chat.ggmlv0.q5_1.bin model instead of the default q4_0.
PC: Ryzen 5900X and 32 GB RAM."
I still get empty responses when using your implementation with Gradio. Any idea why this happens?
Great work, thanks in advance!
I really don't have any idea.
But I started a q5_1 at https://huggingface.co/spaces/mikeee/mpt-30b-chat-gglm-5bit
It seems to be running alright.
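A note on the debugging step quoted above: in Python, a generator object holds no text until it is iterated, so seeing no strings inside a "generator" variable in the debugger does not by itself show that generation failed. Consuming the stream and counting the pieces is a more direct check. Below is a minimal sketch, assuming the model call returns an iterator of string pieces; `fake_token_stream` and `collect_tokens` are hypothetical stand-ins for illustration, not part of any library.

```python
from typing import Iterable, Iterator, List, Tuple


def fake_token_stream(tokens: List[str]) -> Iterator[str]:
    """Hypothetical stand-in for a streaming LLM call that yields text pieces."""
    for token in tokens:
        yield token


def collect_tokens(stream: Iterable[str]) -> Tuple[str, int]:
    """Consume the stream and return the joined text and the number of pieces."""
    pieces = list(stream)
    return "".join(pieces), len(pieces)


if __name__ == "__main__":
    # A stream that produces output.
    text, count = collect_tokens(fake_token_stream(["Hello", ",", " world"]))
    print(f"{count} pieces: {text!r}")

    # A stream that produces nothing, which is the symptom reported here.
    empty_text, empty_count = collect_tokens(fake_token_stream([]))
    print(f"{empty_count} pieces: {empty_text!r}")
```

If the real stream yields zero pieces, the problem is in generation itself. If it yields pieces but nothing is displayed, the problem is more likely in how the output is passed on to the interface.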