Output gibberish when inferenced by multiple users

by muhammadfhadli - opened

I use guanaco-13b-uncensored.Q5_K_M.gguf to build a chatbot with Streamlit. With only one user, it works perfectly fine. But when multiple users (two or more) use the chatbot simultaneously, it suddenly starts generating gibberish. Have you experienced this? I don't think it's related to the llama-cpp version or any other packages, right? Or maybe the GGUF model is just not that thread-friendly? Here is my video where I simulated two users using the chatbot. Please take a look if you don't mind. Thank you.
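
One thing that might be worth ruling out, and this is only a guess: if both sessions end up calling the same model object at the same time, the calls could step on each other's internal state. Below is a minimal sketch of serializing access with a lock. `FakeModel`, `LockedModel`, and `run_users` are hypothetical stand-ins made up for illustration; this is not the llama-cpp API and not my actual app code.

```python
import threading


class FakeModel:
    """Hypothetical stand-in for a model that keeps per-call state on the object."""

    def __init__(self):
        self._buffer = []  # shared scratch state, unsafe if two calls overlap

    def generate(self, prompt):
        self._buffer.clear()
        for token in prompt.split():
            self._buffer.append(token.upper())
        return " ".join(self._buffer)


class LockedModel:
    """Wraps a model so that only one generate() call runs at a time."""

    def __init__(self, model):
        self._model = model
        self._lock = threading.Lock()

    def generate(self, prompt):
        # Other callers wait here until the current generation finishes.
        with self._lock:
            return self._model.generate(prompt)


def run_users(model, prompts):
    """Simulate several users by calling the shared model from separate threads."""
    results = {}

    def worker(user_id, prompt):
        results[user_id] = model.generate(prompt)

    threads = [
        threading.Thread(target=worker, args=(i, p)) for i, p in enumerate(prompts)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

With the lock in place, each simulated user should get back only their own text, at the cost of requests waiting in line instead of running in parallel.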
