HuggingChat Performance Issues: Slow Responses and Loading Problems
The current models in HuggingChat are having problems: almost all of them take a very long time to respond, or keep loading without ever displaying the answer. I just want to know — has HuggingChat been having problems for the last several days?
Likewise, I am getting 502 errors when trying to use a variety of different models. E.g. DeepSeek-R1 starts the reasoning part, but then errors out with a 502 before returning the final response.
@mattasdata can you get me a list of models that have issues? Ideally with some shared conversations :)
Unfortunately, the "share conversation" feature also isn't working. However, the models I have tested are `deepseek-ai/DeepSeek-R1-Distill-Qwen-32B` and `Qwen/Qwen3-235B-A22B`. Both return 502 or say "Error in input stream". I was able to download the JSON from one conversation below:
````json
{
  "prompt": "Prompt generation failed",
  "model": "deepseek-ai/DeepSeek-R1-Distill-Qwen-32B",
  "parameters": { "return_full_text": false },
  "messages": [
    {
      "role": "system",
      "content": "",
      "createdAt": "2025-05-20T13:56:20.983Z",
      "updatedAt": "2025-05-20T13:56:20.983Z"
    },
    {
      "role": "user",
      "content": "How would I enforce the signature of a function in python? I want to return a Callable object from a factory method, and that Callable ideally would have a way of indicating what the parameter names should be for it. In code:\n\n```python\ndef factory_method(self):\n def callable_to_return(param_1: list, param_2: int):\n # this function is the one I want to define an interface for, so that param_1 and param_2 should be used as the argument names\n return (param_1, param_2)\n return callable_to_return\n```",
      "createdAt": "2025-05-20T13:56:21.409Z",
      "updatedAt": "2025-05-20T13:56:21.409Z",
      "files": []
    }
  ]
}
````
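(As an aside, the Python question embedded in that conversation — returning a callable whose parameter names are part of its interface — can be answered with a callback protocol. A minimal sketch using `typing.Protocol`; the `PairFn` name is ours, not from the conversation:)

```python
from typing import Protocol


class PairFn(Protocol):
    """Interface for the returned callable: static type checkers
    verify both the types and the parameter names (param_1, param_2)."""

    def __call__(self, param_1: list, param_2: int) -> tuple: ...


def factory_method() -> PairFn:
    # Annotating the return type as PairFn makes mypy/pyright flag any
    # returned function whose parameter names or types don't match.
    def callable_to_return(param_1: list, param_2: int) -> tuple:
        return (param_1, param_2)

    return callable_to_return


fn = factory_method()
print(fn(param_1=[1, 2], param_2=3))  # callers can rely on the keyword names
```

Note that `Protocol` is checked statically, not at runtime; enforcing the names at runtime would require inspecting `inspect.signature()` instead.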
I've been trying to reproduce but so far I'm not getting a 502. https://hf.co/chat/r/ixaxEWn?leafId=d096e2e9-d689-41f6-a4fe-313b80d1bb91
Does this happen every time for you, or only occasionally? Also, what browser/OS are you using?
EDIT: I can reproduce it now, investigating, thanks for reporting!
Found the issue! Working on a fix.
Ok, the fix is currently deploying; it should be live in 5-10 minutes.
Looks like the small model we use for tasks like reasoning status updates, conversation summaries, etc. was occasionally overloaded. When that happened, it crashed chat-ui! This should now be handled properly.
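(The actual fix isn't shown in the thread, but the general pattern — making an auxiliary model call non-fatal so an overloaded helper model can't take down the whole app — can be sketched like this; `summarize_safely` and the callables here are hypothetical names, not chat-ui's real API:)

```python
def summarize_safely(call_model, fallback="(summary unavailable)"):
    """Run an optional helper-model call (status updates, summaries).

    Hypothetical sketch: if the small model is overloaded and the call
    fails, return a fallback value instead of letting the exception
    propagate and crash the application.
    """
    try:
        return call_model()
    except Exception:
        return fallback


def overloaded_model():
    # Simulates the small task model failing with a 5xx-style error.
    raise RuntimeError("502 Bad Gateway")


print(summarize_safely(overloaded_model))   # degrades gracefully
print(summarize_safely(lambda: "summary"))  # normal path
```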
When https://huggingface.co/chat/settings/application shows "Latest deployment 5c0c578", you will know it's live!
Awesome, thanks for getting to that so quickly! I can confirm it has been resolved (after a page refresh) using the Qwen model.
Hello, we prepared a quote for a complete roof renovation during 2006. The interior painting work in the building will be carried out in-house by the MMT department.
Hi 🤗
I have also had the problem here for days that HuggingChat is very, very slow and keeps crashing. I found this thread and checked my version. It says 'Latest deployment 41a4fde', which should be older than '5c0c578'. How can I get the newer version? I have already reloaded the page with CTRL+F5 (Firefox), but nothing changes. Does anyone here have a tip for me?
@nsarrazin
Oh, thanks for asking 🙂 I've just checked and there is indeed a new version number for me: Latest deployment 6d2f047.
It may be that the crashes have become a little less frequent in the last few days. But I'm not so sure, as I haven't used HF chat as much in the last few days as I did before. 🤔
Another thing I've noticed recently is that the buttons for the community tools (HF docs, Python code, etc.) have been removed. I can't seem to reactivate them either. But that's perhaps another topic 😉
The issue with tools should also be fixed! Let me know if you still have performance problems.
@nsarrazin
Yes, the tools are back. Thank you very much 🙂 At the moment the chat seems to be working fine again. I have the impression that the 'glitches' depend on the time of day and also on the LLM I'm using. The system seems to work better at the weekend or in the morning: the answers come faster and are better prepared then. In the evening hours, an unspecific error occurs from time to time and the chat output sometimes becomes 'funny'. The chat then jumps from one language to another in the middle of a sentence, and Chinese(?) characters suddenly appear in the text. Or there are lines with meaningless content: "+++++++++++++++++++++++" 😉
Response times also vary a lot depending on which LLM I use. For example, according to my observations, the Qwen/Qwen3-235B-A22B model takes an extremely long time to even start its reasoning. Other models are sometimes much faster. But I suspect that this is in the nature of the technology 😉
These are my current observations. Perhaps they will help ...