Join the conversation

Join the community of Machine Learners and AI enthusiasts.

Sign Up
posted an update Jan 22
๐—›๐—ผ๐˜„ ๐˜„๐—ฒ ๐—บ๐—ฎ๐—ฑ๐—ฒ ๐—š๐—ฟ๐—ฎ๐—ฑ๐—ถ๐—ผ ๐—ณ๐—ฎ๐˜€๐˜๐—ฒ๐—ฟ ๐—ฏ๐˜†... ๐˜€๐—น๐—ผ๐˜„๐—ถ๐—ป๐—ด ๐—ถ๐˜ ๐—ฑ๐—ผ๐˜„๐—ป!

About a month ago, @oobabooga (who built the popular text generation webui) reported an interesting issue to the Gradio team. After upgrading to Gradio 4, @oobabooga noticed that chatbots that streamed very quickly had a lag before their text would show up in the Gradio app.

After some investigation, we determined that the Gradio frontend would receive the updates from the backend immediately, but the browser would lag before rendering the changes on the screen. The main difference between Gradio 3 and Gradio 4 was that we migrated the communication protocol between the backend and frontend from Websockets (WS) to Server-Side Events (SSE), but we couldn't figure out why this would affect the browser's ability to render the streaming updates it was receiving.

After diving deep into browsers events, @aliabid94 and @pngwn made a realization: most browsers treat WS events (specifically the WebSocket.onmessage function) with a lower priority than SSE events (EventSource.onmessage function), which allowed the browser to repaint the window between WS messages. With SSE, the streaming updates would stack up in the browser's event stack and be prioritized over any browser repaint. The browser would eventually clear the stack but it would take some time to go through each update, which produced a lag.

We debated different options, but the solution that we implemented was to introduce throttling: we slowed down how frequently we would push updates to the browser event stack to a maximum rate of 20/sec. Although this seemingly โ€œslowed downโ€ Gradio streaming, it actually would allow browsers to process updates in real-time and provide a much better experience to end users of Gradio apps.

See the PR here:

Kudos to @aliabid94 and @pngwn for the fix, and to @oobabooga and @pseudotensor for helping us test it out!

Sounds very promising !

This is an incredible example of listening to user feedback and improving the product at a rapid pace. Also a great example of the amazing technical depth of the team and smart problem solving.

Interesting, I had noticed issues before and used a cruder solution of outputting "x" words at a time. I thought it was my machine and not a limitation of browsers. Good to know.