Real-time generation

#2
by mrfakename - opened

Hi,
Thanks for making this great demo!
Would you be open to adding real-time token streaming? As you mentioned in #1, generation is quite slow on CPU. Gradio supports streaming out of the box, and I can provide some example implementations if you're open to it.
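
For illustration, here is a minimal sketch of what streaming could look like in Gradio. It relies on the fact that Gradio treats a generator function as a streaming handler and refreshes the output on every `yield`. The names `fake_token_source`, `stream_reply`, and `build_demo` are hypothetical, and the token source is a stand-in that just echoes the prompt word by word; it is not this demo's actual model code.

```python
from typing import Iterator


def fake_token_source(prompt: str) -> Iterator[str]:
    """Hypothetical stand-in for the model: emits one 'token' at a time."""
    for word in prompt.split():
        yield word + " "


def stream_reply(prompt: str) -> Iterator[str]:
    """Yield the full text generated so far after each new token.

    Gradio replaces the output component's value on every yield,
    so each yielded value must be the accumulated text, not just the new token.
    """
    text = ""
    for token in fake_token_source(prompt):
        text += token
        yield text


def build_demo():
    # Imported lazily so the generator above can be exercised without Gradio installed.
    import gradio as gr

    return gr.Interface(fn=stream_reply, inputs="text", outputs="text")


if __name__ == "__main__":
    # queue() is needed for generator handlers on older Gradio releases
    # and is harmless on newer ones, where queuing is on by default.
    build_demo().queue().launch()
```

In a real integration, `fake_token_source` would be swapped for whatever yields tokens from the model, so the user sees partial output instead of waiting for the whole generation to finish.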

Sure, that sounds like a great idea, but I don't think I'll have time to actually work on it until the weekend. If you're willing to give it a try, PRs are always welcome!

mrfakename changed discussion status to closed
