Feature Request: Cancel streaming

#137
by Bilgames - opened

Hello,
First, many thanks for working on this project; it is great for the open source community.
As far as I can tell, there is no easy way to cancel text generation once it has started. Being able to do so is pretty useful for chat applications, and other implementations are at least raising the same issue: https://github.com/Maximilian-Winter/llama-cpp-agent/issues/47

My idea would be to stop sampling new tokens when some condition is met, such as a boolean flag being set. Any ideas for implementing this?
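For illustration, here is a minimal sketch in Python of the boolean-flag approach, assuming the generation loop can check a shared flag between tokens. The names `generate_tokens` and `cancel_event` are hypothetical, not part of this library; a real implementation would check the flag inside the actual sampling loop.

```python
import threading

def generate_tokens(prompt, cancel_event):
    """Hypothetical streaming loop: yields tokens until done or cancelled."""
    # Stand-in for the real token-sampling loop.
    for token in ["Hello", ",", " world", "!"]:
        if cancel_event.is_set():
            break  # stop sampling new tokens once the flag is set
        yield token

cancel = threading.Event()
out = []
for i, tok in enumerate(generate_tokens("Hi", cancel)):
    out.append(tok)
    if i == 1:
        cancel.set()  # simulate the user pressing "stop" mid-stream

print(out)  # ['Hello', ',']
```

Because the flag is a `threading.Event`, a UI thread can set it safely while the generation loop runs on another thread; the stream then ends cleanly at the next token boundary.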

Using LM Studio to run models works for me: I often stop the generation, edit the AI's mistakes to steer it in the direction I want, save the changes, and then have it continue generating. This has worked on all models I have tried in the LM Studio app.
