How to stream output

#35
by DeyangKong - opened

I want streaming output, so I changed the code to
response = model.chat(tokenizer, messages, stream=True)
The output of print(response) is
<transformers_modules.Baichuan2-13B-Chat.generation_utils.TextIterStreamer object at 0x7fafd21aa4d0>
How should I stream the result? Thanks.

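I fumbled with this for quite a while too. The core idea: each item the streamer yields is the text generated so far, so on every iteration print only the part that is new compared with the previous output.
That gives you streaming output: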

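messages = []
# Prompt: explain the saying "温故而知新" (review the old to learn the new)
messages.append({"role": "user", "content": "解释一下“温故而知新”"})
streamer = model.chat(tokenizer, messages, stream=True)
last_str = ""
for stream_output in streamer:
    # stream_output is the text so far; print only the new tail, unbuffered
    print(stream_output[len(last_str):], end="", flush=True)
    last_str = stream_output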
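
If it helps, the same idea can be wrapped in a small generator. This is only a sketch: iter_new_text is a name made up here, and it assumes every item the streamer yields is the full text so far, which is what the loop above relies on. A plain list stands in for the streamer so it runs without a model.

```python
from typing import Iterable, Iterator


def iter_new_text(cumulative_outputs: Iterable[str]) -> Iterator[str]:
    """Yield only the text added since the previous item.

    Hypothetical helper; assumes each item is the full text generated so far.
    """
    last_str = ""
    for output in cumulative_outputs:
        new_text = output[len(last_str):]
        last_str = output
        if new_text:  # skip steps that added nothing visible
            yield new_text


# A plain list stands in for the streamer returned by model.chat(..., stream=True).
fake_streamer = ["Hel", "Hello", "Hello, wor", "Hello, world"]
for piece in iter_new_text(fake_streamer):
    print(piece, end="", flush=True)
print()
```

With the real streamer you would iterate over iter_new_text(streamer) and keep a running join of the pieces if you need the final reply.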

Going one step further, streaming output in a multi-turn conversation looks like this:

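# The system message carries background information for the whole conversation
messages = [{"role": "system", "content": "Put some background information here"}]
while True:
    input_str = input("User: ")
    if input_str == "stop":
        print("Assistant: Goodbye!")
        break
    messages.append({"role": "user", "content": input_str})
    streamer = model.chat(tokenizer, messages, stream=True)
    print("Assistant: ", end="")
    last_str = ""
    for stream_output in streamer:
        # Print only the text added since the previous iteration, unbuffered
        print(stream_output[len(last_str):], end="", flush=True)
        last_str = stream_output
    print()
    # Store the full reply so the next turn sees the conversation history
    messages.append({"role": "assistant", "content": last_str})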

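Going by the Hugging Face documentation on using a streamer with generate, it looks like you have to start a separate thread for the model to generate in and then consume the streamer synchronously in the calling thread. Only then is the output streamed in real time.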
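
To make that pattern concrete, here is a rough sketch using only the standard library. QueueStreamer, fake_generate and stream_in_background are hypothetical names that stand in for the real pieces; in transformers the documented equivalents are TextIteratorStreamer and a Thread whose target is model.generate with streamer=streamer passed in. No model is loaded here, so treat it as an illustration of the thread-plus-queue idea rather than the library's implementation.

```python
import time
from queue import Queue
from threading import Thread


class QueueStreamer:
    """Toy stand-in for an iterator-style streamer (hypothetical class)."""

    _END = object()  # sentinel marking the end of generation

    def __init__(self):
        self._queue = Queue()

    def put(self, text):
        # Called by the producer thread for each new piece of text.
        self._queue.put(text)

    def end(self):
        # Called by the producer thread when generation is finished.
        self._queue.put(self._END)

    def __iter__(self):
        return self

    def __next__(self):
        item = self._queue.get()  # blocks until the producer supplies something
        if item is self._END:
            raise StopIteration
        return item


def fake_generate(streamer, pieces, delay=0.05):
    """Pretend model: emits one piece at a time, then signals the end."""
    for piece in pieces:
        time.sleep(delay)  # simulate per-token latency
        streamer.put(piece)
    streamer.end()


def stream_in_background(pieces, delay=0.05):
    """Start generation in a worker thread and return the streamer at once."""
    streamer = QueueStreamer()
    Thread(target=fake_generate, args=(streamer, pieces, delay), daemon=True).start()
    return streamer


# The main thread prints each piece as soon as the worker thread produces it.
for piece in stream_in_background(["Review ", "the old, ", "learn the new."]):
    print(piece, end="", flush=True)
print()
```

The point of the worker thread is that the caller gets the streamer back immediately and can start printing while generation is still running, instead of waiting for the whole reply.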
