Can you share the server code for local deploy?

#4 opened by merlinarer

Thanks for the nice work! I want to deploy the chat model on my GPUs with your playground, but I fail to process the stream properly. Can you share the server code that processes the prompt and returns the stream?
I use the following code:

    output = ""
    stream = pipe(prompt)
    for idx, response in enumerate(stream):
        output += response['generated_text'].replace(prompt, '')
        if idx == 0:
            history.append(" " + output)
        else:
            history[-1] = output
        chat = [(history[i].strip(), history[i + 1].strip()) for i in range(0, len(history) - 1, 2)]
        yield chat, history, user_message, ""

However, it only responds correctly the first time and returns nothing after that. I checked and found that on every subsequent turn the pipe just generates a `\n` after the prompt, which is why the user gets nothing.
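
For reference, here is a minimal sketch of what I imagine the server side could look like, based on transformers' `TextIteratorStreamer`. This is only my guess at one possible setup, not your actual code: `model_id` and the generation settings are placeholders, and I am assuming the tokenizer ships a chat template.

    # Minimal sketch, not the playground's real server code.
    # Assumptions: a transformers chat model whose tokenizer has a chat
    # template; `model_id` and `max_new_tokens=512` are placeholders.
    from threading import Thread

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer, TextIteratorStreamer

    model_id = "your-org/your-chat-model"  # placeholder
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16, device_map="auto"
    )

    def generate(user_message, history):
        # Rebuild the full prompt from the whole dialogue on every turn;
        # sending only the raw text tends to make chat models emit an
        # immediate stop token (which may explain the lone `\n` I see).
        messages = [
            {"role": "user" if i % 2 == 0 else "assistant", "content": turn}
            for i, turn in enumerate(history)
        ] + [{"role": "user", "content": user_message}]
        prompt = tokenizer.apply_chat_template(
            messages, tokenize=False, add_generation_prompt=True
        )

        inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
        # skip_prompt=True makes the streamer yield only newly generated
        # text, so no fragile `.replace(prompt, '')` is needed
        streamer = TextIteratorStreamer(
            tokenizer, skip_prompt=True, skip_special_tokens=True
        )
        Thread(
            target=model.generate,
            kwargs=dict(**inputs, streamer=streamer, max_new_tokens=512),
        ).start()

        output = ""
        for new_text in streamer:
            output += new_text
            yield output

Does your playground do something like this, i.e. re-apply the chat template over the full history each turn and run `generate` in a background thread while iterating the streamer? If you could share the actual server code, that would help a lot.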
