Responses from the model are shown to other users

#18
by djstrong - opened
ZeroGPU Explorers org
edited Apr 18

We have an LLM chat Space: https://huggingface.co/spaces/speakleash/Bielik-7B-Instruct-v0.1 When a popular YouTuber published a video about our model, many users started using this Space. We and the users observed that the interface shows responses to questions asked by other users.
I don't think it is a bug in our code; it is quite standard, similar to other chat Spaces.

One of the comments from YouTube: "they have a bug with this model. I noticed that sometimes replies are sent to the wrong recipients. I had a situation where I sat and watched the replies generated by people for 15 minutes (in the place where the reply to me should have been)."

ZeroGPU Explorers org

@djstrong why do you have https://huggingface.co/spaces/speakleash/Bielik-7B-Instruct-v0.1/blob/main/app.py#L132 inside the GPU function in your code? Try moving any stateful operation that doesn't require the GPU outside the function that is decorated with @spaces.GPU.
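
To illustrate, a minimal sketch of the kind of pattern being described here; generate and log_to_discord are hypothetical names, not the actual app.py code:

import spaces

@spaces.GPU
def predict(message, history):
  response = generate(message, history)  # GPU work: the model call
  log_to_discord(message, response)      # stateful I/O that does not need the GPU
  return response

# The suggestion is to move log_to_discord() out of predict(), so that only
# the model call runs inside the @spaces.GPU scope.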

ZeroGPU Explorers org

@codelion Thank you! How do we escape from the @spaces.GPU function? If we run it in a thread, will it still be inside the GPU scope? We need to save the answers from the model.

ZeroGPU Explorers org

So: the main function predict is invoked without @spaces.GPU, and inside it the model is called through a function decorated with @spaces.GPU.

import spaces

@spaces.GPU  # the only function with the decorator; only the model call needs the GPU
def generate_response():
  ...

def predict(message, history, system_prompt, temperature, max_new_tokens, top_k, repetition_penalty, top_p):
  prepare_data()                  # stateful setup, outside the GPU scope
  yield from generate_response()  # GPU work happens only here
  save_results()                  # saving answers, also outside the GPU scope

ZeroGPU Explorers org

@codelion Thank you! I have optimized the code.

ZeroGPU Explorers org

@djstrong May I check what was at L132 that caused the problem? I saw the print statement and wonder whether we are not allowed to print anything in the GPU function.

ZeroGPU Explorers org

No, you can print stuff; it will be routed to the logs.

ZeroGPU Explorers org

> @djstrong May I check what was at L132 that caused the problem? I saw the print statement and wonder whether we are not allowed to print anything in the GPU function.

It was sending logs to Discord and to a repository. However, I don't think it was causing the main problem; @codelion's comment about it was just a "by the way".
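
For illustration, a minimal sketch of where that kind of logging can live after the refactor; send_to_discord and push_log_to_repo are hypothetical helpers, not the actual code:

def save_results(message, response):
  # called from predict() after generation, outside the @spaces.GPU scope
  send_to_discord(message, response)   # hypothetical helper: post the log to Discord
  push_log_to_repo(message, response)  # hypothetical helper: append the log to a repository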
